The System Is Down

This is the first time in years I’ve experienced a ‘BSOD’ (Blue Screen of Death). I usually dismiss it as Something I Did, but I quickly realised how non-alone I was today. Much wailing, much gnashing of teeth. Dude. What.

Yep – this is one of the big ones.

So what does it all mean?

Today’s outage appears to have been caused by software from the company CrowdStrike, which many major organisations use to secure their fleets of Windows (i.e. Microsoft) computers.

Security software usually needs incredibly special ‘low-level’ access to computer systems to properly protect them, so when the company issued a faulty update to the software, over the internet, automatically, it didn’t just break an application or two, it caused whole systems to crash and reboot. In some cases, the computers restarted, loaded the faulty update, and crashed again, which is why some systems are still down – it requires a human to intervene and stop the reboot loop.

It’s not an easy fix when the humans are very distant from the physical boxes that need a swift kick. Big problems.

Is it a hack, cyber-attack or a global conspiracy?

Nah, it’s the kind of glitch that happens all the time, but usually the effect is just an ‘unexpected error in line 10’. Human error, without a nefarious purpose. (If it were a deliberate act, it would have been much more silent and deadly.)

Unfortunately, this glitch is in software that props up a lot of critical computers, so the real issue is the pervasive and critical nature of the software component which has broken – something called ‘csagent.sys’ is the culprit.

Obligatory xkcd link: https://www.xkcd.com/2347/
Replace ‘Random Person’ with ‘Leading Security Company’.

The chain of events makes it unclear whether Microsoft or CrowdStrike messed up. Microsoft was reporting issues in the morning (9:30am Aus Eastern) well before the CrowdStrike component started triggering system crashes in the afternoon (around 3pm Aus Eastern).

Observe the market’s response: the market will know – well before we will – where the blame lies.

Not Apple?

Nope. Not Apple.

Not Optus?

Nope, not Optus. Not Telstra either.

What can I do about it?

If your machine is still working, don’t worry, it’ll get sorted automatically, using the same automatic patching process that caused the problem in the first place.

If it’s not – don’t be tempted to try any of the fixes you see online or to uninstall the software; that might open you up to further problems down the track, opening windows that hackers can crawl into once the dust has settled. And don’t trust anyone jumping out of the shadows to offer you help; scammers LOVE the uncertainty created by upheavals like this and will try to confuse people into their traps.

Get in touch with your IT department or local expert, and expect a little wait.