So, as we've all seen on the news, the cybersecurity firm CrowdStrike Y2K'd its own end customers with a shoddy, untested update.
But how does this happen? Aren't there programming teams that check their code, or pass it to quality assurance staff to see whether it bricks their own machines?
8.5 million machines, too. Does that affect home users as well, or only Windows machines that have this endpoint agent installed?
Lastly, why would large firms and government institutions such as railway networks and hospitals put all their eggs in one basket? Surely chucking everything into "The Cloud" (literally just another man's tin box) would be disastrous?
TLDR - Confused how this titanic tits-up could happen, and how 8.5 million Windows machines (POS, desktops and servers) just packed up.
It's destined to happen, according to Normal Accident Theory.
Yes, there are probably a gigantic number of tests, reviews, validation processes, checkpoints, sign-offs, approvals, and release processes. The dizzying number of technical components and the byzantine web of organizational processes were probably major factors in how this came to pass.
Their solution will surely be to add more stage-gates, roles, teams, and processes.
As Tim Harford puts it at the end of this episode about "normal accidents"... "I'm not sure Galileo would agree."