Ambivalent about risk?
I read an article by Mike Murphy in 2018 about a data breach that probably involved more than 230 million consumers. Fortunately, no social security numbers were exposed, but a lot of personal data was posted on the dark web.
In 2019, an acquaintance notified me about a huge data breach. After reading an article he recommended, I was less complacent, but not much.
The article by Angela Moscaritolo, written in 2019, read in part,
Did you receive an email this morning informing you that your personal information was exposed in a data breach called Collection #1? You're not alone, and it's a reminder to take precautions like enabling two-factor authentication and signing up for a password manager.
Security researcher Troy Hunt, who runs breach notification site Have I Been Pwned (HIBP), first reported the Collection #1 exposure. The massive trove of leaked data, which was posted to a hacking forum, includes some 772,904,991 unique email addresses and 21,222,975 unique passwords, Hunt said.
Based on this article, it seemed reasonable to check out the site my friend's employer had recommended to see if any of my family accounts had been hacked. I discovered that of the four email accounts I had, two were hacked. This was not surprising since the two accounts that had been breached were ones that were constantly used.
After learning of these breaches, I tended to ignore news of personal data breaches, not because they were unimportant but because they seemed inevitable. (I did buy some software that I hoped would protect me.) Even the 2024 data breach, which exposed more than 2 billion records, as reported by Bloomberg, did not change my sense of insecurity.
No black swan.
After reading the article citing a multi-billion account breach, I wondered if this was one of the catastrophic events we had all been expecting. Surprisingly, it did not seem catastrophic or even major to me. This certainly was not a black swan event.
As a Ph.D. student, I did all the programming for my dissertation. The program, between 1,500 and 2,000 lines of Fortran code, was minuscule by today's standards. Today, Microsoft's Windows system probably requires 50 million lines of code (interestingly, Microsoft Basic supposedly had slightly less than 7,000 lines of code in 1978), and Google's system probably requires billions of lines of code. (There are disagreements on the number of lines of code used by Microsoft or Google.)
Let's focus only on the assumption that Microsoft has 50 million lines of code. Assume that the probability that any given line of code has an error is .000001, or one in a million. If this probability is correct, we would expect about 50 lines of Microsoft code to contain an error (50 million × .000001). (For Google I have assumed 2 billion lines of code, and the comparable number would be 2,000.) Microsoft and Google may want to challenge this error rate; my example probability could be too high. But, having used Microsoft software for years and having had a program shut down more than once because of an error, I am willing to say that there are at least a few lines of code with errors.
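For readers who want to check the arithmetic, here is a minimal sketch in Python. It rests on the same illustrative assumptions as the paragraph above (50 million lines for Microsoft, 2 billion for Google, a one-in-a-million error rate) plus one more that I am adding for simplicity: that errors in different lines are independent of each other. The function names are hypothetical, and the figures are not measurements of any real codebase.

```python
import math


def expected_errors(lines_of_code: int, error_rate: float) -> float:
    """Expected number of flawed lines: n * p."""
    return lines_of_code * error_rate


def prob_at_least_one_error(lines_of_code: int, error_rate: float) -> float:
    """Chance that at least one line is flawed: 1 - (1 - p)^n.

    Assumes each line fails independently (a simplification).
    log1p/expm1 keep the result accurate when p is tiny.
    """
    return -math.expm1(lines_of_code * math.log1p(-error_rate))


# Illustrative assumptions from the text, not measured values.
ERROR_RATE = 0.000001
ASSUMED_SIZES = {
    "Microsoft (assumed 50 million lines)": 50_000_000,
    "Google (assumed 2 billion lines)": 2_000_000_000,
}

for name, size in ASSUMED_SIZES.items():
    print(name)
    print("  expected flawed lines:", round(expected_errors(size, ERROR_RATE)))
    print("  chance of at least one:", prob_at_least_one_error(size, ERROR_RATE))
```

Under these assumptions the chance of at least one flawed line is, for practical purposes, a certainty at either scale. The open question is whether any of those lines is a key one.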
Imagine the damage that could be done if there were a major coding error in Microsoft's, Apple's, or Google's software. CrowdStrike's problems illustrate what can go wrong. (I am not taking sides in the Delta, Microsoft, CrowdStrike debate. I leave that to the courts.)
Those who favor artificial intelligence (AI) will quickly come forward and say that errors would not occur if we removed humans from the equation, but even AI has the potential to produce errors. As the supply of code grows, the potential impact of a problem will become more pronounced. (Eventually, there will be tens of billions of lines of code. Think of the possible impact if only one key line of coding is flawed; the programming dominoes could fall.)
In The Black Swan: The Impact of the Highly Improbable, Nassim Taleb states that a black swan event has these three attributes:
First, it is an outlier, as it lies outside the realm of regular expectations, because nothing in the past can convincingly point to its possibility. Second, it carries an extreme impact…. Third, in spite of its outlier status, human nature makes us concoct explanations for its occurrence after the fact, making it explainable and predictable.
I do not believe a catastrophic coding failure necessarily would be a black swan event. The term "black swan" may not be sufficiently severe.
Dark matter.
I recommend the use of the term "dark matter" event. Physicists claim that "dark matter" makes up the bulk of the matter in the universe, even though they have not been able to observe it directly. Since "dark matter" may have an interlocking relationship with early black holes and since it probably makes up a majority of the matter in the universe, it seems to be a better metaphor for a coding event that could destroy our societies. Why do I say this? If a coding error shut down our power grid, chaos would ensue. Society could fall apart in hours, days at most. Personally, I think we need to have two terms for catastrophic events: "black swan" and "dark matter." The difference is that we would survive a black swan event. It has its role in risk management. However, there needs to be a higher level of risk categorization.
Business leaders do not have to worry about a "dark matter" event. They need to run their operations like there will never be one. If a dark matter event occurs, no one will be left to criticize the executives. There will be no one to blame them for not having a plan to deal with the results or explain why the event should have been anticipated.
In defense of organizations with large, important programs, there probably are adequate safeguards that control errant lines of code. The safeguards keep errant code from doing significant permanent damage or being manipulated by hackers. Maybe I am overreacting. Maybe we do not need the term "dark matter." Maybe we are all very safe and have no need for concern. I am sure that is the case.
The picture is from Pixabay.