The cynics say software is “eating the world.” In a networked universe, code is operating more and more critical systems, reducing the need for human intervention.

Whatever efficiency this brings, failures come with it. Several incidents made this very clear last year. United Airlines was forced to ground its fleet because of an issue with its departure-management system; the New York Stock Exchange suspended trading after an upgrade; the front page of The Wall Street Journal’s website crashed; and Seattle’s 911 system went down when a router failed. With so many critical systems failing at once, many assumed a coordinated cyberattack was under way. It turned out that none of these events were related.

Before exploring these issues further, note that the way we think about engineering failures was developed shortly after World War II for electromechanical systems, before the advent of software. The idea was that you make something reliable by making its parts reliable.

So why are critical systems failing so often these days? According to Nancy Leveson, professor of Aeronautics and Astronautics at MIT, in the days of electromechanical systems, they were tested exhaustively. Engineers could anticipate everything the system might do and every state it might enter.

Software, however, differs significantly. Simply by editing a file, one can change a programme. As a result, software changes frequently to facilitate innovation. Systems such as mobile phones or video games are changed constantly, yet are released with many bugs or errors (remember the Samsung Galaxy Note 7 recall?). This is not due to a lack of testing, but rather to a lack of understanding of software.

As we all know, software doesn’t break. In fact, software mostly does exactly what it is told to do, and it does so flawlessly. What causes system failure is that the software was told to do the wrong thing.
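
To make that concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the function names and the 90-degree limit are invented for illustration and do not come from any real system. Suppose the intent was "shut down once the temperature reaches the limit", but the instruction actually written was "shut down when the temperature is above the limit".

```python
# Hypothetical example: names and the threshold are assumptions for illustration.
SAFE_LIMIT_C = 90.0  # assumed safety limit, in degrees Celsius


def should_shut_down_as_written(temperature_c: float) -> bool:
    """What the software was told: shut down only when strictly above the limit."""
    return temperature_c > SAFE_LIMIT_C


def should_shut_down_as_intended(temperature_c: float) -> bool:
    """What the designers meant: shut down once the limit is reached."""
    return temperature_c >= SAFE_LIMIT_C


if __name__ == "__main__":
    reading = 90.0  # a reading exactly at the limit
    print("as written: ", should_shut_down_as_written(reading))
    print("as intended:", should_shut_down_as_intended(reading))
```

Nothing in the first function is broken. It runs without error and follows its instruction to the letter; the two functions disagree only for a reading exactly at the limit, which is where the instruction and the intention part ways.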

That is not to say programmers are purposely doing this. But their lack of imagination and understanding of software can lead to the failure of critical systems. This will continue to happen if our ways of programming do not change.

The way we write software these days is far from efficient. We often design it without a clear understanding of the issue at hand or without extensive testing. Perhaps software developers should focus on developing tools for designing software. “The problem is that we are attempting to build systems that are beyond our ability to intellectually manage,” says Professor Leveson. Unlike hardware, software can be changed cheaply. Anyone can edit a single line of code in a programme and it can become something completely different. This flexibility is both a miracle and a curse. While you can create amazing software systems, the complexity and the number of lines of code grow as various programmers edit a single piece of software.

“When your tires are flat, you look at your tires, they are flat. When your software is broken, you look at your software, you see nothing.”

That right there is the problem!