Software that always works (part 1)

This is part one of a long series: a preamble explaining the background.

Many of the advances that humans have made in the past 50 years have been made possible through the use of computer programs, or software. Many of the problems we’ve faced in the past 50 years have come from software that fails to work as intended. I think the focus in remedying that has been too narrow. We’ve been focusing on writing programs without bugs, or writing programs that meet all the specifications. This is both impossible and setting our goals too low. We will never be able to write large programs without bugs, and even if we could write programs that correctly met all the specifications, we are unable to actually list all the specifications out in advance. And even if we could do that, hardware fails in the real world, and that should not be an excuse for software to fail to function. Our actual goal is to write software that accomplishes its tasks regardless of any internal or external problems.

This is a very hard task. But it’s the actual meaningful task.

Imagine that you hired someone to run your accounting department. You give them the task of making payroll happen. Let’s say you go to this person four weeks later and find out that nothing has happened for the past two payroll periods, and the response is “the printer ran out of paper, so I couldn’t mail out checks.” You would not think “man, I failed, I should have made sure paper was available.” You’d fire that person. Or let’s say that every few hours this person stops working and calls you because something is blocking the work: “the N key on the keyboard doesn’t work”, “the cleaning person turned off the power to my computer”, “I typed the wrong name into the system and I need you to fix it”, “IT wants to upgrade all our systems to Windows Insanity and that will take 3 weeks”, and so on. You’d also fire that person.

And yet we treat our software like the above. If something happens outside the domain, we congratulate ourselves by saying: “look, we caught this exceptional case of failing to read from a file, we clearly printed an error message and exited.” Really? That’s what the user of the software wanted, an error message? No, directly or indirectly, the user needed the information from that file, and while your error handling probably prevented greater harm, at best you can say your program made the user less unhappy.
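
To make the pattern concrete, here is a minimal Python sketch of the "catch it, print an error, exit" style described above. The names load_report and main are hypothetical, invented only for this illustration.

```python
import sys


def load_report(path):
    """Read a file; on failure, print an error and give up (hypothetical helper)."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as error:
        # The exceptional case is "handled", yet the caller still has no data.
        print(f"error: cannot read {path}: {error}", file=sys.stderr)
        return None


def main(path):
    """Return a process exit status: 0 if the goal was met, 1 otherwise."""
    contents = load_report(path)
    if contents is None:
        return 1  # clean failure, but the user's actual goal was not accomplished
    print(contents)
    return 0
```

The error path here is tidy and does prevent greater harm, but the user who needed the contents of the file is no closer to having them.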

Let’s phrase it a little more rigorously, but not too much so. We have a goal G we want accomplished. We have a program P to get us to that goal. And we have an environment E that, alas, we don’t get to control.

G = P(E)

That looks simple. And if the environment were completely known to us, then creating program P is a math problem. It might be a very hard math problem, but it is a math problem.

However, the difference between theory and practice is that, in theory, theory and practice are the same thing. We don’t get to specify our environment. I want to stop using the word control because, to some extent, we can affect the environment, we can provide inputs to the environment, we just can’t determine the environment. (Note that I’m only talking about E containing the parts of the universe that are relevant to the operation of P.) Here’s our challenge: if our program runs in a different environment E’ than the one we predicted, we may get a different result G’.

G’ = P(E’)

If G = G’, then we’re good, and this could happen either through luck, or through conscious effort on our part to write multiple programs that each handle a different environment, then concatenate them together into a final program.
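
As a rough sketch of that "one program per environment" idea, the Python below models the environment E as a dictionary of facts and each sub-program as a function that either reaches the goal or declines. Every name and key here (program_disk_ok, program_disk_failed, combined_program, file_readable, backup_available) is hypothetical and chosen only for illustration.

```python
GOAL = "payroll sent"  # stands in for G


def program_disk_ok(env):
    """Handles the environment we predicted: the input file is readable."""
    if env.get("file_readable"):
        return GOAL
    return None  # this environment is outside what this program handles


def program_disk_failed(env):
    """Handles one other environment: the file is gone but a backup exists."""
    if not env.get("file_readable") and env.get("backup_available"):
        return GOAL
    return None


def combined_program(env, programs=(program_disk_ok, program_disk_failed)):
    """Try each sub-program in turn; the first one that handles env wins."""
    for program in programs:
        result = program(env)
        if result is not None:
            return result
    return None  # an environment nobody predicted: the goal is not reached
```

In this toy model, combined_program reaches the goal in both anticipated environments, but an environment with no readable file and no backup falls through every branch.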

Of course, we are pretty finite individuals. Our ability to write programs for each possible environment is limited, and our ability to predict the possible environments is more limited still. In point of fact, we can’t predict them all. Our ability to predict relates to our ability to use models to extrapolate, and we don’t have all the models yet, and probably never will.

There has been some progress recently through the emulation of mechanisms we see in nature – through evolution, organisms have found very clever ways to handle an unpredictable world. However, this process is very slow, and involves lots and lots of individual organisms failing. Since we can’t make a credible simulation of the world (that modeling issue again), our software agents need to learn in the real world, and that can be very expensive for many kinds of problems. Imagine rocket software that learned through trying different things to see what happened; I don’t think we’d be happy with the outcome.

Intelligence would also help greatly in this, but that’s also a little (or a lot) out of our reach.

So the question is, what can we do to write non-sentient software that can still accomplish the goal even when the environment is stacked against it? I’ll explore some ideas in the next article in this series. We’re going to start very small but with something meaningful – file I/O – and see if we can apply a technique I call “programming with expectations”.
