15 Mar 2004
Hilf demonstrated how equations as jotted down by Einstein in 1905 would be almost incomprehensible to modern scientists. Over the years, verbose notations have been replaced by increasingly succinct ones, and new symbols have been introduced. I immediately thought of leaky abstractions. Hilf was adamant that physics is not prone to those problems because it is grounded in solid math.
Good for the physicists, and too bad computer science can't currently claim the same.
Hilf may have misunderstood what leaky abstractions are really about. Had he understood, he probably would have seen that physics and other natural sciences have the exact same problem, and that the mathematical rigor he claims is at best equivalent to the formal definition of computer programs and therefore not even relevant to the problem of leaky abstractions.
First, let me explain the problem of leaky abstractions. As originally explained by Joel Spolsky, leaky abstractions are a challenge to software engineers. Much like mathematics and theoretical science, new achievements in software development build on the foundations already in place. These foundations are abstractions that package up the complexity of other tasks. For example, if you are building a program to download a file over a network, you can use the web protocol, HTTP. Then you can choose a program to serve the file from a number of existing applications, and instead of writing the code to connect to the server, follow the rules of the protocol, and write the file to disk yourself, you can simply invoke an existing piece of software that does this. In highbrow engineering circles, this is called reuse and is highly desirable because it saves development time and avoids creating new code that must be debugged and maintained. It also helps to cement existing standards so that software makers can compete on the basis of innovative features rather than "we crash less".
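Here's a sketch of what that reuse looks like in practice, using Python's standard-library HTTP client (the URL is made up):

```python
# A minimal sketch of reuse: fetch a file over HTTP without writing any
# socket, protocol, or parsing code ourselves. urllib.request packages
# all of that complexity up behind one call.
import urllib.request

url = "http://example.com/some-file.txt"  # hypothetical URL

with urllib.request.urlopen(url) as response:
    data = response.read()

with open("some-file.txt", "wb") as out:
    out.write(data)
```

A few lines of our own code, and HTTP, TCP, and the filesystem are somebody else's problem. That's the promise, anyway.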
An abstraction becomes leaky when some of the details it claims to handle leak through and become your problem. Continuing the file-fetching example, what happens if the network is down? You depend on some piece of fetch software to get files, which depends on a network protocol to ensure that two computers can communicate reliably, which depends on a network to allow computers to fling bits at each other. If the network can't handle that job right now, it can tell the network protocol. But the protocol can't do anything about a network that's physically disconnected, so it shrugs. The fetch software you invoked can't do anything about a protocol that won't let it connect, so it shrugs. Your program depends entirely on this piece of software, so you shrug. A leak in the bottommost layer of abstractions has sprung through every other layer, and has to be dealt with outside the realm of the automatic. "Plug in your network cable," your computer says. Do you ever get that message when your cable is still plugged in, but your cat has stepped on a power strip and turned off your network hub? Another leak!
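In code, all that shrugging looks something like this (Python again, same made-up URL): the failure surfaces as an exception thrown from the very bottom of the stack, and nothing between it and us could do anything about it.

```python
import urllib.request
from urllib.error import URLError

try:
    with urllib.request.urlopen("http://example.com/some-file.txt") as r:
        data = r.read()
except URLError as err:
    # The network layer's failure has leaked up through the protocol and
    # the fetch library into our code. No layer could fix it automatically,
    # so the decision lands here, with us.
    print("Could not fetch the file:", err.reason)
```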
You may already see that leaky abstractions can show up outside of computer science. Do business transactions always go as they should? Have you ever come to a restaurant expecting to get a meal, only to find that they couldn't seat your group? Have you tried to drive home from work in the usual 30 minutes only to find that weather or a car crash dragged that out to 2 hours?
Of course, these are all informal abstractions. In Physics, the abstractions are all mathematically defined. A more rigorous abstraction of driving home from work wouldn't leave any room for leaks. Right?
Well, it's neither exactly wrong nor right; it depends on how you look at it. If you're developing the theory alone, you're not going to find that F = m*a suddenly doesn't hold because, say, it isn't defined for a = 3 m/s/s; the requirement that definitions be rigorous prevents that. But if you're trying to develop a theory that accurately describes the interaction of actual physical objects, the classical Newtonian abstraction above breaks down at certain points, like when mass is really, really small or you're moving really, really fast. (More knowledgeable readers are welcome to correct/improve my Physics.)
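To make "moving really, really fast" concrete: special relativity replaces Newton's force law with a momentum-based one, and the two agree only when velocity is small compared to the speed of light. Roughly (and again, hedged as my amateur physics):

```latex
% Newton:             F = ma
% Special relativity: F = d(gamma * m * v)/dt, where
%   gamma = 1 / sqrt(1 - v^2/c^2)
% For v << c, gamma is approximately 1 and the relativistic law
% reduces to F = ma; as v approaches c, the Newtonian abstraction leaks.
F = \frac{d}{dt}\left(\gamma m v\right),
\qquad
\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}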
What we see here are two ways to judge the rigor of an abstraction, which I'll call theoretical rigor and applicative rigor. The mathematical foundations of physics ensure its theoretical rigor, but when applied to the description of nature, we can find failures in applicative rigor. Newton's models, though we call them laws, do not accurately describe everything they were once claimed to describe. And applying these laws to real-life situations requires accounting for a number of other factors—wind resistance and its ilk. We could qualify the law by describing the highly idealized world it assumes, but that would take too long. We'll settle for expecting the laws of physics to describe limited, idealized versions of what actually happens in real life.
Now back to computing. Most programming languages require that you write well-defined programs—you can't leave out a step and expect the computer to ask you what to do when it gets there. The language usually provides a sensible default, like doing nothing, but this is a way to compress the notation, not to escape rigorous definition. So programming languages, at least those with an actual deterministic implementation on a computer, enforce the constraints of theoretical rigor at least as well as the Physics research community does.
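For instance (Python again), a function's result can't simply be undefined; leave out the return statement and you get a defined default, None, not a gap in the program's meaning:

```python
# A language-provided "sensible default": a Python function with no
# return statement returns None. The notation is compressed, but the
# program's behavior is still fully defined.
def greet(name):
    print("Hello,", name)
    # no return statement here

result = greet("world")
print(result)  # prints: None
```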
But when we take those theoretical tools and apply them to solve problems, we find many leaky abstractions: broken networks flummox our web browsers; buggy data compression libraries leave security holes open in our servers. Each of these bugs is like a wind resistance we hadn't thought of. We hackers had been assuming a simpler world, and so the model of the world we coded for doesn't exactly correspond to the world we're selling software to.
But that's OK: it happens to Physicists too.
For more information:
- The Curry-Howard Isomorphism describes a relationship between logical proofs/proof systems and computer programs/programming languages. It is described in this Wikipedia entry and at length in this online book.
- I also made a comment in a nerdy discussion on programming language design and the isomorphism at the About Kim weblog.
- Gerald Weinberg's An Introduction to General Systems Thinking teaches you how to think scientifically, including how to respect the limitations of abstraction.
- Joel Spolsky's original essay, The Law of Leaky Abstractions, is a good read and isn't too long.