The transformative power of computer software is everywhere, from the smartphone apps that connect the world to the laptop programs that simplify daily tasks at work and home to the software hidden inside physical objects like automobiles and pacemakers that is crucial for their safe operation. However, the growing ubiquity of software has a dark side: when it misbehaves, the consequences can now range from the merely annoying, like computer crashes, to the disastrous, like car crashes.
University of Pennsylvania computer scientists are joining in one of the National Science Foundation’s flagship research initiatives, known as “Expeditions in Computing: The Science of Deep Specification,” or DeepSpec. The goal is to develop a better, more comprehensive way of approaching software design that will help eliminate bugs and vulnerabilities before they endanger users or become targets for hackers.
Funded by a $10-million, five-year grant from the National Science Foundation, the researchers plan to develop integrated tools for working with “specifications,” which are mathematical descriptions of a computer program’s desired and allowed behaviors.
The expedition also aims to integrate new approaches to specifications with the way computer science is taught.
Computer science researchers have made striking progress in the fight against bugs in the past two decades, and building those findings into the training of the next generation of programmers and engineers could reshape how the tech industry ensures its products do what they are supposed to.
“Over the last twenty years,” said Pierce, the Henry Salvatori Professor in Penn Engineering, “people have developed very effective ways of checking software for bugs and vulnerabilities, but in order to do this we need good specifications of what the software is supposed to do. With DeepSpec, we aim to replace the existing ‘patch and pray’ mentality, where software developers wait for vulnerabilities to be discovered in the wild, often by malicious hackers, with a bold assertion: the technology for constructing software that can be mathematically proven to adhere to desired specifications has matured to the point that we can do it for critical systems in the real world.”
The ability to guarantee that each part of a program does what it is meant to do is increasingly important, as computer systems become more ubiquitous, densely interconnected and responsible for real-world outcomes. Comprehensive and accurate specifications are therefore necessary for preventing dangerous behaviors. For example, the systems that control a modern car’s accelerator and sound system might be linked, so that stepping on the gas also turns up the volume on the stereo. Part of the specification for these systems would be that the reverse, controlling the accelerator via the stereo, can’t occur.
Such requirements may seem like common sense, but the mind-boggling number of potential interactions can make them surprisingly difficult to implement. The complexity of computer systems and the distributed way they are programmed means that misinterpretations and mistakes can creep into software. A more systematic approach is necessary.
“The DeepSpec expedition,” Pierce said, “will create technology for working with specifications that are both precise, that is, integrated with the code they describe, and live, that is,continuously and automatically checked, and that can express rich descriptions of correct system behaviors.”
The members of the Penn contingent will contribute research on some of the most critical components of everyday computer systems, such as compilers and operating systems.
“If you’re using Apple software,” Zdancewic said, “you’re running machine code that was produced in large part by something called the LLVM compiler. What I’ll be doing in this project is designing a specification of the LLVM compiler and language semantics, which are a crucial foundation for proving correctness of software built using the LLVM”
“I’m focused more on next-generation languages,” Weirich said, “including one called Haskell, which has been designed for mathematical reasoning and the proof structure of specifications. This mathematical basis has made Haskell an impressive platform for development. Compared to more traditional languages, Haskell offers more compositionality and more abstraction. In other words, it is easier for programmers to build systems out of Lego-like pieces, that can be fitted together in many different ways, and easier for them to identify the commonality between different situations."
“Full machine-checked verification is important for critical software components,” Pierce said, “but it’s hard work and can increase the cost of developing software; the proofs that you write could be three, four or ten times as large as the software itself. My research is focused on lighter-weight ways of connecting the specification to the code before we even get to the proving stage.”
This involves a technique called property-based random testing, which uses large numbers of randomly generated inputs to check specifications against the behavior of a given piece of code. Better property-based random testing will speed up the full verification process for critical systems and may be sufficient by itself for less critical ones.
The researchers also aim to incorporate this philosophy of deep specifications into the way computer science is taught.
“Courses on compilers and operating systems are standard parts of every computer science education,” Weirich said. “We’d like to teach students how to build them better by getting them to think in terms of deep specifications.”