Design by Contract
A Conversation with Bertrand Meyer, Part II
by Bill Venners
December 8, 2003
Bertrand Meyer talks with Bill Venners about Design by Contract and the limits of formal languages for expressing contracts.
Bertrand Meyer is a software pioneer whose activities have spanned both the academic and business worlds. He is currently the Chair of Software Engineering at ETH, the Swiss Federal Institute of Technology. He is the author of numerous papers and many books, including the classic Object-Oriented Software Construction (Prentice Hall, 1994, 2000). In 1985, he founded Interactive Software Engineering, Inc., now called Eiffel Software, Inc., a company which offers Eiffel-based software tools, training, and consulting.
On September 28, 2003, Bill Venners conducted a phone interview with Bertrand Meyer. In this interview, which will be published in multiple installments on Artima.com, Meyer gives insights into many software-related topics, including quality, complexity, design by contract, and test-driven development.
Designing Components that Fit Together
Bill Venners: In your book, Object-Oriented Software Construction, you write, "For a system of any significant size, the individual quality of the various elements involved is not enough. What will count most is the guarantee that for every interaction between two elements there is an explicit roster of mutual obligation and benefits, the contract." What is the contract? Why is it important?
Bertrand Meyer: There are two quite separate points in that extract. The first point is that when you're building a library, it's not enough to just accumulate good components. Take a data structure library as an example. You might have excellent classes for lists, stacks, files, and B-trees, but taken together they don't make an excellent library if they are inconsistent. If they use different conventions, they aren't part of a single design. For example, when you're putting an element into an array, you might have an insert operation that takes x and i, where x is the element and i is the index. For the hash table class you might have an insert operation that takes key and x, where key is the key and x is the element. The order of arguments is reversed. The order of arguments might make perfect sense within each class, but when you start approaching the library as a whole, you're in new territory each time you look at a new class. You don't get a feeling of consistency. Instead you get a feeling of a mess, something that is a collection of pieces rather than a real engineering design. What we found many years ago when we started focusing seriously on libraries is that just as much attention has to be devoted to the construction of the library as a whole as to the construction of the individual elements. That's one point.
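The argument-order point can be illustrated with a minimal Python sketch. The class names FixedArray and HashTable and the method names insert and item are hypothetical, chosen only for illustration; the point is simply that both classes put the element first and the position or key second, so a client learns the convention once.

```python
class FixedArray:
    """Hypothetical array class: insert takes (element, index)."""

    def __init__(self, size):
        self._items = [None] * size

    def insert(self, element, index):
        # Element first, location second.
        self._items[index] = element

    def item(self, index):
        return self._items[index]


class HashTable:
    """Hypothetical hash table class: insert takes (element, key)."""

    def __init__(self):
        self._items = {}

    def insert(self, element, key):
        # Same convention as FixedArray: element first, location second.
        self._items[key] = element

    def item(self, key):
        return self._items[key]


array = FixedArray(3)
array.insert("x", 1)
table = HashTable()
table.insert("x", "some key")
```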
The second point pertains to how a library needs to be consistent, how its various elements can fit together like the pieces of a puzzle. It's not just a matter of defining consistent interfaces, order of arguments, things like that. It's also a matter of defining precisely how the various elements are going to communicate with each other and making sure that the conditions of this communication are very precisely defined. This is, of course, where the techniques of Design by Contract come in.
Design by Contract
Bill Venners: Could you give an overview of Design by Contract?
Bertrand Meyer: People who have heard a little bit about contracts often think they are simply a way to put the equivalent of assertions in a program: that is to say, to put in a few checks here and there for conditions that are expected to hold at specific points of execution. Such assertions are basically debugging statements. The assertion code is executed during debugging, and if one of the conditions is found not to hold, execution usually stops with a message. That's one of the things you can do with contracts, but it is only a small subset of the purpose of contracts.
The main purpose of contracts is to help us build better software by organizing the communication between software elements through specifying, as precisely as possible, the mutual obligations and benefits that are involved in those communications. The specifications are called contracts. The underlying observation, which is not particularly deep, is that a software system is made of a number of elements that cooperate with each other. In an object-oriented architecture, these elements might be classes and methods, which I prefer to call routines. The elements might be something else in a different programming model, but let's just assume we are working in an object-oriented context. So we have a software system made of a number of classes, and these classes themselves contain among other things routines that are going to be executed and call each other. So both the architecture of the system and its execution rely on a large set of possible communication channels between the software elements. The metaphor of contracts is used to guarantee that these communications occur not on the basis of vague expectations of services rendered, but on the basis of precise specifications of what these services are going to be.
Typically in such a communication between program elements—for example, when one routine calls another routine—it's like a client/supplier relationship. The calling routine, which we will call the client, is performing some computation, probably for the need of its own client. In order to perform that computation, that is to say, to perform its own service, it needs services from some other software element, typically some other routine. So there is a client/supplier relationship, where the client needs a certain service and the supplier provides that service. And in the basic scheme both the client and the supplier are routines.
For the software and software designer to be able to guarantee any kind of correctness and robustness properties, they must know the precise constraints over such communications. This is where the contract metaphor from business applies to software. Say I'm contracting with you to have a certain business operation performed on my behalf. For example, perhaps you have a part that I can use for a product that I'm manufacturing. So I'm your client, and you're my supplier. In business, we're going to organize our collaboration on the basis of a contract: a precise statement of the mutual obligations and benefits that we'll expect. In software, we're going to do exactly the same thing when we write client and supplier routines.
So we're going, for example, to impose on the client certain obligations as to the kind of original program state that is permissible when the client calls the supplier or the kind of arguments that the client routine passes to the supplier. These are pre-conditions, and they're obligations for the client. In the other direction, we are going to express the conditions that the supplier routine must guarantee to the client, on completion of the supplier's task. That's the post-condition of the contract, specifically, the post-condition of that particular routine. And of course, the post-condition is an obligation for the supplier.
Pre- and post-conditions are symmetric. The pre-condition of the routine is an obligation for the client, because the pre-condition says that before calling a particular routine, the client must satisfy a particular property. For the supplier, the pre-condition is a benefit, because it facilitates the supplier's job by restricting the set of cases that the supplier has to handle. The obligation of the client is a benefit for the supplier. The post-condition is of course an obligation for the supplier, because it describes the job that the supplier has to guarantee to perform. But the post-condition is clearly a benefit for the client, because it describes the result that the client is entitled to expect from the execution of the routine for the benefit of its own problem set.
Before moving on I can describe a very simple example of pre- and post-conditions: a routine that computes the square root of a real number. The pre-condition would say that the real number has to be non-negative, because otherwise the result would be undefined. The post-condition may say that the result returned by the routine is the approximation, within a specified precision, of the exact mathematical square root of the argument.
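A minimal Python sketch of the square-root contract described above. Python has no built-in contract syntax, so plain assert statements stand in here for Eiffel's pre- and post-condition clauses. The function name square_root, the precision parameter, and the exact form of the precision check are assumptions made for illustration.

```python
import math


def square_root(x: float, precision: float = 1e-9) -> float:
    """Return an approximation of the square root of x."""
    # Pre-condition (obligation of the client): the argument is non-negative.
    assert x >= 0, "pre-condition violated: x must be non-negative"

    result = math.sqrt(x)

    # Post-condition (obligation of the supplier): squaring the result
    # gives back x within the requested precision (relative for large x).
    assert abs(result * result - x) <= precision * max(1.0, x), \
        "post-condition violated: result is not precise enough"
    return result


root = square_root(4.0)
```

Calling square_root(-1.0) fails the pre-condition, which signals a bug in the client rather than in the routine. Note that Python skips assert statements when run with the -O option, so this is a debugging aid rather than a full contract mechanism.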
So pre- and post-conditions are two fundamental elements of contracts. The third fundamental element, which is really useful mostly in an object-oriented context, where we have not only routines but, at the higher level, classes, is invariants. A class invariant is a condition that applies to an entire class. It describes a consistency property that every instance of the class must satisfy whenever it's observable from the outside. That means that this property, the class invariant, must be satisfied whenever an instance of the class is created. And every exported routine, that is, every routine that can be called from outside the class, must preserve it. That is to say, assuming the class invariant was satisfied before the routine was called, the routine must ensure that the class invariant is again going to be satisfied on exit. It's as if the class invariant is added to the pre- and post-condition of every single exported routine of the class. But more fundamentally, a class invariant is a way to characterize the fundamental consistency and integrity properties of the class and its instances.
For example, if you have a bank account class with a field that represents the current balance and a field that contains a list of all deposits and withdrawals since the opening of the account, then you would have a class invariant that states that the value of the balance field is equal to the total of all the deposit values so far minus the total of all the withdrawal values so far. It's typical that class invariants define properties that indicate how the various constituents of the class, such as balance and deposit/withdrawal lists, maintain consistency with each other.
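The bank account invariant can be sketched in Python as follows. This is an illustration under stated assumptions, not Eiffel's invariant clause: the class name BankAccount, the method names, and the use of integer amounts are all hypothetical, and the invariant is checked by hand at the end of creation and of each exported routine.

```python
class BankAccount:
    """Hypothetical account whose balance must match its transaction lists."""

    def __init__(self):
        self.balance = 0
        self.deposits = []      # every amount deposited since opening
        self.withdrawals = []   # every amount withdrawn since opening
        # The invariant must hold as soon as the instance is created.
        assert self.invariant()

    def invariant(self) -> bool:
        # Balance equals total deposits minus total withdrawals.
        return self.balance == sum(self.deposits) - sum(self.withdrawals)

    def deposit(self, amount: int) -> None:
        assert amount > 0, "pre-condition violated: amount must be positive"
        self.deposits.append(amount)
        self.balance += amount
        # Every exported routine must preserve the invariant on exit.
        assert self.invariant()

    def withdraw(self, amount: int) -> None:
        assert amount > 0, "pre-condition violated: amount must be positive"
        self.withdrawals.append(amount)
        self.balance -= amount
        assert self.invariant()


account = BankAccount()
account.deposit(100)
account.withdraw(30)
```

Between the two statements inside deposit the invariant is temporarily false, which is acceptable because the object is not observable from the outside at that point.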
Pre- and post-conditions and class invariants are not the only elements of contracts. There are also loop invariants and a few others, but pre- and post-conditions and class invariants are really the basic fabric of contracts. In my experience, relying on these notions, that is to say, making sure when you write software that you don't just write the implementation but also write the more abstract properties underlying the implementation in the form of contracts, provides a greatly improved software development experience in several respects. It helps ensure correctness in the first place, helps debugging, helps testing, helps ensure inheritance is properly handled, helps managers, and provides a quite effective form of documentation, among other benefits.
The Limits of Formal Languages
Bill Venners: Are there parts of contracts that are difficult or impossible to express in a formal language, that you can only express in a human language?
Bertrand Meyer: The simple answer is yes. There are really three reasons why it sometimes feels hard to express a contract in terms of a programming language. As a background, the way contracts are expressed in Eiffel is very simple: they are just Boolean expressions. Boolean expressions are intended precisely to express runtime true or false properties—properties that at any point in the execution may hold or not hold. That's exactly how Boolean expressions are used, for example, in an if-then-else construction. That's also how they are used to express contracts in Eiffel, with one major extension: the old keyword. The old keyword makes it possible to express in a post-condition that a certain property at the completion of a routine is relative to a certain property on entry to the routine. If you have, for example, a routine that adds a certain amount of money to a bank account, obviously the post-condition of this routine is going to have to express that the new balance is related to the old balance. The new balance must be the balance before the execution of the routine plus some amount. The only way to express this properly is to have some notation to refer to the original value of the balance. So with this major, but single, extension, the language of contracts in Eiffel is just the language of Boolean expressions.
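Python has no counterpart to Eiffel's old keyword, but its role can be approximated by saving the entry value in a local variable before the body runs. The sketch below assumes a hypothetical Account class and a pre-condition that the amount is positive; neither comes from the interview.

```python
class Account:
    """Hypothetical account used to illustrate an 'old' value in a post-condition."""

    def __init__(self, balance: int = 0):
        self.balance = balance

    def deposit(self, amount: int) -> None:
        # Pre-condition (an assumption for this sketch).
        assert amount > 0, "pre-condition violated: amount must be positive"

        # Snapshot taken on entry; plays the role of `old balance` in Eiffel.
        old_balance = self.balance

        self.balance += amount

        # Post-condition: the new balance is the balance on entry plus amount.
        assert self.balance == old_balance + amount, \
            "post-condition violated: balance not increased by amount"


account = Account(50)
account.deposit(25)
```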
There are three occasions in which this approach of using Boolean expressions may appear restrictive. One occasion that is justified—where you will have to resort to human language descriptions instead of formal ones—is when you have externally visible properties. For example, if you have in a graphical library an operation that changes the color of a certain pixel on the screen to red, you'd like to describe a post-condition that says the resulting color is red. Maybe to some extent you can do that, but what you really want to express is that if a person is sitting at a terminal, and the person is not color blind, he or she will accept that the new color is red. Clearly this kind of assertion cannot be expressed in a formal language, because it refers to properties outside of the realm of the software program. In practice it may not be such a big deal, because what you usually want in practice is not necessarily the guarantee that someone accepts the color as red, but the guarantee that the RGB value of the pixel being displayed is within a certain range. And of course that can be expressed as a purely Boolean property.
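The weaker, checkable form of the pixel post-condition (the RGB value lies within a certain range) might look like this in Python. The Screen class, the is_red function, and the tolerance of 30 are hypothetical choices for illustration; what range counts as red is an assumption, not something the interview specifies.

```python
def is_red(rgb, tolerance: int = 30) -> bool:
    """True if an (r, g, b) triple with 0..255 channels is close to pure red."""
    r, g, b = rgb
    return r >= 255 - tolerance and g <= tolerance and b <= tolerance


class Screen:
    """Hypothetical in-memory screen; all pixels start out black."""

    def __init__(self, width: int, height: int):
        self.pixels = [[(0, 0, 0)] * width for _ in range(height)]

    def paint_red(self, x: int, y: int) -> None:
        # Pre-condition: the coordinates are on the screen.
        assert 0 <= y < len(self.pixels) and 0 <= x < len(self.pixels[0])

        self.pixels[y][x] = (255, 0, 0)

        # Post-condition: a purely Boolean property of the stored RGB value.
        # It says nothing about what a human observer actually perceives.
        assert is_red(self.pixels[y][x])


screen = Screen(2, 2)
screen.paint_red(1, 0)
```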
The second occasion where Boolean expressions may seem restrictive is one that scares most people who have looked at the issue from a theoretical perspective: the language of Boolean expressions is relatively limited and doesn't have first-order predicate calculus. This has led some people who design the specification mechanisms for programming languages, or in the case of UML, for modeling languages, to include facilities from first-order predicate calculus as a language extension. I think that is largely a mistake. At least we don't need to do this in Eiffel, because we have a high-level mechanism known as agents to describe essentially high-level functional operations on objects. So anything that you may want to express on a complex object that would seem to require first-order predicate calculus can be expressed actually quite nicely within the confines of the programming language.
What is quite easy to express with Boolean expressions, for example, is a property that if you're adding an element at the end of a list, the last element of the list is now the one that you just added. That's quite easy to express, and it's definitely expressed in the corresponding Eiffel concept. What is less easy to express is the property that all the other elements of the list, the ones that you haven't touched, are still there, in the same order, and are equal to their previous values. This kind of assertion seems to require special language extensions and has led people to suggest introducing first-order predicate calculus. But what we've found is that apart from the agent mechanism, which has been in Eiffel now for a number of years, there's really no need for introducing first-order predicate calculus, or "there exists" operators. We can basically express those assertions in the realm of the programming language.
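Both list properties can be written as ordinary Boolean expressions in Python, where the built-in all function over a generator plays a role loosely comparable to the one the interview assigns to Eiffel agents. This is a sketch only: the function name append_checked is hypothetical, and copying the list on entry is just one simple way to keep the previous values available.

```python
def append_checked(items: list, element) -> None:
    """Append element to items and check both post-conditions."""
    # Shallow copy of the list on entry, so old values can be compared later.
    old_items = list(items)

    items.append(element)

    # Easy post-condition: the last element is the one just added.
    assert items[-1] == element

    # Harder post-condition: every untouched element is still there,
    # in the same order, and equal to its previous value.
    assert len(items) == len(old_items) + 1
    assert all(items[i] == old_items[i] for i in range(len(old_items)))


numbers = [1, 2, 3]
append_checked(numbers, 4)
```

The universal quantification ("for all other elements") is expressed with constructs the programming language already has, with no separate predicate-calculus notation.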
Also, first-order predicate calculus isn't good enough anyway. A typical kind of property that you might want to express, say, in a class invariant for a graph class, is that the graph has no cycles. This is not expressible as a first-order predicate calculus property. On the other hand, it is expressible as a Boolean expression if you accept including function calls in the Boolean expression. So that particular argument for extending the assertion language or for expressing contracts in English goes away at least in the long term. In the short term, these techniques of expressing high-level ambitious properties are still under development, and you will still find assertions for that kind of property expressed informally as comments.
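The "no cycles" invariant becomes a plain Boolean expression once a function call is allowed inside it. The Python sketch below assumes a directed graph stored as a dictionary of successor sets; the class name DirectedGraph and its methods are hypothetical, and the recursive depth-first search is only suitable for small graphs.

```python
class DirectedGraph:
    """Hypothetical directed graph whose invariant is 'no cycles'."""

    def __init__(self):
        self.successors = {}  # node -> set of successor nodes
        assert self.is_acyclic()

    def is_acyclic(self) -> bool:
        """Depth-first search; False if any back edge (cycle) is found."""
        WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, in progress, finished
        color = {node: WHITE for node in self.successors}

        def visit(node) -> bool:
            color[node] = GRAY
            for nxt in self.successors[node]:
                if color[nxt] == GRAY:
                    return False  # edge back into the current path
                if color[nxt] == WHITE and not visit(nxt):
                    return False
            color[node] = BLACK
            return True

        return all(color[n] != WHITE or visit(n) for n in self.successors)

    def add_edge(self, source, target) -> None:
        self.successors.setdefault(source, set()).add(target)
        self.successors.setdefault(target, set())
        # Class invariant, written as a Boolean expression with a function call.
        assert self.is_acyclic(), "invariant violated: graph has a cycle"


graph = DirectedGraph()
graph.add_edge("a", "b")
graph.add_edge("b", "c")
```

Adding an edge from "c" back to "a" would make the assertion fail, reporting the invariant violation at the point where it is introduced.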
The third occasion where defining contracts using Boolean expressions raises issues is non-functional contracts. Maybe I shouldn't go into this too much, because this is more of a research issue. Nevertheless, once you have completely expressed all the functional contracts (the contracts that specify what the input may be to operations, what outputs are legitimate, and what global consistency constraints must be maintained), you might still want to express as part of the specification things like: this operation will always execute in less than half a millisecond, or this particular component will never use more than 300 kilobytes. These are performance contracts, and it's a quite interesting research area. I think there will be usable results in this area in the next few years, but it falls beyond the realm of Design by Contract as it is generally understood in practice today.
Bill Venners: That was actually my next question: is performance part of the contract? It does seem that performance is important in the contract sense. People want quality of service guarantees.
Bertrand Meyer: Right. Absolutely. It's very important. I think everyone realizes this. The problem is that it is pretty difficult to ensure and test. And it also has to be defined properly. When you define, for example, a response time, is it a minimum response time? Is it a maximum response time? Is it an average response time? If so, what is the statistical distribution? I don't think these questions have no possible answers, but they make the issue quite delicate. And they have no direct counterpart with functional contracts. So performance contracts are definitely an area in which more work is needed and actually is being performed.