INVARIANTS play a central role in program development. Representative uses include refining a specification into a correct program, statically verifying properties such as type declarations, and runtime checking of invariants encoded as assert statements. Invariants play an equally critical role in software evolution. In particular, invariants can protect a programmer from making changes that inadvertently violate assumptions upon which the program's correct behavior depends. The near absence of explicit invariants in existing programs makes it all too easy for programmers to introduce errors while making changes.
An alternative to expecting programmers to annotate code with invariants is to automatically infer invariants. This research focuses on the dynamic discovery of invariants: The technique is to execute a program on a collection of inputs and infer invariants from captured variable traces. Fig. 1 shows the architecture of the Daikon invariant detector. As with other dynamic approaches, such as testing and profiling, the accuracy of the inferred invariants depends in part on the quality and completeness of the test cases; additional test cases might provide new data from which more accurate invariants can be inferred. The inference of invariants from program traces and its application to software evolution raises a number of technical questions. How can invariants be detected? Can the inference process be made fast enough? What kind of test suite is required to infer meaningful invariants? What techniques can be used to minimize irrelevant invariants that are unlikely to aid a programmer in the task at hand?
How can the required information be extracted from program runs? Can programmers productively use the inferred invariants in software evolution? This article provides partial answers to these questions in the form of three results stemming from our initial experiences with this approach.
The first result is a set of techniques for discovering invariants from execution traces and a prototype invariant detector, Daikon, that implements these techniques. Invariants are detected from program executions by instrumenting the source program to trace the variables of interest, running the instrumented program over a set of test cases, and inferring invariants over both the instrumented variables and over derived variables that are not manifest in the original program. The essential idea is to test a set of possible invariants against the values captured from the instrumented variables; those invariants that are tested to a sufficient degree without falsification are reported to the programmer. Section 3 discusses the invariant detection engine; the discussion of instrumentation is deferred to Section 8. The second result is the application of Daikon to two sets of target programs. The first set of programs appear in The Science of Programming .
These programs were derived from formal preconditions, postconditions, and loop invariants. Given runs of the program over randomly-generated inputs, Daikon discovers those same program properties, plus some additional ones (we introduce this result as motivation in Section 2).
This first experiment demonstrates that dynamic invariant detection produces invariants that are accurate. The second set of programs–C programs, originally from Siemens  and modified by Rothermel and Harrold –is not annotated with invariants, nor is there any indication that invariants were used explicitly in their construction. Section 4 shows how numeric invariants dynamically inferred from one of these programs assisted in understanding and changing it. This scenario also shows that dynamic invariant discovery is complementary to static techniques (which examine the program text but do not run the program).
This second experiment demonstrates that dynamic invariant detection produces invariants that are useful.
The third result is a quantitative analysis of scalability issues. The analysis demonstrates that inference running time is linearly correlated to the number of program points being traced, the square of the number of variables in scope at a program point, and the size of the test suite. Thus, choices of program points and variables over which to detect invariants can control invariant detection time. While there are many potential invariants, most of them are quickly falsified, contributing little to overall runtime. Experiments on test suite selection suggest that the set of invariants inferred tends to stabilize with growing test suite size, reducing the need for large test suites and, thus, limiting inference time. Section 6 correlates the number of invariants with program correctness.
Balamurugan Balusamy is currently working as a lecturer in school of computing sciences in vellore institute of technology(vit), Vellore, India.