Xisa is a static program analysis that automatically computes invariants about the memory state at each point in a given program. The abstract memory descriptions computed by Xisa are sufficiently precise for analyzing programs with recursive data structures, such as dynamically-allocated, pointer-based lists and trees. Yet, they are compact because they are based on program developer-supplied specifications that focus them to properties of interest.
What does an abstract memory state look like?
Xisa can present program invariants about the memory state in the form of graphical diagrams roughly comparable to those that would appear on a whiteboard or in a textbook. For example, the following shows a graph describing a memory state in an example that traverses a doubly-linked list:
Underneath, Xisa's memory abstraction is built around inductive predicates that are used to summarize a potentially infinite set of possible memory configurations. The user-supplied specifications essentially provide a base set of inductive predicates for describing a memory state.
More formally, we view an abstract memory state as a logical formula (expressed in separation logic) that can be visualized as a graphical diagram. Each node in the graph denotes a value, such as a memory address or null. Some nodes correspond to program variables (e.g., list, curr), while the numbered nodes correspond to unnamed heap values (i.e., existentially-quantified variables). A thin edge represents a single memory cell where the source node is the address of the cell and the destination node is the contents (i.e., a points-to relation). A thick edge summarizes potentially zero or more memory cells. The edge labels specify how those cells can be arranged (which will be explained in further detail in the next section). In the above example, the thick edges represent regions of doubly-linked list nodes. An important additional constraint on the graph is that each edge corresponds to a disjoint memory region.
To make the graphs easier to read, we do not show “terminal points-to edges”, that is, points-to edges where the destination node has no outgoing edges and no other incoming edges. For example, the non-pointer, data fields of 6 are not shown.
How is Xisa extensible?
The specifications we expect are analogous to data structure validation code that a developer might use for testing or dynamic analysis. For example, consider a program that uses a doubly-linked list. To test that operations preserve the doubly-linked invariant (i.e., for an element e, e.next.prev = e), we can write the following doubly-linked list validation function dllist (written in an untyped, pseudo-object-oriented style):
root.dllist(prev_elem) =
if (root = null) then
true
else
root.prev = prev_elem and
root.next.dllist(root)
This recursive function walks the doubly-linked list along the next field starting at root and checks that the prev field points to prev_elem. We call such a specification, a checker definition, and we believe that these specifications are fairly natural (which is critical to usability), yet a flexible and powerful way to parameterize a static shape analysis.
Checker definitions are global and provided on a per-program basis. They essentially customize the abstractions used by Xisa for the program being analyzed.
Returning to example graph in the previous section, a thick edge with no destination node indicates that the specified checker holds starting at the source node with the given arguments (i.e., the specified inductive predicate holds)—intuitively, the edge can be thought as a computation tree of a successful checker run or the derivation of the inductive predicate. Informally, a thick edge with a destination node says that the specified checker holds on a segment from the source node to the destination node; the label at the destination node gives what remains (see the papers for further details). Thus, in more detail, the example graph says that there is a doubly-linked list segment starting from 1 to 6, 6's next field points to 11, and 11 is a doubly-linked list. Furthermore, the pointer variable list points to the beginning (to 1) and curr points in the middle (to 6). Note that the label dllist(6) on the checker edge from 11 tells us that 11's prev field must point to 6 (if it is non-null) based on the definition of dllist given above.
Are there examples?
Below are a number of examples of applying Xisa to code that manipulates various kinds of lists and trees. The analysis results are shown as the analyzed program with links to show the memory state at various program points. You can choose to show the memory state at all program points, though note that it can take a few seconds to load all the graphs at once.
| singly-linked lists | |
| list.find.c | find an element |
| list.create.c | create a list |
| list.copy.c | copy a list |
| list.reverse.c | reverse a list |
| binary search trees | |
| bst.find.c | find an element |
| doubly-linked lists | |
| dll.find.c | find an element |
| dll.copy.c | copy a list |
| dll.create.c | create a list |
| dll.reverse.c | reverse a doubly-linked list in place |
| dll.inorderinsert.c | insert an element in a sorted list |
| dll.findnbackloop.c | find an element and walk back from the element |
| trees with parent pointers | |
| rbtree.find.c | find an element |
| rbtree.insertnorebal.c | insert an element (without rebalancing) |
| rbtree.findnbackloop.c | find an element and walk back from the element |
What is the theoretical basis for Xisa?
Xisa is a shape analysis based on an abstract interpretation where the memory states are expressed as logical formulas in a fragment of separation logic. Generalized inductive predicates are used to summarize memory regions. To enable strong updates, materialization is performed by unfolding inductive definitions, and to ensure termination of the analysis, we use a binary widening operator to identify regions that should be folded as inductive predicates. Check out our papers for further details.
The dllist checker given above is written as the following inductive definition in separation logic (with the same pseudo-object-oriented style for the root parameter):
root.dllist(prev_elem) = ∃n.
(emp
∧ root = null)
∨
(root.prev |-> prev_elem ∗ root.next |-> n ∗ n.dllist(root)
∧ root ≠ null)
This view makes explicit the disjointness constraint on graphs, which was left implicit in the pseudo-code notation. Thus, in the computational view, a checker run may read an object field only once during a traversal (i.e., fields are dereferenced linearly).
How is Xisa implemented?
Xisa analyzes C programs. It is implemented in OCaml using the CIL infrastructure for C program analysis and transformation. The memory graph viewer uses the Flare visualization toolkit.