Prerequisites:
  1. Structured Representation and Structured Objects
  2. Scoped Objects
  3. Program Points

Lattices

Following the Abstract Interpretation theory, lattices are the central data structure produced by the analysis. All values returned by domains, fixpoints, and the Interprocedural Analysis are instances of the Lattice interface. This page presents the Lattice interface, its prerequisites, its main implementations and usages.

 Note:
This page contains class diagrams. Interfaces are represented with yellow rectangles, abstract classes with blue rectangles, and concrete classes with green rectangles. After type names, type parameters are reported, but their bounds are omitted for clarity. Only public members are listed in each type: the + symbol marks instance members, the * symbol marks static members, and a ! in front of the name denotes a member with a default implementation. Method-specific type parameters are written before the method name, wrapped in <>. When a class or interface has already been introduced in an earlier diagram, its inner members are omitted.

The Lattice Interface

The Lattice interface inherits from StructuredObject (see the interface definition) and represents an ordered structure.

The Lattice interface

The Lattice interface is designed such that each object of a type that implements Lattice is an element of a partially ordered set that:

Other methods are instead LiSA-specific:

Lattice is parametric to the parameter L that must be a subtype of Lattice<L>. This is a common pattern in Java to represent self-referential types, and it allows to define methods that take or return instances of the same type as the implementing class without the need of downcasting. Note that, despite the name of the interface, implementations do not need to be lattices: most methods have default implementations that reduce the requirements:

The minimum requirements are hence those of a complete partial ordered set, with the addition of a top element for representing unknown values (e.g., user inputs or non-deterministic values).

 Important:
LiSA will only use isTop and isBottom to identify top and bottom elements. Therefore, if your lattice’s top and bottom methods do not return constant values, you must override these methods to correctly identify them.

Common Lattice Instances

When implementing lattices, some common patterns or structures occur frequently. LiSA tries to reduce the effort required to implement these common cases by providing some ready-to-use lattice implementations that can be extended or composed to build more complex lattices.

The BaseLattice Interface

The Base Lattice Interface

The first provided implementation comes as the BaseLattice interface, that is parametric to the parameter L (the lattice type, that must extend BaseLattice<L>) and that extends Lattice<L>. BaseLattice provides default implementations for most methods of the Lattice interface, each handling base cases: in fact, regardless of the lattice structure, if one of the two elements is either top or bottom, or if the two elements are equal, the result of any operation is trivial. Therefore, BaseLattice implements these cases:

In all cases where the return value is not determined by these base cases, BaseLattice delegates the computation to the corresponding auxiliary method (i.e., lessOrEqual delegates to lessOrEqualAux, lub to lubAux, and so on), that are the methods where the actual logic of the lattice is to be implemented. Most auxiliary methods have default implementations that mimik the ones of Lattice: upchainAux and wideningAux default to lubAux, downchainAux and narrowingAux default to glbAux, and glbAux defaults to returning the bottom element.

 Tip:
When implementing a lattice, it is highly recommended to implement BaseLattice instead of directly implementing Lattice, as it reduces the amount of code to be written and the chances of errors.

Powersets and Functions

Common Lattice Instances

The BaseLattice interface is implemented by three common lattice structures:

SetLattice and InverseSetLattice are abstract classes with two type parameters: S, the concrete type of the lattice (that must extend SetLattice<S, E> or InverseSetLattice<S, E>, respectively), and E, the type of the elements contained in the set. Both classes implement BaseLattice<S>, meaning that all lattice operations will return/accept instances of S. They implement the auxiliary methods by performing the corresponding set operations (i.e., subset inclusion for lessOrEqualAux, union/intersection for lubAux and glbAux, and so on), with only top and bottom left to implement in subclasses. The actual sets are tracked in the public field elements, of type Set<E>.

FunctionalLattice is an abstract class with three type parameters: F, the concrete type of the lattice (that must extend FunctionalLattice<F, K, V>), K, the type of the keys in the mapping, and V, the type of the values in the mapping (that must extend Lattice<V>). FunctionalLattice implements BaseLattice<F>, meaning that all lattice operations will return/accept instances of F. It implements the auxiliary methods by applying functional lifting to the lattice operations of the type V (i.e., pointwise comparison for lessOrEqualAux, pointwise least upper bound/greatest lower bound for lubAux/glbAux, and so on), with only top and bottom left to implement in subclasses. The actual mapping is tracked in the public field function, of type Map<K, V>. FunctionalLattice allows flexibility when accessing keys that are not present in the mapping: by overriding the stateOfUnknown method, subclasses can define what value of type V should be assumed for unknown keys.

 Warning:
The concrete type of the sets and maps contained in SetLattice, InverseSetLattice, and FunctionalLattice might be mutable. It is the responsibility of the users of these classes to ensure that their contents are not modified after construction, as this could lead to incorrect results of fixpoint computations.

To avoid creating subclasses for cases where no additional logic is required, LiSA also provides two concrete implementations of the above classes, with GenericSetLattice, GenericInverseSetLattice, and GenericMapLattice.

Products

Classes for building Products

Products of existing lattices can be easily built by constructing new lattices that hold references to them in their fields and implement lattice operators by delegation. LiSA provides such structure for binary products through the CartesianCombination and the LatticeProduct classes.

CartesianCombination is an abstract class parametric to the concrete type C of the combination, and to the two lattices T1 extends Lattice<T1> and T2 extends Lattice<T2> that are combined. CartesianCombination implements BaseLattice<C>, meaning that all lattice operations will return/accept instances of C. The class has two public fields, first and second, that hold the two components of the combination. The auxiliary methods are implemented by delegating the calls to the corresponding methods of the two components, and combining the results into a new instance of C. Instances of C are created through the abstract method mk, that allows to combine the components into the concrete type while possibly applying reductions. In fact, any reduction excluding ones in the lessOrEqual method can be applied in the mk method, as all lattice operations pass through it. If no reductions are required, the LatticeProduct class can be used instead. It is a concrete class parametric to the two lattices T1 extends Lattice<T1> and T2 extends Lattice<T2> that are combined, and it extends CartesianCombination<LatticeProduct<T1, T2>, T1, T2>. The class simply implements the mk method by returning a new instance of itself with the given components.

Note that the reductions mentioned here are intended as ones that must be applied on lattice operators only. Analyses using these products can apply any arbitrary reduction when evaluating instructions, regardless of the class chosen for the lattice structure.

 Tip:
The same structure offered by the above classes can be replicated for n-ary products by either increasing the number of fields or storing each component in an array. LiSA does not provide such classes by default.

Domain Lattices

A DomainLattice is a lattice that is designed to be used by a Semantic Domain to track information about program states. The main features of such lattices are that they track information about program variables (called Identifiers in Symbolic Expressions terms) and that they can be scoped. Scoping is a mechanism provided by LiSA to isolate parts of a lattice element when entering a new context (e.g., a function call) and to restore them when exiting the context. Scoping is essential to implement Interprocedural Analyses, as it allows to track local variables without polluting the global state.

Domain Lattices

Scoping logic is provided by the ScopedObject interface (see the interface definition) For instance, a pushScope implementation on a FunctionalLattice using Identifers as keys would produce a new FunctionalLattice where all Identifiers are renamed to isolate the new scope, while keeping the values unchanged.

The DomainLattice interface is parametric to the type parameters L (the concrete type of the lattice, that must extend DomainLattice<L, T>) and T (the type of value returned by scope operations), and it extends Lattice<L> and ScopedObject<T>. DomainLattice adds four methods that deal with Identifiers:

The Abstract Lattice

An AbstractLattice is a lattice that holds the information tracked by an Abstract Domain about program variables. This is the interface that allows configuration of the analysis state: the concrete implementation of this interface is determined by the analysis configuration, and it is wrapped into the ProgramState and the AnalysisState for LiSA to use. An AbstractLattice is parametric to the type parameter L (the concrete type of the lattice) and it extends DomainLattice<L, L>. The interface defines three methods to manipulate the state:

These serve as quick ways to forget parts of the analysis state when needed, e.g., when analyzing function calls to external libraries.

 Tip:
LiSA provides subtypes of CartesianCombination to quickly implement AbstractLattices as products.

The Program State

The ProgramState is a snapshot of the the state of the program at a given program point. It is mainly a wrapper around the configurable AbstractLattice instance, that holds the actual information tracked by the analysis. ProgramState is a class, parametric to the type parameter A that is the concrete type of the AbstractLattice used by the analysis, that implements DomainLattice<ProgramState<A>, ProgramState<A>> and BaseLattice<ProgramState<A>>. This means that all lattice operators, scoping operations, and identifier manipulations return instances of ProgramState<A>.

The class has three fields: the state, an instance of A that holds the actual information tracked by the analysis; the info, an instance of FixpointInfo (a FunctionalLattice from string keys to heterogeneous Lattice instances) that holds optional auxiliary information produced by the analysis, and the computedExpressions, a set of SymbolicExpressions that represent the result of the last evaluation performed by the analysis. The info field can be used to track program-specific information by Frontends, such as the set of already initialized classes in Java programs, or the set of aliases introduced by Python’s import ... as ... statements. Instead, computedExpressions typically holds the last expression(s) evaluated by the analysis that symbolically represent the contents of the operand stack.

ProgramState implements lattice operations by propagating the calls to its inner components (that are lattices as well) and combining the results into a new ProgramState instance. Scoping operations and identifier manipulations are implemented by forwarding the calls to the inner state field. Additionally, it provides accessors for its components, methods to change the computedExpressions without modifying the rest of the state, and getters and setters for the info field’s mappings.

The Analysis State

The AnalysisState is a wrapper around several ProgramState that correspond to different states of the execution, and is the class that is used by LiSA to track and manipulate the results of semantic computations. The AnalysisState can be viewed as a function from continuations (in their compiler-theoretic meaning) to instances of ProgramState. This allows the analysis to track multiple possible execution paths at the same program point, e.g., when reaching an instruction with a normal state or with an exceptional state. The supported continuations are reflected in this class’ fields:

The latter are stored in a map (i.e., a GenericMapLattice) from Errors (a pair of an error Type and a Statement that raised the error) to the corresponding ProgramStates. LiSA also offers the possibility to smash uninteresting error continuations into a single one through the Configuration. This is useful to reduce the amount of tracked states when the analysis is not interested in specific error types. When smashing is enabled, all smashed error continuations are merged into a single ProgramState, stored in the smashedErrorsState field, and the instruction that raised the smashed errors are tracked in the smashedErrors field (a GenericMapLattice from Types to sets of Statements).

AnalysisState is a class, parametric to the type parameter A that is the concrete type of the AbstractLattice used by the analysis, that implements DomainLattice<AnalysisState<A>, AnalysisState<A>> and BaseLattice<AnalysisState<A>>. This means that all lattice operators, scoping operations, and identifier manipulations return instances of AnalysisState<A>.

Similarly to the ProgramState, AnalysisState implements lattice operations by propagating the calls to its inner components and combining the results into a new AnalysisState instance. Additionally, it provides accessors for its components, accessors for the components of the execution state, proxies for the execution’s ProgramState methods (e.g., getExecutionInfo and withTopMemory) and methods to add or remove new errors and smashed errors. AnalysusState is the lattice instance managed and produced by LiSA’s Analysis.

Recall that AnalysisState, ProgramState, and AbstractLattice all represent the state of the program at a given program point, and to not contain the logic to update it when executing instructions. This logic is instead contained in Semantic Domains, that manipulate these lattices to reflect the effect of instructions on the program state.