Common Interfaces for the Analysis Structure
This page contains information about common interfaces used throughout the analysis infrastructure of LiSA, that are applied to several components to model shared properties, requirements and behaviors.
This page contains class diagrams. Interfaces are represented with yellow rectangles, abstract classes with blue rectangles, and concrete classes with green rectangles. After type names, type parameters are reported, but their bounds are omitted for clarity. Only public members are listed in each type: the
+ symbol marks instance members, the * symbol marks
static members, and a ! in front of the name denotes a member with a default
implementation. Method-specific type parameters are written before the method
name, wrapped in <>. When a class or interface has already been introduced in
an earlier diagram, its inner members are omitted.
The Structured Representation Interface
A StructuredRepresentation is a way to represent the contents of a complex
object in a structured way, such that it is (i) independent of its source,
(ii) comparable with other representations (potentially originating from
a different source), and (iii) serializable. StructuredRepresentations
are mainly used to produce human-readable representations of Lattice
elements, and to serialize them in output files in a single format so that
several visualization tools can be built on top of the same output.
StructuredRepresentation is an abstract class with five concrete
subtypes:
- StringRepresentation: a representation of any object as a string;
- SetRepresentation: a representation of any object as a sorted set of StructuredRepresentation elements;
- ListRepresentation: a representation of any object as an ordered list of StructuredRepresentation elements;
- MapRepresentation: a representation of any object as a map from StructuredRepresentation keys to StructuredRepresentation elements;
- ObjectRepresentation: a representation of any object as a named collection of fields, each field being a StructuredRepresentation element.
Instances of these classes only need to be created with the appropriate values: they then automatically provide the required functionality (such as comparability and serializability).
A StructuredObject is any object that can produce a StructuredRepresentation
of itself. The Lattice interface extends the StructuredObject interface,
meaning that all lattices can produce a structured representation of
themselves through the representation method.
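To make the idea concrete, the following is a minimal sketch of how a structured representation can be independent of its source, comparable, and serializable. It does not use LiSA's classes: ReprSketch, StringSketch, ListSketch, ObjectSketch, StructuredObjectSketch, IntervalSketch and the serialize method are all hypothetical names introduced for illustration, and only three of the five subtypes are mirrored. The only name taken from the text above is the representation method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for StructuredRepresentation (not LiSA's API).
abstract class ReprSketch implements Comparable<ReprSketch> {
    // Canonical text form; it plays the role of the serialized output in this sketch.
    abstract String serialize();

    @Override
    public int compareTo(ReprSketch other) {
        // Comparison only looks at the canonical form, never at the source object.
        return serialize().compareTo(other.serialize());
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof ReprSketch
                && serialize().equals(((ReprSketch) obj).serialize());
    }

    @Override
    public int hashCode() {
        return serialize().hashCode();
    }
}

// Any object rendered as a string.
final class StringSketch extends ReprSketch {
    private final String value;

    StringSketch(Object value) {
        this.value = String.valueOf(value);
    }

    @Override
    String serialize() {
        return value;
    }
}

// An ordered list of nested representations.
final class ListSketch extends ReprSketch {
    private final List<ReprSketch> elements;

    ListSketch(List<ReprSketch> elements) {
        this.elements = new ArrayList<>(elements);
    }

    @Override
    String serialize() {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (ReprSketch element : elements)
            joiner.add(element.serialize());
        return joiner.toString();
    }
}

// A named collection of fields; a TreeMap keeps the field order deterministic.
final class ObjectSketch extends ReprSketch {
    private final Map<String, ReprSketch> fields;

    ObjectSketch(Map<String, ReprSketch> fields) {
        this.fields = new TreeMap<>(fields);
    }

    @Override
    String serialize() {
        StringJoiner joiner = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, ReprSketch> field : fields.entrySet())
            joiner.add(field.getKey() + ": " + field.getValue().serialize());
        return joiner.toString();
    }
}

// Hypothetical counterpart of StructuredObject.
interface StructuredObjectSketch {
    ReprSketch representation();
}

// A toy lattice-like element that describes itself as an object with two fields.
final class IntervalSketch implements StructuredObjectSketch {
    private final int low;
    private final int high;

    IntervalSketch(int low, int high) {
        this.low = low;
        this.high = high;
    }

    @Override
    public ReprSketch representation() {
        Map<String, ReprSketch> fields = new TreeMap<>();
        fields.put("low", new StringSketch(low));
        fields.put("high", new StringSketch(high));
        return new ObjectSketch(fields);
    }
}
```

In this sketch, two IntervalSketch instances with the same bounds yield equal representations even though they are distinct objects, which is the property that lets results coming from different sources be compared.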
The Scoped Object Interface
The ScopedObject interface defines the operations common to objects that can be scoped.
Scoping is a mechanism provided by LiSA to
isolate parts of an object when entering a new context (e.g., a function
call) and to restore them when exiting the context. Scoping is essential to
implement Interprocedural Analyses,
as it allows tracking the caller’s variables without polluting the callee’s state.
ScopedObject is parametric on the
type T that is returned by its methods. The interface defines two
methods: pushScope, which returns a new instance of the object where
all information contained in it is hidden by the given scope token,
and popScope, which restores information in the receiver by removing
the scope specified by the token parameter.
Implementations of these methods usually manipulate program variables
(called Identifiers in
Symbolic Expressions
terms) by applying a sort of renaming: since SymbolicExpressions are
instances of ScopedObject, pushScope and popScope implementations
should recursively invoke these methods on all symbolic expression references
they contain. This will cause an identifier x to be renamed to [scope]x,
such that it won’t conflict with later definitions of x in inner scopes.
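The renaming described above can be sketched as follows. This is an illustration under stated assumptions, not LiSA's implementation: ScopedSketch, IdSketch and StateSketch are hypothetical types, scope tokens are reduced to plain strings, and the choice to drop identifiers that are not hidden by the popped token is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical mirror of ScopedObject<T>; scope tokens are plain strings here.
interface ScopedSketch<T> {
    T pushScope(String token);

    T popScope(String token);
}

// Stand-in for an identifier appearing in symbolic expressions.
final class IdSketch implements ScopedSketch<IdSketch> {
    final String name;

    IdSketch(String name) {
        this.name = name;
    }

    @Override
    public IdSketch pushScope(String token) {
        // x becomes [token]x, so a later definition of x cannot clash with it.
        return new IdSketch("[" + token + "]" + name);
    }

    @Override
    public IdSketch popScope(String token) {
        String prefix = "[" + token + "]";
        // Assumption of this sketch: identifiers that were not hidden by this
        // token do not survive the pop, which is signalled by returning null.
        return name.startsWith(prefix)
                ? new IdSketch(name.substring(prefix.length()))
                : null;
    }
}

// Stand-in for an abstract state mapping identifier names to integer values.
final class StateSketch implements ScopedSketch<StateSketch> {
    final Map<String, Integer> values = new LinkedHashMap<>();

    // Returns a copy of this state where the given name is bound to the value.
    StateSketch with(String name, int value) {
        StateSketch copy = new StateSketch();
        copy.values.putAll(values);
        copy.values.put(name, value);
        return copy;
    }

    @Override
    public StateSketch pushScope(String token) {
        // Delegate to every identifier the state contains.
        StateSketch result = new StateSketch();
        for (Map.Entry<String, Integer> entry : values.entrySet())
            result.values.put(new IdSketch(entry.getKey()).pushScope(token).name, entry.getValue());
        return result;
    }

    @Override
    public StateSketch popScope(String token) {
        // Restore hidden identifiers and forget the ones local to the inner scope.
        StateSketch result = new StateSketch();
        for (Map.Entry<String, Integer> entry : values.entrySet()) {
            IdSketch restored = new IdSketch(entry.getKey()).popScope(token);
            if (restored != null)
                result.values.put(restored.name, entry.getValue());
        }
        return result;
    }
}
```

With these definitions, a caller state binding x to 1 can be pushed with token call1, extended with the callee's own x, and popped again: the callee's x is discarded and the caller's binding reappears under its original name.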
Scopes are identified by ScopeToken instances, which are wrappers around a
CodeElement (i.e., any program construct that has a position in the source
program). This makes it easy to identify scopes with program constructs
like function calls. Both CodeElement and ProgramPoint are defined in the
next section.
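As a rough illustration of why wrapping a program construct is convenient, the sketch below models a token whose identity comes from the element it wraps. All names (ElementSketch, CallSketch, TokenSketch, getScoper) are hypothetical, the location is simplified to a string, and comparing tokens by location is an assumption of this sketch rather than a description of ScopeToken.

```java
import java.util.Objects;

// Hypothetical, reduced view of a code element: only its location, as text.
interface ElementSketch {
    String getLocation();
}

// A function call found at a given position of the program.
final class CallSketch implements ElementSketch {
    private final String location;

    CallSketch(String location) {
        this.location = location;
    }

    @Override
    public String getLocation() {
        return location;
    }
}

// Hypothetical token: a thin wrapper whose identity is the wrapped element.
final class TokenSketch {
    private final ElementSketch scoper;

    TokenSketch(ElementSketch scoper) {
        this.scoper = Objects.requireNonNull(scoper);
    }

    ElementSketch getScoper() {
        return scoper;
    }

    @Override
    public boolean equals(Object obj) {
        // Assumption: two tokens denote the same scope if they wrap the same location.
        return obj instanceof TokenSketch
                && scoper.getLocation().equals(((TokenSketch) obj).scoper.getLocation());
    }

    @Override
    public int hashCode() {
        return scoper.getLocation().hashCode();
    }

    @Override
    public String toString() {
        // Rendered as the prefix used when renaming identifiers.
        return "[" + scoper.getLocation() + "]";
    }
}
```

Two calls at different positions then produce different tokens, so variables hidden by one call cannot be restored by mistake when returning from the other.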
Minimal Program Components
To reduce dependencies between the analysis structure and the program structure, methods of analysis components that need to refer to program constructs use (when possible) minimal interfaces that expose only the necessary information.
Three such interfaces are used throughout the analysis structure:
- CodeLocation: instances of this interface represent a position in the source program; it exposes a single method, getCodeLocation, that returns a textual representation of that location; note that since the program might be composed of either source files or binary files, no structure is imposed on CodeLocations, as they might point to lines in a source file or offsets in a binary file;
- CodeElement: instances of this interface represent program constructs that have a position in the source program; since the program might be composed of either source files or binary files, the structure of a CodeElement is minimal, exposing only the getLocation method, which returns the CodeLocation where the element is defined;
- ProgramPoint: instances of this interface represent specific points inside a control flow graph (CFG) that is part of a Unit of the Program; the main objective of this interface is to provide a way for analysis components to retrieve the Program where an instruction lies, so that it can be queried for language-specific properties.
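The benefit of keeping these interfaces minimal can be sketched as follows. The interfaces are re-declared locally so that the example is self-contained: LocationSketch and CodeElementSketch only mirror the getCodeLocation and getLocation methods named above, while SourceLineSketch, BinaryOffsetSketch, ReporterSketch and the textual formats they produce are hypothetical and chosen for illustration only.

```java
// Local stand-in for CodeLocation: a position with a textual form and no other structure.
interface LocationSketch {
    String getCodeLocation();
}

// Local stand-in for CodeElement: a construct that knows where it is defined.
interface CodeElementSketch {
    LocationSketch getLocation();
}

// Hypothetical location pointing into a source file.
final class SourceLineSketch implements LocationSketch {
    private final String file;
    private final int line;
    private final int column;

    SourceLineSketch(String file, int line, int column) {
        this.file = file;
        this.line = line;
        this.column = column;
    }

    @Override
    public String getCodeLocation() {
        return file + ":" + line + ":" + column;
    }
}

// Hypothetical location pointing into a binary file.
final class BinaryOffsetSketch implements LocationSketch {
    private final String file;
    private final long offset;

    BinaryOffsetSketch(String file, long offset) {
        this.file = file;
        this.offset = offset;
    }

    @Override
    public String getCodeLocation() {
        return file + "@0x" + Long.toHexString(offset);
    }
}

// An analysis-side component that depends only on the minimal interfaces.
final class ReporterSketch {
    static String describe(CodeElementSketch element, String message) {
        return element.getLocation().getCodeLocation() + ": " + message;
    }
}
```

Because ReporterSketch never inspects how a location is structured, the same method works unchanged for elements coming from source files and from binaries.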
Read more about CFGs, Units and Programs in their dedicated pages.