Prerequisites:
  1. Units
  2. Control Flow Graphs
  3. Statements, Expressions, and Edges
  4. Types
  5. Annotations

Language Features

A static analysis framework must cope with the diversity of programming languages: how calls are resolved, how arguments are matched to parameters, how the analysis state is scoped when entering a function, how errors are propagated, and what structural invariants a well-formed program must satisfy all vary from one language to the next. In LiSA, these concerns are encapsulated in a set of pluggable strategy interfaces that are grouped together by the LanguageFeatures abstract class and attached to a Program at construction time.

This page describes LanguageFeatures and each of the five strategy interfaces it composes: the parameter matching strategy used to resolve calls, the hierarchy traversal strategy used to walk type hierarchies, the parameter assigning strategy that prepares analysis state before entering a callee, the scoping strategy that manages variable visibility across call boundaries, and the program validation logic that checks structural correctness before the analysis begins.

 Note:
This page contains class diagrams. Interfaces are represented with yellow rectangles, abstract classes with blue rectangles, and concrete classes with green rectangles. After type names, type parameters are reported, but their bounds are omitted for clarity. Only public members are listed in each type: the + symbol marks instance members, the * symbol marks static members, and a ! in front of the name denotes a member with a default implementation. Method-specific type parameters are written before the method name, wrapped in <>. When a class or interface has already been introduced in an earlier diagram, its inner members are omitted.

The LanguageFeatures class

LanguageFeatures is the abstract class that a frontend must subclass to declare the language-specific behaviour of the program it is modeling. An instance of LanguageFeatures is passed to Program and exposed through Program.getFeatures(), making it available throughout the analysis engine.

LanguageFeatures abstract class

Three methods are abstract and must be implemented by every frontend:

Two further methods have default implementations and may optionally be overridden:

Call Resolution

When LiSA encounters an unresolved call, it must determine which concrete code member(s) the call may target. This involves two steps: traversing the type hierarchy to find candidate units (via HierarchyTraversalStrategy), and then checking each candidate’s signature against the call (via ParameterMatchingStrategy). This section covers the matching step.

ParameterMatchingStrategy hierarchy

ParameterMatchingStrategy is the interface that decides whether a call’s actual arguments are compatible with a candidate callee’s formal parameters. It exposes two methods:

Fixed-Order Strategies

FixedOrderMatchingStrategy is an abstract class that implements ParameterMatchingStrategy for calls where the number of actuals must equal the number of formals and each argument is matched positionally. Its concrete matches implementation checks that formals.length == actuals.length and then delegates each position to the abstract single-position method matches(call, pos, formal, actual, types), that checks whether the single actual argument at position pos, with the given runtime type set, is compatible with the corresponding formal parameter.

Three ready-to-use singleton implementations are provided:

Python-Like Matching

PythonLikeMatchingStrategy implements ParameterMatchingStrategy for languages that support default parameter values and keyword (named) arguments. It is constructed with a FixedOrderMatchingStrategy delegate that is used once the positional and keyword slots have been filled in.

Its matches implementation performs matching in three phases:

  1. positional arguments — actual arguments are assigned to the corresponding formal slots in order, stopping as soon as a NamedParameterExpression is encountered (i.e., an expression of the form par_name=some_value);
  2. keyword arguments — each remaining NamedParameterExpression is matched to the formal whose name equals the parameter name and assigned to that slot;
  3. default values — any formal slot left unfilled is assigned its declared default value expression; if no default exists the call is rejected.

The logic for these three phases is factored into the generic static method pythonLogic, which operates over arbitrary element types T and failure values F. This allows the same algorithm to be shared between PythonLikeMatchingStrategy (which fills Expression slots) and PythonLikeAssigningStrategy (which fills ExpressionSet slots during the analysis).

Parameter Assignment

Once a callee has been resolved, LiSA must bind the actual arguments to the formal parameters in the analysis state before starting the fixpoint on the callee’s CFG. The ParameterAssigningStrategy encapsulates this binding.

ParameterAssigningStrategy hierarchy

ParameterAssigningStrategy declares one abstract generic method:

Two implementations are provided:

OrderPreservingAssigningStrategy.INSTANCE assigns each actual expression to the corresponding formal variable in order. For each formal, it computes the join over all symbolic expressions in the corresponding ExpressionSet, using the abstract domain’s assign operation to bind each expression to the formal’s symbolic variable. This strategy is appropriate for any language where arguments are passed positionally with no defaults or keyword arguments.

PythonLikeAssigningStrategy.INSTANCE extends the positional strategy to handle default values and keyword arguments. It first evaluates the default value expression for each formal that has one (running forward semantics on the default expression to obtain its ExpressionSet), then uses the same pythonLogic algorithm from PythonLikeMatchingStrategy to fill in the ExpressionSet slots for each formal. Only after the slots are filled does it run the order-preserving assignment loop.

Hierarchy Traversal

When a yet-to-be-resolved call (i.e., an UnresolvedCall) needs to be resolved, LiSA must know which units to search for a matching callee. The HierarchyTraversalStrategy determines the order in which ancestor units are visited.

HierarchyTraversalStrategy

HierarchyTraversalStrategy is an interface that defines a single method, traverse(st, start), that returns an Iterable<CompilationUnit> that enumerates the units to search, starting from start (the static receiver type) and walking up or across the hierarchy as appropriate for the language.

SingleInheritanceTraversalStrategy.INSTANCE implements a breadth-first walk of the ancestor hierarchy starting from the given unit. At each step it visits the current unit and its immediate ancestors. This strategy is suitable for single-inheritance languages (e.g., Java classes), where at most one superclass exists at each level, though it correctly handles the presence of interfaces by processing all immediate ancestors.

 Note:
For languages with multiple inheritance, implement a custom HierarchyTraversalStrategy that defines the linearisation order appropriate for the language (e.g., C3 linearisation for Python). The traversal strategy only determines the search order; the final selection among matching candidates is handled by the matching and assigning strategies.

Scoping

When the analysis enters a callee, local variables of the caller must not interfere with local variables of the callee that share the same name. ScopingStrategy manages this by pushing and popping variable scopes around each call.

ScopingStrategy

ScopingStrategy is an interface with two abstract methods:

DefaultScopingStrategy implements both methods. Its scope implementation calls AnalysisState.pushScope to push the scope token onto the analysis state and applies ExpressionSet.pushScope to each actual expression so that the symbolic names of the actuals are correctly translated into the callee’s scope. Its unscope implementation checks whether the call returns a value: if not, it simply pops the scope; otherwise, it assigns each return expression to the call’s meta-variable (joining over all return expressions) and then pops the scope.

Program Validation

Before starting the analysis, LiSA validates the Program to catch structural errors introduced by the frontend. The ProgramValidationLogic interface encapsulates this validation pass.

ProgramValidationLogic and BaseValidationLogic

ProgramValidationLogic is an interface with one abstract method: validateAndFinalize(program). The latter inspects the program’s units, code members, globals, and entry points for structural violations and throws a ProgramValidationException describing the first violation found. It also finalizes the program (e.g., populating override and annotation chains) so that the analysis engine can assume a fully resolved program.

BaseValidationLogic is the default implementation. It provides a family of overridable validateAndFinalize and validate methods organized by the type of element being checked that perform the minimal checks on each component:

CodeMember.validate() results in calls to CFG.validate(), that (i) ensures that all control flow structures are well-formed (e.g., all nodes are actually part of the CFG), (ii) ensures no execution-stopping nodes (i.e., ones where stopsExecution() returns true) have successors, (iii) ensures that nodes that do not stop execution have at least one successor, and (iv) checks that all entrypoints of the CFG are actual nodes of the CFG. Then, it launches validation of the inner NodeList, which checks that all edges’ endpoints are part of the NodeList.

Frontends that model a language with additional structural constraints (e.g., a no-cyclic-inheritance rule or restrictions on interface method visibility) should subclass BaseValidationLogic and override the relevant validateAndFinalize overload. The processedUnits set ensures that each unit is only visited once even when the overriding method calls super.

 Important:
validateAndFinalize is called exactly once per analysis, just before the fixpoint computation starts. At the moment it is called, the program graph is fully constructed: all units, code members, globals, and entry points are already in place. After validateAndFinalize returns, the override and annotation chains are considered final and must not be modified.