Prerequisites:
  1. Minimal Program Components
  2. Statements, Expressions, and Edges
  3. Annotations
  4. Control Flow Graphs

Units

A Unit is a named container that groups together the globals and code members of a logical program entity. Units represent the structural backbone of a program in LiSA: they correspond to concepts such as files, modules, classes, and interfaces, depending on the language being analyzed. Every CFG and every Global belongs to exactly one unit. Units are gathered into a Program, and one or more programs form an Application that LiSA analyzes as a whole.

This page describes the unit hierarchy, from the abstract Unit base class and its associated globals, through the flat grouping units, up to the CompilationUnit family that models object-oriented type hierarchies.

 Note:
This page contains class diagrams. Interfaces are represented with yellow rectangles, abstract classes with blue rectangles, and concrete classes with green rectangles. After type names, type parameters are reported, but their bounds are omitted for clarity. Only public members are listed in each type: the + symbol marks instance members, the * symbol marks static members, and a ! in front of the name denotes a member with a default implementation. Method-specific type parameters are written before the method name, wrapped in <>. When a class or interface has already been introduced in an earlier diagram, its inner members are omitted.

The Unit class

The Unit abstract class is the common base for all units in LiSA. It groups together a set of globals (variables or constants scoped to the unit) and a set of code members (functions, methods, or procedures contained in it). Within a single unit, each global is uniquely identified by its name, and each code member is uniquely identified by its full signature.

Figure: Unit class and globals

Unit provides a uniform API for accessing and searching its contents:

Two abstract methods complete the interface: canBeInstantiated() returns true if instances of the unit can be created at runtime (for example, when the unit models a concrete class), and getProgram() returns the Program this unit belongs to.
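The uniqueness rules described above (globals keyed by name, code members keyed by full signature) can be illustrated with a small standalone sketch. It does not use LiSA's classes: SketchUnit, addGlobal, and addCodeMember are hypothetical names, and rejecting a duplicate by returning false is an assumption made for the example, not documented LiSA behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a unit: this is NOT LiSA's Unit class.
class SketchUnit {
    private final String name;
    // Globals are keyed by name, code members by full signature.
    private final Map<String, String> globals = new LinkedHashMap<>();
    private final Map<String, String> codeMembers = new LinkedHashMap<>();

    SketchUnit(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Returns true if the global was added, false if the name was already taken.
    boolean addGlobal(String globalName, String type) {
        return globals.putIfAbsent(globalName, type) == null;
    }

    // Returns true if the member was added, false if the signature was already taken.
    boolean addCodeMember(String signature, String description) {
        return codeMembers.putIfAbsent(signature, description) == null;
    }

    int globalCount() {
        return globals.size();
    }

    int codeMemberCount() {
        return codeMembers.size();
    }
}

class SketchUnitDemo {
    public static void main(String[] args) {
        SketchUnit unit = new SketchUnit("utils");
        System.out.println(unit.addGlobal("counter", "int"));   // first registration
        System.out.println(unit.addGlobal("counter", "long"));  // same name again
        System.out.println(unit.addCodeMember("max(int,int)", "integer maximum"));
        System.out.println(unit.addCodeMember("max(double,double)", "floating-point maximum"));
    }
}
```

Two overloads of max coexist because their signatures differ, while the second counter is refused because only the name identifies a global.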

Globals

A Global is a variable or field scoped to a unit. It records the variable’s name, static type (getStaticType()), source location (getLocation()), annotations (getAnnotations()), and the unit that contains it (getContainer()). The isInstance() flag distinguishes instance fields (belonging to each object of the unit) from static globals (belonging to the unit itself). Given a CodeLocation, the toSymbolicVariable() method produces the GlobalVariable symbolic expression used to represent accesses to the global during the analysis.

A ConstantGlobal is a Global that is bound to a fixed, statically known value. It extends Global with a getConstant() method that returns the Constant expression holding that value. Constant globals are never instance globals: they are always scoped at the unit level. Their static type is automatically inferred from the type of the constant.
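The relationship between the two kinds of globals can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not LiSA's implementation: SketchGlobal and SketchConstantGlobal are hypothetical classes, types are represented as strings, and the "inferred" type is simply the Java class name of the constant value.

```java
// Hypothetical, simplified model of a global: NOT LiSA's Global class.
class SketchGlobal {
    final String name;
    final String staticType;
    final boolean instance; // true for per-object fields, false for unit-level globals

    SketchGlobal(String name, String staticType, boolean instance) {
        this.name = name;
        this.staticType = staticType;
        this.instance = instance;
    }
}

// A global bound to a fixed value known statically.
class SketchConstantGlobal extends SketchGlobal {
    final Object constant;

    SketchConstantGlobal(String name, Object constant) {
        // The type is derived from the constant, and the instance flag is always false.
        super(name, constant.getClass().getSimpleName(), false);
        this.constant = constant;
    }
}

class SketchGlobalDemo {
    public static void main(String[] args) {
        SketchGlobal field = new SketchGlobal("size", "int", true);
        SketchConstantGlobal max = new SketchConstantGlobal("MAX_SIZE", 1024);
        System.out.println(field.name + " instance=" + field.instance);
        System.out.println(max.name + " type=" + max.staticType + " instance=" + max.instance);
    }
}
```

The constructor of the constant variant takes no type and no instance flag, which mirrors the two constraints stated above: the type follows from the value, and a constant is always scoped to the unit.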

The Program unit

A Program is a Unit that collects all the units composing a single programming-language program. Its main purpose is to act as a registry of all Unit instances parsed from the program’s source, enriched with the type system, the language-specific algorithms (e.g., call resolution; see the Types and Language Features pages), and the entry points of the analysis. In a Program instance, globals and code members are typically used to provide always-available built-ins and constants (e.g., Python’s print function).

Figure: Program unit

Program provides the following:

 Note:
A Program cannot be added as a unit to another Program: calling addUnit with a Program instance raises an exception. Programs are meant to be composed at the Application level.
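The nesting restriction in the note can be pictured with the standalone sketch below. The classes are hypothetical stand-ins (SketchBaseUnit, SketchProgram), and the exception type is an assumption: the page only states that an exception is raised, not which one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base type for anything that has a name: NOT LiSA's Unit.
class SketchBaseUnit {
    final String name;

    SketchBaseUnit(String name) {
        this.name = name;
    }
}

// Hypothetical registry of units that refuses to contain another registry.
class SketchProgram extends SketchBaseUnit {
    private final List<SketchBaseUnit> units = new ArrayList<>();

    SketchProgram(String name) {
        super(name);
    }

    void addUnit(SketchBaseUnit unit) {
        // Assumed exception type; the source only says "raises an exception".
        if (unit instanceof SketchProgram)
            throw new IllegalArgumentException("a program cannot contain another program");
        units.add(unit);
    }

    int unitCount() {
        return units.size();
    }
}

class SketchProgramDemo {
    public static void main(String[] args) {
        SketchProgram program = new SketchProgram("main-program");
        program.addUnit(new SketchBaseUnit("utils"));
        try {
            program.addUnit(new SketchProgram("nested"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("units: " + program.unitCount());
    }
}
```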

The Application unit

An Application collects one or more Programs that must be analyzed together. It is the top-level entry point passed to LiSA’s analysis engine, and it supports multi-language analysis by allowing programs written in different languages to coexist.

Figure: Application

Application provides aggregated views over all its programs:

Results are computed lazily on first access and then cached, so repeated calls to these methods are cheap.
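The compute-once-then-cache behavior can be sketched as below. This is an illustration of the general pattern only: SketchApp and getAllUnits are hypothetical names, programs are reduced to lists of unit names, and the counter exists purely to make the caching observable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical aggregate over several programs: NOT LiSA's Application.
class SketchApp {
    private final List<List<String>> programs; // each program is a list of unit names
    private List<String> allUnits;              // cache, null until first access
    int computations = 0;                       // counts how often the view is built

    SketchApp(List<List<String>> programs) {
        this.programs = programs;
    }

    // Builds the merged view on the first call and reuses it afterwards.
    List<String> getAllUnits() {
        if (allUnits == null) {
            computations++;
            List<String> merged = new ArrayList<>();
            for (List<String> program : programs)
                merged.addAll(program);
            allUnits = Collections.unmodifiableList(merged);
        }
        return allUnits;
    }
}

class SketchAppDemo {
    public static void main(String[] args) {
        SketchApp app = new SketchApp(Arrays.asList(
                Arrays.asList("Main", "Utils"),
                Arrays.asList("script")));
        System.out.println(app.getAllUnits());
        System.out.println(app.getAllUnits());
        System.out.println("computations: " + app.computations);
    }
}
```

A cache of this kind is only safe if the set of programs does not change after the first access; whether LiSA invalidates its caches is not covered here.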

Units for grouping code

Not every unit in a program corresponds to an object-oriented type. Languages such as Python, JavaScript, or C use files and modules as the primary unit of organization, grouping functions and global variables without the notion of instantiable types. LiSA represents these with two classes that sit below Unit in the hierarchy but above the compilation-unit family.

Figure: ProgramUnit and CodeUnit

ProgramUnit is the abstract base for all units that can be part of a Program and have a source location. It extends Unit and also implements CodeElement, the minimal interface for program constructs with a location (see Minimal Program Components).

CodeUnit is a concrete ProgramUnit subclass that models a file or module: a flat container of globals and code members without any inheritance structure. The unit it represents cannot be instantiated, so its canBeInstantiated() returns false. Frontends for procedural or scripting languages typically create one CodeUnit per source file, populating it with the functions and top-level variables defined in it.
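As a rough picture of what "one flat unit per source file" means, the sketch below models a located unit and a file-level container. Both classes are hypothetical simplifications: locations and members are plain strings, and nothing here reflects LiSA's actual constructors or method signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base type standing in for a unit with a source location.
abstract class SketchLocatedUnit {
    final String name;
    final String location; // e.g. the path of the source file

    SketchLocatedUnit(String name, String location) {
        this.name = name;
        this.location = location;
    }

    abstract boolean canBeInstantiated();
}

// Flat container for one source file: no supertypes, no instances.
class SketchCodeUnit extends SketchLocatedUnit {
    final List<String> functions = new ArrayList<>();
    final List<String> topLevelVariables = new ArrayList<>();

    SketchCodeUnit(String name, String location) {
        super(name, location);
    }

    @Override
    boolean canBeInstantiated() {
        return false; // a file or module is never instantiated
    }
}

class SketchCodeUnitDemo {
    public static void main(String[] args) {
        SketchCodeUnit file = new SketchCodeUnit("helpers", "src/helpers.py");
        file.functions.add("parse(text)");
        file.topLevelVariables.add("DEFAULT_ENCODING");
        System.out.println(file.name + " @ " + file.location
                + " instantiable=" + file.canBeInstantiated());
    }
}
```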

Compilation Units

Compilation units model the object-oriented type constructs of a language (classes, abstract classes, and interfaces) that organize code members and globals into an inheritance hierarchy.

Figure: Compilation unit hierarchy

CompilationUnit is the abstract base for all units that participate in an inheritance hierarchy. It extends ProgramUnit and adds the following capabilities beyond those of Unit:

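Three concrete subclasses implement the different kinds of object-oriented types: ClassUnit for concrete classes, AbstractClassUnit for abstract classes, and InterfaceUnit for interfaces or traits.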

 Tip:
When implementing a frontend, choose the unit type that best matches the source language construct: CodeUnit for files and modules, ClassUnit for concrete classes, AbstractClassUnit for abstract classes, and InterfaceUnit for interfaces or traits. Use Program to gather all units of a single-language program. Application should never be used directly: when more than one program is passed to LiSA for analysis, an Application object is built automatically.
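The selection rule from the tip can be written down as a small lookup. The sketch below is only a mnemonic: SourceConstruct and UnitChoice are hypothetical helpers, and the method returns the name of the suggested unit class as a string rather than constructing any LiSA object.

```java
// Hypothetical classification of source-language constructs.
enum SourceConstruct {
    FILE, MODULE, CONCRETE_CLASS, ABSTRACT_CLASS, INTERFACE, TRAIT
}

class UnitChoice {
    // Returns the name of the unit class suggested for the given construct.
    static String unitTypeFor(SourceConstruct construct) {
        switch (construct) {
            case FILE:
            case MODULE:
                return "CodeUnit";
            case CONCRETE_CLASS:
                return "ClassUnit";
            case ABSTRACT_CLASS:
                return "AbstractClassUnit";
            case INTERFACE:
            case TRAIT:
                return "InterfaceUnit";
            default:
                throw new IllegalArgumentException("unknown construct: " + construct);
        }
    }

    public static void main(String[] args) {
        for (SourceConstruct construct : SourceConstruct.values())
            System.out.println(construct + " -> " + unitTypeFor(construct));
    }
}
```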