Static Analysis Glossary
Definitions for technical terms used throughout this documentation.
- Analysis fact
- A value (a set, a map, a number) that summarises what the analysis knows about
program state at a single point in the CFG. The type of
the fact is chosen by the analysis designer; for liveness it is
Set<Local>, for constant propagation it isMap<Local, Value>. - Backward analysis
- An analysis in which facts travel from successors to predecessors — that is, from the method exit toward the method entry. Backward analyses answer questions about the future of a value: "will this variable be read again?" See also Forward analysis.
- Boundary condition
- The analysis fact assigned to the boundary node before iteration begins — the method entry for forward analyses, the method exit for backward analyses. It represents what the analysis knows outside the method body. For liveness the boundary is the empty set (nothing is live after a return); for constant propagation it maps all parameters to NAC (their values are unknown at the call site).
- Control Flow Graph (CFG)
- A directed graph whose nodes are individual Jimple statements and
whose edges represent possible execution order: an edge A → B means "statement B
may execute immediately after statement A." An
ifcreates two outgoing edges; a loop back-edge returns to an earlier node. In SootUp,body.getControlFlowGraph()returns aControlFlowGraphwhosesuccessors(stmt)andpredecessors(stmt)give the adjacent nodes. - Dataflow analysis
- An analysis that computes an analysis fact at each node of a CFG by repeatedly applying transfer functions and meet operators until a fixed point is reached. The word "dataflow" refers to facts "flowing" along CFG edges during iteration.
- Fixed point
- The state in which one more iteration of the dataflow analysis would change no fact anywhere in the CFG. At this point the analysis has converged to its final answer. The worklist algorithm detects this implicitly: it terminates when the worklist is empty, which happens exactly when nothing changed.
- Forward analysis
- An analysis in which facts travel from predecessors to successors — that is, from the method entry toward the method exit. Forward analyses answer questions about the past of a value: "what constant was assigned to this variable?" See also Backward analysis.
- Gen/Kill
- Informal names for the two parts of a transfer function.
The gen set contains facts that the statement produces (e.g. a definition of a
variable is "generated"). The kill set contains facts that the statement
invalidates (e.g. a new definition of
xkills any previous definition ofx). Transfer for most analyses reduces to:OUT = gen(S) ∪ (IN − kill(S)). - Initial fact
- The analysis fact assigned to every non-boundary node at the start of iteration. Typically the "bottom" element of the lattice — the empty set for set-based analyses, the empty map for map-based ones. Facts can only grow (under union) or change (under more complex meet) during iteration; they never shrink back below the initial value.
- Intermediate Representation (IR)
- An internal form of a program that is easier to traverse and analyse than either source code or raw bytecode. SootUp's IR is Jimple: a flat, three-address, register-machine representation derived from JVM bytecode.
- Join point
- A CFG node with more than one incoming edge — for example,
the first statement after an
if/elseor the header of a loop. At a join point, facts from multiple paths must be combined using the meet operator. - Lattice
- A partially ordered set in which every two elements have a least upper bound (join, ⊔) and a greatest lower bound (meet, ⊓). The analysis fact of a dataflow analysis forms a lattice; the ordering represents "more information" vs. "less information." For a set-based analysis, the lattice is the power set of locals ordered by ⊆, with ∪ as join and ∩ as meet.
- May analysis
- An analysis that over-approximates: it reports everything that could be true on at least one execution path. Union at join points ("if it is true on either path, report it"). May analyses are sound but may report false positives. Liveness is a may analysis: a variable is reported live if it might be used on some path, even if not on every path.
- Meet operator (⊓)
- The function that combines analysis facts from multiple incoming
paths at a join point. For may analyses this is
union (∪); for must analyses this is intersection (∩). In the
SootUp educational framework,
meetInto(fact, target)applies the meet by mutatingtargetin place. - Must analysis
- An analysis that under-approximates: it reports only what is true on every execution path. Intersection at join points ("only report it if it is true on all paths"). Must analyses have no false positives but may have false negatives. See also May analysis.
- Transfer function
- A function that maps the analysis fact at one side of a statement
to the fact at the other side, based on the semantics of that statement.
For forward analyses:
OUT[S] = f(IN[S]). For backward analyses:IN[S] = f(OUT[S]). See Gen/Kill for the most common pattern. - Worklist
- The set of CFG nodes whose facts may still change and therefore need to be re-processed. The solver initialises the worklist with all nodes, then picks one at a time: if the node's fact changes after applying the transfer function and meet, the affected neighbours are added back to the worklist. The algorithm terminates (reaches a fixed point) when the worklist is empty.