Writing Your Own Analysis
This page walks you through writing an analysis from scratch — first intra-procedural (within one method), then inter-procedural (across method boundaries).
If the terms CFG, dataflow, or fixed point are unfamiliar, read Core Concepts first.
Part 1 — Intra-procedural: a minimal null-tracking analysis
The goal: for each statement in a method, determine which local variables might
hold null at that point.
The approach — forward dataflow:
- Start at the first statement. The initial fact is: "no local is known to be null."
- At each
x = nullassignment, addxto the nullable set. - At each
x = someNonNullExpressionassignment, removex(it can no longer be null on this path). - At join points (where two CFG edges meet), union the sets — if
xis nullable on either incoming path, it is nullable here. - Repeat until the sets stop changing (fixed point).
Dependencies
1 2 3 4 5 | |
1 | |
Implementation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 | |
Using the analysis
1 2 3 4 5 6 7 8 9 10 | |
Note
This is a may analysis (union at merge points) — it reports locals that are null on at least one path. A must analysis would intersect instead: it reports locals that are null on every path. Which you need depends on your goal.
Part 2 — Inter-procedural: IFDS / IDE
Intra-procedural analysis stops at method boundaries. To track whether a value sourced
in readInput() can reach executeQuery() — potentially through several intermediate
methods — you need inter-procedural analysis.
The core insight
The IFDS framework (Interprocedural Finite Distributive Subset problems) reduces inter-procedural dataflow to a graph reachability problem. You describe:
- Which facts are generated at source statements (e.g. "this value came from user input")
- How facts propagate through normal statements (flow functions)
- How facts cross call edges (call-to-start, end-to-return flow functions)
IFDS then builds an exploded supergraph and answers "can fact D reach statement S?" by checking reachability in that graph. This gives provably sound results without manually managing call stacks.
The IDE framework extends IFDS to carry values along with facts (e.g. not just "x is tainted" but "x's value is 3 * input + 7"), enabling constant propagation and linear constant analysis.
Dependencies
1 2 3 4 5 | |
1 | |
Where to look
SootUp integrates with Heros for IFDS/IDE.
Working examples — taint analysis, linear constant propagation, type-state analysis —
are in the test suite of the sootup.analysis.interprocedural module.
They are the most concrete starting point for writing your own inter-procedural analysis.
For background on the algorithms themselves:
What to read next
- Built-in Analyses — liveness and dominance analyses that ship with SootUp
- Call Graphs — required for inter-procedural analysis
- Body Interceptors — pre-processing passes that simplify the IR before analysis