Example: Constant Propagation
The problem
A variable holds a constant at a program point if every execution path that reaches
that point assigns the same constant value to it. Knowing this lets a compiler replace
int y = x with int y = 4 (no runtime load needed), eliminate dead branches
(if (0 == 0) is always true), and prove the absence of array-index-out-of-bounds errors.
The challenge is that constants must be tracked flow-sensitively: x = 1; x = 4;
means x is 4 after the second assignment, not 1 or "either." This is where dataflow
analysis earns its keep.
This page builds a constant propagation analysis from scratch. It is the forward
counterpart to Live Variable Analysis — read that first if you
haven't already.
Step 1 — The target program
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17 | public class Example {
/**
* A simple method used to illustrate constant propagation.
*
* <p>The variable x is redefined four times before y reads it.
* A flow-sensitive analysis can determine that x holds the constant 4
* at the point of the assignment to y, and therefore y = 4 exactly.
*/
public static void assign() {
int x = 1; // OUT: {x=1}
x = 2; // OUT: {x=2}
x = 3; // OUT: {x=3}
x = 4; // OUT: {x=4}
int y = x; // OUT: {x=4, y=4}
}
}
|
x is redefined four times. A flow-sensitive analysis correctly concludes x = 4
at the final assignment and therefore y = 4. A flow-insensitive approach would
see four definitions and declare x unknown (NAC).
Step 2 — The five elements
| Element |
This analysis |
| Direction |
Forward — constants accumulate as statements execute in order |
| Fact domain |
Map<Local, Value> — for each local, one of UNDEF, Constant(n), or NAC |
| Boundary condition |
Parameters → NAC; all other vars absent (= UNDEF) |
| Initial fact |
Empty map (every variable starts as UNDEF) |
| Meet |
Per-variable meetValue: NAC ⊓ x = NAC; UNDEF ⊓ x = x; c ⊓ c = c; c1 ⊓ c2 = NAC |
| Transfer function |
Evaluate the RHS; update the LHS in the OUT map |
Why forward? Constants flow in the direction of execution: a definition at line 3
affects uses at line 5, not the other way around.
Why map parameters to NAC at the boundary? The caller decides parameter values; from
the method's perspective they are unknown. Anything is possible — NAC is the correct
conservative choice.
Why a three-valued lattice? Two is not enough. UNDEF ("no information yet") is
different from NAC ("too many values"). Without UNDEF, initialising every variable
to NAC before analysis would immediately kill any hope of precision at join points:
NAC ⊓ c = NAC even if only one path actually reaches the join.
Step 3 — The lattice type
The lattice value for one variable. UNDEF is the bottom (least information); NAC
is the top (most information — the analysis has given up on knowing the value); constants
sit in between:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76 | public final class Value {
private enum Kind {
UNDEF,
CONSTANT,
NAC
}
private final Kind kind;
private final int constant; // meaningful only when kind == CONSTANT
private Value(Kind kind, int constant) {
this.kind = kind;
this.constant = constant;
}
public static Value getUndef() {
return new Value(Kind.UNDEF, 0);
}
public static Value getNAC() {
return new Value(Kind.NAC, 0);
}
public static Value makeConstant(int v) {
return new Value(Kind.CONSTANT, v);
}
public boolean isUndef() {
return kind == Kind.UNDEF;
}
public boolean isConstant() {
return kind == Kind.CONSTANT;
}
public boolean isNAC() {
return kind == Kind.NAC;
}
/** Precondition: {@link #isConstant()} must be true. */
public int getConstant() {
if (!isConstant()) throw new IllegalStateException("Not a constant: " + this);
return constant;
}
/** Meet operator: returns the least upper bound of this and {@code other}. */
public Value meet(Value other) {
if (this.isNAC() || other.isNAC()) return getNAC();
if (this.isUndef()) return other;
if (other.isUndef()) return this;
// Both are constants.
return this.constant == other.constant ? this : getNAC();
}
@Override
public boolean equals(Object o) {
if (!(o instanceof Value)) return false;
Value v = (Value) o;
return kind == v.kind && constant == v.constant;
}
@Override
public int hashCode() {
return Objects.hash(kind, constant);
}
@Override
public String toString() {
return switch (kind) {
case UNDEF -> "UNDEF";
case NAC -> "NAC";
case CONSTANT -> String.valueOf(constant);
};
}
}
|
Step 4 — Building the analysis
Fact type and boundary
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15 | @Override
public Map<Local, sootup.examples.constprop.Value> newBoundaryFact(Body body) {
// Parameters are unknown at the call site → NAC.
// All other locals are absent from the map, which means UNDEF.
Map<Local, sootup.examples.constprop.Value> fact = new HashMap<>();
for (Local param : body.getParameterLocals()) {
fact.put(param, sootup.examples.constprop.Value.getNAC());
}
return fact;
}
@Override
public Map<Local, sootup.examples.constprop.Value> newInitialFact() {
return new HashMap<>(); // empty map = all variables UNDEF
}
|
Meet operator
At a join point the facts from both paths are merged per
variable. The meet is applied to each variable's value
independently — if the two incoming paths agree on a constant, we keep it; if they
disagree, the result is NAC:
1
2
3
4
5
6
7
8
9
10
11
12
13 | @Override
public void meetInto(
Map<Local, sootup.examples.constprop.Value> fact,
Map<Local, sootup.examples.constprop.Value> target) {
// For every variable in fact, meet its value into target.
for (Map.Entry<Local, sootup.examples.constprop.Value> entry : fact.entrySet()) {
Local local = entry.getKey();
sootup.examples.constprop.Value incoming = entry.getValue();
sootup.examples.constprop.Value existing =
target.getOrDefault(local, sootup.examples.constprop.Value.getUndef());
target.put(local, existing.meet(incoming));
}
}
|
Transfer function
For assignment statements we evaluate the right-hand side expression and update the
left-hand side. For all other statements the transfer is the identity (OUT = IN):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29 | @Override
public boolean transferNode(
Stmt stmt,
Map<Local, sootup.examples.constprop.Value> in,
Map<Local, sootup.examples.constprop.Value> out) {
// Copy IN into OUT as the default (identity for non-assignment statements).
Map<Local, sootup.examples.constprop.Value> newOut = new HashMap<>(in);
if (stmt instanceof JAssignStmt) {
JAssignStmt assign = (JAssignStmt) stmt;
Value lhs = assign.getLeftOp();
Value rhs = assign.getRightOp();
if (lhs instanceof Local) {
Local lhsLocal = (Local) lhs;
sootup.examples.constprop.Value result = evaluate(rhs, in);
newOut.put(lhsLocal, result);
}
}
// (Other statement types: identity — OUT = IN, already handled by the copy above.)
if (newOut.equals(out)) {
return false;
}
out.clear();
out.putAll(newOut);
return true;
}
|
Step 5 — The worklist solver
The same DataflowSolver used for liveness drives this analysis too — only the direction
changes. In the forward case it meets all predecessor OUT facts into IN, applies the
transfer, and re-schedules successors when OUT changed:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30 | // Seed the worklist with all nodes (conservative: assume every fact may change).
Deque<Stmt> worklist = new ArrayDeque<>(nodes);
while (!worklist.isEmpty()) {
Stmt node = worklist.poll();
if (analysis.isForward()) {
// Forward: meet all predecessor OUT facts into this node's IN fact.
F in = inFacts.get(node);
for (Stmt pred : cfg.predecessors(node)) {
analysis.meetInto(outFacts.get(pred), in);
}
// Apply the transfer function; if OUT changed, schedule successors.
if (analysis.transferNode(node, in, outFacts.get(node))) {
worklist.addAll(cfg.successors(node));
}
} else {
// Backward: meet all successor IN facts into this node's OUT fact.
F out = outFacts.get(node);
List<Stmt> succs = cfg.successors(node);
succs.addAll(cfg.exceptionalSuccessors(node).values());
for (Stmt succ : succs) {
analysis.meetInto(inFacts.get(succ), out);
}
// Apply the transfer function; if IN changed, schedule predecessors.
if (analysis.transferNode(node, inFacts.get(node), out)) {
worklist.addAll(cfg.predecessors(node));
}
}
}
|
Hand trace on Example.assign (no branches, so a single pass suffices):
| Statement |
IN |
OUT |
x = 1 |
{} |
{x=1} |
x = 2 |
{x=1} |
{x=2} |
x = 3 |
{x=2} |
{x=3} |
x = 4 |
{x=3} |
{x=4} |
y = x |
{x=4} |
{x=4, y=4} |
return |
{x=4, y=4} |
— |
Step 6 — Setting up and running the analysis
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15 | AnalysisInputLocation inputLocation =
PathBasedAnalysisInputLocation.create(
Paths.get("src/test/resources/ConstProp/binary"), SourceType.Application);
View view = new JavaView(inputLocation);
ClassType classType = view.getIdentifierFactory().getClassType("Example");
SootClass sootClass = view.getClass(classType).get();
MethodSignature methodSignature =
view.getIdentifierFactory().parseMethodSignature("<Example: void assign()>");
SootMethod method = sootClass.getMethod(methodSignature.getSubSignature()).get();
Body body = method.getBody();
// Run constant propagation and retrieve the solver for querying results.
DataflowSolver<Map<Local, Value>> solver = ConstantPropagation.analyze(body);
|
Step 7 — Verifying the result
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15 | // At the last assignment (y = x), OUT must contain {x=4, y=4}.
// The exact last assignment stmt is the final "non-return" stmt, but we check
// the return stmt's IN fact, which equals the OUT of the preceding assignment.
Stmt lastAssign = stmts.get(stmts.size() - 2); // last stmt before return
Map<Local, Value> out = solver.getOutFact(lastAssign);
// Every local with a constant value must equal the expected constant.
for (Map.Entry<Local, Value> entry : out.entrySet()) {
Local local = entry.getKey();
Value value = entry.getValue();
assertTrue(value.isConstant(), local.getName() + " should be a constant but was " + value);
if (local.getName().equals("x") || local.getName().equals("y")) {
assertEquals(4, value.getConstant(), local.getName() + " should equal 4");
}
}
|
The test asserts that at the final assignment, every local with a known value holds the
constant 4. CI runs this on every pull request; a breaking API change will fail this
test before it can silently invalidate the docs.
What to try next