Skip to content

Example: Constant Propagation

The problem

A variable holds a constant at a program point if every execution path that reaches that point assigns the same constant value to it. Knowing this lets a compiler replace int y = x with int y = 4 (no runtime load needed), eliminate dead branches (if (0 == 0) is always true), and prove the absence of array-index-out-of-bounds errors.

The challenge is that constants must be tracked flow-sensitively: x = 1; x = 4; means x is 4 after the second assignment, not 1 or "either." This is where dataflow analysis earns its keep.

This page builds a constant propagation analysis from scratch. It is the forward counterpart to Live Variable Analysis — read that first if you haven't already.


Step 1 — The target program

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
public class Example {

    /**
     * A simple method used to illustrate constant propagation.
     *
     * <p>The variable x is redefined four times before y reads it.
     * A flow-sensitive analysis can determine that x holds the constant 4
     * at the point of the assignment to y, and therefore y = 4 exactly.
     */
    public static void assign() {
        int x = 1;   // OUT: {x=1}
        x = 2;       // OUT: {x=2}
        x = 3;       // OUT: {x=3}
        x = 4;       // OUT: {x=4}
        int y = x;   // OUT: {x=4, y=4}
    }
}

x is redefined four times. A flow-sensitive analysis correctly concludes x = 4 at the final assignment and therefore y = 4. A flow-insensitive approach would see four definitions and declare x unknown (NAC).


Step 2 — The five elements

Element This analysis
Direction Forward — constants accumulate as statements execute in order
Fact domain Map<Local, Value> — for each local, one of UNDEF, Constant(n), or NAC
Boundary condition Parameters → NAC; all other vars absent (= UNDEF)
Initial fact Empty map (every variable starts as UNDEF)
Meet Per-variable meetValue: NAC ⊓ x = NAC; UNDEF ⊓ x = x; c ⊓ c = c; c1 ⊓ c2 = NAC
Transfer function Evaluate the RHS; update the LHS in the OUT map

Why forward? Constants flow in the direction of execution: a definition at line 3 affects uses at line 5, not the other way around.

Why map parameters to NAC at the boundary? The caller decides parameter values; from the method's perspective they are unknown. Anything is possible — NAC is the correct conservative choice.

Why a three-valued lattice? Two is not enough. UNDEF ("no information yet") is different from NAC ("too many values"). Without UNDEF, initialising every variable to NAC before analysis would immediately kill any hope of precision at join points: NAC ⊓ c = NAC even if only one path actually reaches the join.


Step 3 — The lattice type

The lattice value for one variable. UNDEF is the bottom (least information); NAC is the top (most information — the analysis has given up on knowing the value); constants sit in between:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
public final class Value {

  private enum Kind {
    UNDEF,
    CONSTANT,
    NAC
  }

  private final Kind kind;
  private final int constant; // meaningful only when kind == CONSTANT

  private Value(Kind kind, int constant) {
    this.kind = kind;
    this.constant = constant;
  }

  public static Value getUndef() {
    return new Value(Kind.UNDEF, 0);
  }

  public static Value getNAC() {
    return new Value(Kind.NAC, 0);
  }

  public static Value makeConstant(int v) {
    return new Value(Kind.CONSTANT, v);
  }

  public boolean isUndef() {
    return kind == Kind.UNDEF;
  }

  public boolean isConstant() {
    return kind == Kind.CONSTANT;
  }

  public boolean isNAC() {
    return kind == Kind.NAC;
  }

  /** Precondition: {@link #isConstant()} must be true. */
  public int getConstant() {
    if (!isConstant()) throw new IllegalStateException("Not a constant: " + this);
    return constant;
  }

  /** Meet operator: returns the least upper bound of this and {@code other}. */
  public Value meet(Value other) {
    if (this.isNAC() || other.isNAC()) return getNAC();
    if (this.isUndef()) return other;
    if (other.isUndef()) return this;
    // Both are constants.
    return this.constant == other.constant ? this : getNAC();
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof Value)) return false;
    Value v = (Value) o;
    return kind == v.kind && constant == v.constant;
  }

  @Override
  public int hashCode() {
    return Objects.hash(kind, constant);
  }

  @Override
  public String toString() {
    return switch (kind) {
      case UNDEF -> "UNDEF";
      case NAC -> "NAC";
      case CONSTANT -> String.valueOf(constant);
    };
  }
}

Step 4 — Building the analysis

Fact type and boundary

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
  @Override
  public Map<Local, sootup.examples.constprop.Value> newBoundaryFact(Body body) {
    // Parameters are unknown at the call site → NAC.
    // All other locals are absent from the map, which means UNDEF.
    Map<Local, sootup.examples.constprop.Value> fact = new HashMap<>();
    for (Local param : body.getParameterLocals()) {
      fact.put(param, sootup.examples.constprop.Value.getNAC());
    }
    return fact;
  }

  @Override
  public Map<Local, sootup.examples.constprop.Value> newInitialFact() {
    return new HashMap<>(); // empty map = all variables UNDEF
  }

Meet operator

At a join point the facts from both paths are merged per variable. The meet is applied to each variable's value independently — if the two incoming paths agree on a constant, we keep it; if they disagree, the result is NAC:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
  @Override
  public void meetInto(
      Map<Local, sootup.examples.constprop.Value> fact,
      Map<Local, sootup.examples.constprop.Value> target) {
    // For every variable in fact, meet its value into target.
    for (Map.Entry<Local, sootup.examples.constprop.Value> entry : fact.entrySet()) {
      Local local = entry.getKey();
      sootup.examples.constprop.Value incoming = entry.getValue();
      sootup.examples.constprop.Value existing =
          target.getOrDefault(local, sootup.examples.constprop.Value.getUndef());
      target.put(local, existing.meet(incoming));
    }
  }

Transfer function

For assignment statements we evaluate the right-hand side expression and update the left-hand side. For all other statements the transfer is the identity (OUT = IN):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
  @Override
  public boolean transferNode(
      Stmt stmt,
      Map<Local, sootup.examples.constprop.Value> in,
      Map<Local, sootup.examples.constprop.Value> out) {

    // Copy IN into OUT as the default (identity for non-assignment statements).
    Map<Local, sootup.examples.constprop.Value> newOut = new HashMap<>(in);

    if (stmt instanceof JAssignStmt) {
      JAssignStmt assign = (JAssignStmt) stmt;
      Value lhs = assign.getLeftOp();
      Value rhs = assign.getRightOp();

      if (lhs instanceof Local) {
        Local lhsLocal = (Local) lhs;
        sootup.examples.constprop.Value result = evaluate(rhs, in);
        newOut.put(lhsLocal, result);
      }
    }
    // (Other statement types: identity — OUT = IN, already handled by the copy above.)

    if (newOut.equals(out)) {
      return false;
    }
    out.clear();
    out.putAll(newOut);
    return true;
  }

Step 5 — The worklist solver

The same DataflowSolver used for liveness drives this analysis too — only the direction changes. In the forward case it meets all predecessor OUT facts into IN, applies the transfer, and re-schedules successors when OUT changed:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    // Seed the worklist with all nodes (conservative: assume every fact may change).
    Deque<Stmt> worklist = new ArrayDeque<>(nodes);

    while (!worklist.isEmpty()) {
      Stmt node = worklist.poll();

      if (analysis.isForward()) {
        // Forward: meet all predecessor OUT facts into this node's IN fact.
        F in = inFacts.get(node);
        for (Stmt pred : cfg.predecessors(node)) {
          analysis.meetInto(outFacts.get(pred), in);
        }
        // Apply the transfer function; if OUT changed, schedule successors.
        if (analysis.transferNode(node, in, outFacts.get(node))) {
          worklist.addAll(cfg.successors(node));
        }
      } else {
        // Backward: meet all successor IN facts into this node's OUT fact.
        F out = outFacts.get(node);
        List<Stmt> succs = cfg.successors(node);
        succs.addAll(cfg.exceptionalSuccessors(node).values());
        for (Stmt succ : succs) {
          analysis.meetInto(inFacts.get(succ), out);
        }
        // Apply the transfer function; if IN changed, schedule predecessors.
        if (analysis.transferNode(node, inFacts.get(node), out)) {
          worklist.addAll(cfg.predecessors(node));
        }
      }
    }

Hand trace on Example.assign (no branches, so a single pass suffices):

Statement IN OUT
x = 1 {} {x=1}
x = 2 {x=1} {x=2}
x = 3 {x=2} {x=3}
x = 4 {x=3} {x=4}
y = x {x=4} {x=4, y=4}
return {x=4, y=4}

Step 6 — Setting up and running the analysis

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
    AnalysisInputLocation inputLocation =
        PathBasedAnalysisInputLocation.create(
            Paths.get("src/test/resources/ConstProp/binary"), SourceType.Application);
    View view = new JavaView(inputLocation);

    ClassType classType = view.getIdentifierFactory().getClassType("Example");
    SootClass sootClass = view.getClass(classType).get();

    MethodSignature methodSignature =
        view.getIdentifierFactory().parseMethodSignature("<Example: void assign()>");
    SootMethod method = sootClass.getMethod(methodSignature.getSubSignature()).get();
    Body body = method.getBody();

    // Run constant propagation and retrieve the solver for querying results.
    DataflowSolver<Map<Local, Value>> solver = ConstantPropagation.analyze(body);

Step 7 — Verifying the result

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
    // At the last assignment (y = x), OUT must contain {x=4, y=4}.
    // The exact last assignment stmt is the final "non-return" stmt, but we check
    // the return stmt's IN fact, which equals the OUT of the preceding assignment.
    Stmt lastAssign = stmts.get(stmts.size() - 2); // last stmt before return
    Map<Local, Value> out = solver.getOutFact(lastAssign);

    // Every local with a constant value must equal the expected constant.
    for (Map.Entry<Local, Value> entry : out.entrySet()) {
      Local local = entry.getKey();
      Value value = entry.getValue();
      assertTrue(value.isConstant(), local.getName() + " should be a constant but was " + value);
      if (local.getName().equals("x") || local.getName().equals("y")) {
        assertEquals(4, value.getConstant(), local.getName() + " should equal 4");
      }
    }

The test asserts that at the final assignment, every local with a known value holds the constant 4. CI runs this on every pull request; a breaking API change will fail this test before it can silently invalidate the docs.


What to try next