GCC Constant Propagation
Table of Contents
1. GCC Constant Propagation
http://www.cse.iitm.ac.in/~rupesh/teaching/pa/jan17/scribes/0-cp.pdf (local archive)
常量传播需要解决的问题是: 一个变量 x 在某个代码处 u 是否是一个常量.
解决方案是: 确认代码通过所有可能的 path 到达 u 时 x 都是同一个常量. 例如:
1 x = 1; 2 z = 3; 3 while (x > 0) { 4 if (x = 1) { 5 y = 7; 6 } else { 7 y = z + 4; 8 x = 3; 9 } 10 print y; 11 }
y 在 line 10 的位置可以确认是 7
1.1. kildall algorithm
kildall algorithm 是最简单的 cp 算法, 它把 condition stmt 看做是普通的 stmt, 不会通过 evaluate 获得更精确的 cp 结果, 所以 kildall 算法无法发现所有的常量.
1.1.1. lattice
维护一个 mapping M[n_stmt][n_var], 表示每个 var 在每个 stmt 处的状态, 这个状态称为 lattice, 有三种不同的值:
- \(\top\), 表示 `可能是常量`
- \(N\), 表示 `是常量 N`
- \(\bot\), bottom, 表示 `不是常量`
初始时 M 中都是 \(\top\)
1.1.2. flow function
每个 stmt 都有一个 flow function, 代表 stmt 会如何改变 M, flow function 的定义为:
flow_function(stmt): if stmt.type != assign: /* stmt 有多个 predecessor */ if stmt is a merge_point: stmt1, stmt2 = predecessors M[stmt] = meet(stmt1, stmt2); return /* stmt 是针对 v 的赋值 */ v = stmt.lhs if stmt.rhs.type == constant: M[stmt][v].latice=CONSTANT M[stmt][v].value=stmt.rhs.val else: /* rhs 是 x op y 的形式 */ x = expr.lhs y = expr.rhs if M[stmt][x].latice = CONSTANT && M[stmt][y].latice = CONSTANT: M[stmt][v].lattice = CONSTANT M[stmt][v].value = expr.op(x, y); return if M[stmt][x].latice == BOTTOM || M[stmt][y].latice == BOTTOM: M[stmt][v].lattice = BOTTOM return M[stmt][v].lattice = TOP meet(stmt1, stmt2); m1 = M[stmt1] m2 = M[stmt2] for v in V: if m1[v].lattice == CONSTANT && m2[v].lattice == CONSTANT: if m1[v].value == m2[v].value: ret[v].lattice = CONSTANT ret[v].value = m1[v].value else: ret[v].lattice = BOTTOM continue if m1[v].lattice == BOTTOM || m2[v].lattice == BOTTOM: ret[v].lattice = BOTTOM continue if m1[v].lattice == TOP: ret[v] = m2[v] else: ret[v] = m1[v]
1.1.3. algorithm
# M 初始化为 TOP init(M) do changed=false worklist=(entry) while worklist is not empty do stmt=worklist.pop() visited[stmt]=true flow_function(stmt) foreach successor s of stmt: if visited[s]: continue M[s]=meet(M[s], M[stmt]) if M[s] changes: changed=true worklist.add(s) end end while (changed)
一个简化的描述是:
- 针对每个 stmt 用 flow_function 更新 M
- 如果 stmt 是 merge, 需要通过 meet 来更新 M
- 把 M propagate 给 stmt 的所有 successor
1.1.4. example
1 x = 10;
2 y = 1;
3 z = 1;
4 while (x > 1):
5 y = x∗y;
6 x = x−1;
7 z = z ∗ z;
8 p = x + y + z;
iter_1 | iter_2 | iter_3 | |
---|---|---|---|
0 | T,T,T,T | T,T,T,T | T,T,T,T |
1 | 10,T,T,T | 10,T,T,T | 10,T,T,T |
2 | 10,1,T,T | 10,1,T,T | 10,1,T,T |
3 | 10,1,1,T | 10,1,1,T | 10,1,1,T |
4 | 10,1,1,T | B,B,1,T | B,B,1,T |
5 | 10,10,1,T | B,B,1,T | B,B,1,T |
6 | 9,10,1,T | B,B,1,T | B,B,1,T |
7 | 9,10,1,T | B,B,1,T | B,B,1,T |
8 | 10,1,1,12 | B,B,1,B | B,B,1,B |
1.1.4.1. iter_1
- x=10, M[1]={10,T,T,T}, meet: M[2]={10,T,T,T}
- y=1, M[2]={10,1,T,T}, meet: M[3]={10,1,T,T}
- z=1, M[3]={10,1,1,T}, meet: M[4]={10,1,1,T}
- while, merge (3,7): M[4]=meet(3,7)={10,1,1,T}, meet: M[5]={10,1,1,T}, M[8]={10,1,1,T}
- y=x*y, M[5]={10,10,1,T}, meet: M[6]={10,10,1,T}
- x=x-1, M[6]={9,10,1,T}, meet: M[7]={9,10,1,T}
- z=z*z, M[7]={9,10,1,T}
- p=x+y+z, M[8]={10,1,1,12}
1.1.4.2. iter_2
- line 4, merge(3,7): M[4]= {B,B,1,T}, meet: M[8]={B,B,1,12}
- line 8, p=x+y+z, 因为 x,y 均为 B, 导致 p 也为 B, 最终 M[8]={B,B,1,B}
1.2. wegbreit algorithm
wegbreit algorithm 是 conditional constant propagation (ccp) 算法: 它会通过对 condition 进行 evaluate 获得更准确的常量信息.
wegbreit 与 kildal 类似, 但有个不同的地方:
conditional stmt 会被 eval, 根据 eval 的结果决定哪些 stmt 会被加入 worklist, 如果 eval 结果为 BOTTOM, 则与 kildal 相同, 如果 eval 结果为常量, 则根据常量的值决定哪个分支加入 worklist
1.2.1. example
1 i = 1; 2 if (i == 1) { 3 j = 2; 4 } else { 5 j = 3; 6 } 7 if (j != 2) { 8 k = 3; 9 } else { 10 k = 4; 11 } 12 print (i,j, k);
使用 kildal 的结果:
iter_1 | iter_2 | |
---|---|---|
0 | T,T,T | … |
1 | 1,T,T | |
2 | 1,T,T | |
3 | 1,2,T | |
5 | 1,3,T | |
7 | 1,B,T | |
8 | 1,B,3 | |
10 | 1,B,4 | |
12 | 1,B,B |
使用 wegbreit 的结果:
iter_1 | iter_2 | |
---|---|---|
0 | T,T,T | … |
1 | 1,T,T | |
2 | 1,T,T | |
3 | 1,2,T | |
5 | ||
7 | 1,2,T | |
8 | ||
10 | 1,2,4 | |
12 | 1,2,4 |
1.3. bitwise constant propagation
考虑下面的代码:
if (x > 0) { x |= 0x11; } else { x |= 0x111; } uint32_t y = x & 0x11;
普通的 cp 算法无法发现 y 是常量 0x11
gcc 的 ccp 算法支持这种 bitwise cp (-ftree-bit-ccp): 除了维护 lattice 和 value, 它还有一个 mask, 表示 `value&(~mask)` 是 constant.
1.3.1. example
以 strlen 为例,
check_null 使用 int 时:
$> /opt/riscv/bin/riscv64-unknown-linux-gnu-gcc test.c -O2 -fdump-tree-ccp1-details $> cat ./a-test.c.032t.ccp1 Visiting statement: _7 = MEM[(int *)data_27]; which is likely CONSTANT Lattice value changed to VARYING. Adding SSA edges to worklist. Visiting statement: _34 = _7 & 2139062143; which is likely CONSTANT Lattice value changed to CONSTANT 0x0 (0x7f7f7f7f). Adding SSA edges to worklist. marking stmt to be not simulated again Visiting statement: _42 = _34 + 2139062143; which is likely CONSTANT Lattice value changed to CONSTANT 0x0 (0x7fffffff). Adding SSA edges to worklist. marking stmt to be not simulated again Visiting statement: _43 = _42 | 2139062143; which is likely CONSTANT Lattice value changed to CONSTANT 0x7f7f7f7f (0x808080). Adding SSA edges to worklist. marking stmt to be not simulated again Visiting statement: _44 = ~_43; which is likely CONSTANT Lattice value changed to CONSTANT 0xffffffffffffffffffffffffffffffffffffffff80000000 (0x808080). Adding SSA edges to worklist. marking stmt to be not simulated again
以下面的输出为例:
Visiting statement: _34 = _7 & 2139062143; which is likely CONSTANT Lattice value changed to CONSTANT 0x0 (0x7f7f7f7f). Adding SSA edges to worklist. marking stmt to be not simulated again
`CONSTANT 0x0 (0x7f7f7f7f)` 表示 gcc 认为 `_34 & ~(0x7f7f7f7f) == 0x0`, 即 _34 是 `0b0xxxxxxx0xxxxxxx0xxxxxxx0xxxxxxx` 的形式, 这个是正确的结论, 因为 `_34 = _7 & 0x7f7f7f7f`
出现问题的是下面一句:
Visiting statement: _42 = _34 + 2139062143; which is likely CONSTANT Lattice value changed to CONSTANT 0x0 (0x7fffffff). Adding SSA edges to worklist. marking stmt to be not simulated again
gcc 认为 _34+0x7f7f7f7f 的最高位一定是 0 (_42 & ~0x7fffffff == 0x0), 即 0b0xxxxxxx….+0x7f7f7f7f 不会是负数
比较使用 uint 时 gcc 的输出:
Visiting statement: _35 = _8 & 2139062143; which is likely CONSTANT Lattice value changed to CONSTANT 0x0 (0x7f7f7f7f). Adding SSA edges to worklist. marking stmt to be not simulated again Visiting statement: _43 = _35 + 2139062143; which is likely CONSTANT Lattice value changed to VARYING. Adding SSA edges to worklist.
上面关于 `_42 = _34 + 2139062143` 的问题最终导致使用 int 时 gcc 认为 check_null 是常数
1.3.2. evrp pass
对于 `_42 = _34 + 2139062143`, 为什么它的 lattice value 是 `CONSTANT 0x0 (0x7fffffff)` ?
通过跟踪 tree-ssa-ccp.c::evaluate_stmt, 能够确认 0x7fffffff 来自 _42 的 nonzero_bits, 而这个值是 evrp pass (early value range propagation) 通过分析每个 ssa 的 value range 决定的 (set_range_info), 例如, 假设 evrp 分析后认为 x 的 value range 是 [0, 255], 则 x 的高 24 位可以确认全都是 0.
看起来 evrp 的加法是 saturating (饱和) 的.
所以这个 check_null 的 bug 是应用代码的 bug, 根本原因是 int 加法 overflow
是
undefined behavior:
C99 6.5/5:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
ps. uint 加法不会 overflow:
C99 6.5/9:
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
1.3.3. pre pass
int x = 0; int y = 0; if (argc > 0) { x = 1; y = 2; } else { x = 2; y = 1; } int z = x + y;
ccp 无法发现 z 是常量 3.
但 gcc 的 pre pass (partial redundancy elimination) 可以发现 `z=3`
Backlinks
strlen (strlen > GCC O2 的一个 `bug`?): 这个问题的 root cause 参考 bitwise constant propagation
Backlinks
GCC Forward Propagation (GCC Forward Propagation): forward propagation (pass_forwprop) GCC Constant Propagation 类似, 但它的功能不 是替换表达式为常量, 而是替换多个表达式为一个表达式, 例如: