GCC Pass

Table of Contents

1. GCC Pass

1.1. Overview

All passes and the order in which they run are specified in passes.def.

1.2. debug options

The command-line options for debugging passes are:

  1. `-fdump-{kind}-all`
  2. `-fdump-{kind}-{pass}`
  3. `-fdump-{kind}-{pass}-{options}`
  4. `-fdisable-{kind}-{pass}`
  5. `-fenable-{kind}-{pass}`
  6. `-fopt-info`
  7. `-fdump-passes`

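where:

  1. kind can be `tree` or `rtl`, for a gimple pass or an rtl pass respectively (`ipa` is also accepted, for IPA passes)
  2. pass is the pass name, e.g. ccp or evrp. The exact name can be found in the pass_data_xxx definition in the source, e.g.: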

    const pass_data pass_data_ccp = {
        GIMPLE_PASS,               /* type */
        "ccp",                     /* name */
        OPTGROUP_NONE,             /* optinfo_flags */
        TV_TREE_CCP,               /* tv_id */
        (PROP_cfg | PROP_ssa),     /* properties_required */
        0,                         /* properties_provided */
        0,                         /* properties_destroyed */
        0,                         /* todo_flags_start */
        TODO_update_address_taken, /* todo_flags_finish */
    };
    
  3. options can be `details`, `graph`, `all`, etc., e.g. `-fdump-tree-ccp1-details`

Pass names, whether each pass is enabled or disabled, and how passes are nested can be inspected with `-fdump-passes`. For example:

`gcc test.c -O0 -fdump-passes` shows that under `-O0` both `tree-ccp1` and its parent `tree-early_optimizations` are OFF, so pass_ccp1 can be exercised on its own with `gcc test.c -O0 -fenable-tree-ccp1 -fenable-tree-early_optimizations`.
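
To try these options, a small input file helps. The sketch below is a hypothetical `test.c` (the function `foo` and its constants are made up for illustration). The command lines in the comment use only the options described above, plus `-c` to stop after compilation. What a given gcc version actually prints in the dump may differ.

```c
/* test.c: a hypothetical input for experimenting with the dump options.
 *
 * List every pass and whether it is enabled at -O0:
 *   gcc test.c -c -O0 -fdump-passes
 *
 * Force ccp1 on at -O0 and dump what it did:
 *   gcc test.c -c -O0 -fenable-tree-ccp1 -fenable-tree-early_optimizations
 *       -fdump-tree-ccp1-details
 */
int foo(void) {
    int a = 2;
    int b = a * 3; /* a constant, 6 */
    if (b > 5)     /* always true, so constant propagation can fold it */
        return b + 1;
    return 0;
}
```

With ccp enabled, one would expect the dump to show the condition folded away, since every value in `foo` is a compile-time constant.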

1.3. tree pass

1.3.1. pass_build_cfg

$> cat test.c.011t.cfg

;; Function foo (foo, funcdef_no=0, decl_uid=1363, symbol_order=0)

;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2 3 4 5
;; 2 succs { 3 4 }
;; 3 succs { 5 }
;; 4 succs { 5 }
;; 5 succs { 1 }
foo ()
{
  float y;
  float x;
  double D.1373;
  double D.1372;
  double D.1371;
  double D.1370;

  <bb 2>:
  x = 0.0;
  y = x * 2.0e+0;
  if (y > 0.0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  D.1370 = (double) y;
  D.1371 = D.1370 + 1.0e+0;
  D.1372 = (double) y;
  D.1373 = D.1371 + D.1372;
  y = (float) D.1373;
  goto <bb 5>;

  <bb 4>:
  y = 1.0e+0;

  <bb 5>:
  return;

}


`<bb x>` denotes a basic block. `-fdump-tree-cfg-graph` dumps the cfg as a dot file, which looks like this after conversion to png:

gimple_cfg.png

The benefit of converting to a cfg (that is, to basic blocks) is that the instructions inside one basic block are straight-line code, which makes local optimization easy.

1.3.2. pass_ccp

1.3.3. pass_forwprop

1.3.4. pass_pre

1.3.5. pass_early_vrp

1.3.6. pass_profile

1.3.7. pass_dce

1.3.8. pass_dse

1.3.9. pass_data_tree_ifcombine

bool f1(bool a, bool b) {
    if (a) {
        if (b)
            return 1;
        else
            return 0;
    }
    return 0;
}

ifcombine can turn it into:

bool f1(bool a, bool b) {
    return a & b;
}

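`if (a) {if (b) …}` is effectively `if (a&&b)`, since both short-circuit. Rewriting `a&&b` as `a&b` saves the branch that short-circuiting requires, but `a&b` always evaluates both operands because it does not short-circuit. ifcombine therefore has to take branch_cost into account, and applies this optimization only when branch_cost is high.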
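
The equivalence that ifcombine relies on can be checked directly. The sketch below puts the two forms side by side. The names `f1_nested` and `f1_combined` are made up for illustration, and the rewrite is only valid because evaluating `b` has no side effects.

```c
#include <stdbool.h>

/* Nested form: up to two conditional branches. */
bool f1_nested(bool a, bool b) {
    if (a) {
        if (b)
            return true;
        else
            return false;
    }
    return false;
}

/* Combined form: one bitwise AND and no branch. */
bool f1_combined(bool a, bool b) {
    return a & b;
}
```

Both functions return the same value for all four input combinations, so the choice between them is purely a cost decision.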

Backlinks

branch_cost (mtune > riscv_tune_param > branch_cost): In addition, pass_data_tree_ifcombine also takes branch_cost into account: when branch_cost is high, `if (a) {if (b) {}}` is optimized into `if (a&b)`

1.3.10. pass_copy_prop

1.3.11. pass_sink_code

1.3.12. pass_nrv

1.3.14. loop

1.3.14.1. pass_loop

1.3.14.2. pass_empty_loop

1.3.14.3. pass_complete_unroll

1.3.14.4. pass_loop_prefetch

Backlinks

GCC Prefetch (GCC Prefetch): 1) use pass_loop_prefetch for auto prefetch

1.3.14.5. pass_ch

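pass_ch is the copy header optimization for loops. It turns

while (cond) {
    /* ... */
}

into

do {
    /* ... */
} while (cond);

If cond is not guaranteed to hold on the first iteration, cond has to be copied outside the loop, so that an extra check keeps the behavior identical before and after the optimization.

For example: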

// 2023-03-30 10:22
#include <stdio.h>
#include <string.h>
volatile int data[3] = {1, 2, 3};
int main(int argc, char *argv[]) {
    volatile int *x = data;
    int a = 0x1;
    int b = 0x2;
    int c = 0x3;
    for (int i = 0; i < 3; i++) {
        *x = a;
        if (*x != data[0]) {
            return 1;
        }
        *x = b;
        if (*x != data[1]) {
            return 1;
        }
        *x = c;
        if (*x != data[2]) {
            return 1;
        }
    }
    return 0;
}

The gimple before ch is:

__attribute__((access ("^1[ ]", )))
int main (int argc, char * * argv)
{
  ...
  <bb 2> [local count: 333062948]:
  goto <bb 7>; [100.00%]
  ~~~~~~~~~~~~~~~~~~~~~~
    jump to bb 7 first to test whether the loop condition holds

  <bb 3> [local count: 805306369]:
  MEM[(volatile int *)&data] ={v} 1;
  _1 ={v} MEM[(volatile int *)&data];
  _2 ={v} data[0];
  if (_1 != _2)
    goto <bb 8>; [2.75%]
  else
    goto <bb 4>; [97.25%]

  <bb 4> [local count: 783160441]:
  MEM[(volatile int *)&data] ={v} 2;
  _3 ={v} MEM[(volatile int *)&data];
  _4 ={v} data[1];
  if (_3 != _4)
    goto <bb 8>; [2.75%]
  else
    goto <bb 5>; [97.25%]

  <bb 5> [local count: 761623526]:
  MEM[(volatile int *)&data] ={v} 3;
  _5 ={v} MEM[(volatile int *)&data];
  _6 ={v} data[2];
  if (_5 != _6)
    goto <bb 8>; [2.75%]
  else
    goto <bb 6>; [97.25%]

  <bb 6> [local count: 740678875]:
  i_16 = i_7 + 1;

  <bb 7> [local count: 1073741824]:
  if (i_7 != 3)
    goto <bb 3>; [75.00%]
    ~~~~~~~~~~~~
      bb 3 is the loop body

  else
    goto <bb 8>; [25.00%]

  <bb 8> [local count: 333062949]:
  return _8;

}

After ch is applied:

__attribute__((access ("^1[ ]", )))
int main (int argc, char * * argv)
{
  ...
  <bb 2> [local count: 333062948]:

  <bb 3> [local count: 805306369]:
  MEM[(volatile int *)&data] ={v} 1;
  _1 ={v} MEM[(volatile int *)&data];
  _2 ={v} data[0];

  if (_1 != _2)
    goto <bb 7>; [2.75%]
  else
    goto <bb 4>; [97.25%]
  ~~~~~~~~~~~~~~~~~~~~~~~~
    the loop body is entered directly because the loop condition is known to hold initially, which saves one test of the loop condition

  <bb 4> [local count: 783160441]:
  MEM[(volatile int *)&data] ={v} 2;
  _3 ={v} MEM[(volatile int *)&data];
  _4 ={v} data[1];
  if (_3 != _4)
    goto <bb 7>; [2.75%]
  else
    goto <bb 5>; [97.25%]

  <bb 5> [local count: 761623526]:
  MEM[(volatile int *)&data] ={v} 3;
  _5 ={v} MEM[(volatile int *)&data];
  _6 ={v} data[2];
  if (_5 != _6)
    goto <bb 7>; [2.75%]
  else
    goto <bb 6>; [97.25%]

  <bb 6> [local count: 740678876]:
  i_16 = i_18 + 1;
  if (i_16 != 3)
    goto <bb 3>; [75.00%]
  else
    goto <bb 7>; [25.00%]

  <bb 7> [local count: 333062949]:
  return _8;
}

If the loop condition in the code is changed to:

for (int i = 1; i < argc; i++) {
}

then pass_ch has to generate a copied header to handle the case where the condition does not hold on the first iteration:

main (int argc, char * * argv)
{
  ...
  <bb 2> [local count: 113328204]:
  if (argc_12(D) > 1)
    goto <bb 3>; [97.25%]
    ~~~~~~~~~~~~~~~~~~~~
      copy header adds an extra test here, which guarantees that the following while can be turned into a do-while
  else
    goto <bb 7>; [2.75%]

  <bb 3> [local count: 1044213925]:
  MEM[(volatile int *)&data] ={v} 1;
  _1 ={v} MEM[(volatile int *)&data];
  _2 ={v} data[0];
  if (_1 != _2)
    goto <bb 7>; [2.75%]
  else
    goto <bb 4>; [97.25%]

  <bb 4> [local count: 1015498041]:
  MEM[(volatile int *)&data] ={v} 2;
  _3 ={v} MEM[(volatile int *)&data];
  _4 ={v} data[1];
  if (_3 != _4)
    goto <bb 7>; [2.75%]
  else
    goto <bb 5>; [97.25%]

  <bb 5> [local count: 987571844]:
  MEM[(volatile int *)&data] ={v} 3;
  _5 ={v} MEM[(volatile int *)&data];
  _6 ={v} data[2];
  if (_5 != _6)
    goto <bb 7>; [2.75%]
  else
    goto <bb 6>; [97.25%]

  <bb 6> [local count: 960413620]:
  i_16 = i_19 + 1;
  if (argc_12(D) > i_16)
    goto <bb 3>; [97.25%]
  else
    goto <bb 7>; [2.75%]

  <bb 7> [local count: 113328205]:
  return _8;

}
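
The same transformation can be written by hand in C. The sketch below is not what pass_ch emits, only a source-level picture of it. The function names and the summing loop body are made up for illustration.

```c
/* Original shape: the condition is tested before every iteration. */
int sum_below_while(int n) {
    int sum = 0;
    int i = 1;
    while (i < n) {
        sum += i;
        i++;
    }
    return sum;
}

/* Shape after header copying: a copy of the test guards the entry,
 * and the loop itself becomes a do-while. */
int sum_below_rotated(int n) {
    int sum = 0;
    int i = 1;
    if (i < n) { /* copied header */
        do {
            sum += i;
            i++;
        } while (i < n);
    }
    return sum;
}
```

Without the guard, the do-while would run its body once even when `n <= 1`, which is exactly the mismatch the copied header prevents.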

In addition, even without pass_ch, a `while` loop is laid out as:

goto x
r:
  // loop body
x:
  if cond_meet() {goto r} else {goto out}

out:

rather than

x:
  if cond_meet() {goto r} else {goto out}
r:
  // loop body
  goto x

out:

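The former performs better than the latter: the latter needs a jump followed by a test after each run of the loop body, whereas the former needs only the test.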
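
To make the difference concrete, the sketch below runs both layouts with `goto` and counts every executed control transfer, meaning each conditional test and each unconditional jump. The struct and function names are made up for illustration, and the count is a simple model, not a measurement of real branch cost.

```c
/* Counters for one run of a loop layout. */
struct loop_stats {
    int iterations; /* times the loop body ran */
    int branches;   /* conditional tests plus unconditional jumps executed */
};

/* Layout 1: test at the bottom, entered through one initial jump. */
struct loop_stats run_bottom_test(int n) {
    struct loop_stats s = {0, 0};
    int i = 0;
    s.branches++; /* the single "goto x" on entry */
    goto x;
r:
    s.iterations++; /* loop body */
    i++;
x:
    s.branches++; /* one conditional branch per test */
    if (i < n) goto r;
    return s;
}

/* Layout 2: test at the top, unconditional jump back after the body. */
struct loop_stats run_top_test(int n) {
    struct loop_stats s = {0, 0};
    int i = 0;
x:
    s.branches++; /* conditional branch */
    if (!(i < n)) goto out;
    s.iterations++; /* loop body */
    i++;
    s.branches++; /* unconditional "goto x" */
    goto x;
out:
    return s;
}
```

For `n` iterations, the first layout executes `n + 2` control transfers and the second `2n + 1`, so the bottom-test layout is cheaper for any loop that runs more than once.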

1.3.14.6. pass_data_loop_distribution

It is the inverse of loop fusion. It turns

DO I = 1, N
  A(I) = B(I) + C
  D(I) = E(I) * F
ENDDO

into

DO I = 1, N
   A(I) = B(I) + C
ENDDO
DO I = 1, N
   D(I) = E(I) * F
ENDDO

This transformation can improve cache behavior. In addition, together with tree_loop_distribution_patterns, the resulting simple loops can be further turned into calls such as memset, e.g.:

#include <stdint.h>
void matrix_mul_vect(int32_t N, int32_t *C, int32_t *A, int32_t *B) {
    int32_t i, j;
    for (i = 0; i < N; i++) {
        C[i] = 0;
    }
}

The code generated after loop_distribution_patterns is:

matrix_mul_vect:
.LFB0:
    mv      a2,a0
    mv      a0,a1
    ble     a2,zero,.L1
    slli    a5,a2,32
    srli    a2,a5,30
    li      a1,0
    tail    memset
.L1:
    ret
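
In C terms, the two steps above look roughly as follows. This is a hand-written sketch: the function names are made up for illustration, and the distributed form is only equivalent under the assumption that the arrays do not overlap.

```c
#include <stdint.h>
#include <string.h>

/* One fused loop: both statements share each iteration. */
void fused(int32_t n, int32_t *a, const int32_t *b, int32_t c,
           int32_t *d, const int32_t *e, int32_t f) {
    for (int32_t i = 0; i < n; i++) {
        a[i] = b[i] + c;
        d[i] = e[i] * f;
    }
}

/* Distributed: each statement gets its own loop. This is valid here
 * because neither statement depends on the other. */
void distributed(int32_t n, int32_t *a, const int32_t *b, int32_t c,
                 int32_t *d, const int32_t *e, int32_t f) {
    for (int32_t i = 0; i < n; i++)
        a[i] = b[i] + c;
    for (int32_t i = 0; i < n; i++)
        d[i] = e[i] * f;
}

/* A loop that only stores zeros ... */
void zero_loop(int32_t n, int32_t *c) {
    for (int32_t i = 0; i < n; i++)
        c[i] = 0;
}

/* ... and the memset call it can be replaced with. The n > 0 guard
 * mirrors the "ble a2,zero,.L1" in the assembly above. */
void zero_memset(int32_t n, int32_t *c) {
    if (n > 0)
        memset(c, 0, (size_t)n * sizeof *c);
}
```

The byte count passed to memset is `n * 4`, which is what the `slli`/`srli` pair in the assembly computes from the zero-extended `n`.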

1.3.15. vector

1.3.15.1. pass_lower_vector

If the backend does not support a vector operation, it has to be converted into scalar operations during gimple optimization.
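
As an illustration, GNU C's generic vector types let a vector operation be written without any target intrinsics. The sketch below shows one vector addition next to a hand-written scalar form of the same thing. The scalar version is an assumption about what lowering amounts to, not actual pass_lower_vector output, and the names `v4si`, `add_vector` and `add_lowered` are chosen for this example.

```c
/* GNU C generic vector type: four 32-bit ints (a compiler extension). */
typedef int v4si __attribute__((vector_size(16)));

/* A single vector addition, as the programmer writes it. */
v4si add_vector(v4si a, v4si b) {
    return a + b;
}

/* Roughly what the operation amounts to once split into scalar adds. */
v4si add_lowered(v4si a, v4si b) {
    v4si r;
    r[0] = a[0] + b[0];
    r[1] = a[1] + b[1];
    r[2] = a[2] + b[2];
    r[3] = a[3] + b[3];
    return r;
}
```

Both functions produce the same four lanes. On a target with suitable vector instructions the first can stay a single instruction, while on one without them the compiler has to fall back to something like the second.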

1.3.16. pass_lim

1.3.17. pass_linear_transform

1.4. rtl pass

1.4.1. pass_expand

Converts gimple to rtl.

1.4.2. pass_combine

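Suppose the backend supports an average instruction. pass_combine can combine the two rtl insns that compute (a+b)/2 into a single rtl insn (the pattern has to be defined in the md file first). This is similar to LLVM's DAGCombiner.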

Backlinks

GCC Backend (GCC Backend > rtl optimization): - pass_combine

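sext (SEXT and ZEXT in RISCV-V > sext > sext in gcc > explicit sext): In the example above, sext.w is required, but the compiler can use value ranges to determine that some explicit sext instructions are redundant. For example, gcc's combine can remove an unneeded sext.w based on the value range: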

1.4.3. pass_rtl_dse1

1.4.4. pass_rtl_ifcvt

1.4.5. pass_thread_prologue_and_epilogue

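Inserts the prologue and epilogue into a function. The prologue is the set of instructions inserted at the start of a function, e.g. to allocate the stack frame and save registers. The epilogue is the set of instructions inserted before the function returns, e.g. to release the stack frame and restore registers.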

1.4.5.1. prologue

The rtl for the prologue is determined by the prologue pattern defined with define_expand in the md file.

rtx
gen_prologue (void)
{
  rtx_insn *_val = 0;
  start_sequence ();
  {
#define FAIL return (end_sequence (), _val)
#define DONE return (_val = get_insns (), end_sequence (), _val)
#line 2202 "../.././riscv-gcc/gcc/config/riscv/riscv.md"
{
  riscv_expand_prologue ();
  DONE;
}
#undef DONE
#undef FAIL
  }
  emit_insn (const1_rtx);
  _val = get_insns ();
  end_sequence ();
  return _val;
}

riscv_expand_prologue generates the rtx for the following instructions at the start of a function:

addi sp, sp, xxx
sw ra, 4(sp)
sw s0, 8(sp)
## ...

/* NOTE: this code generates, at the start of a function:
 * addi sp, sp, xxx
 * sw ra, 4(sp)
 * sw s0, 8(sp)
 * ...
 * */
void riscv_expand_prologue(void) {
    struct riscv_frame_info *frame = &cfun->machine->frame;
    /* NOTE: frame->total_size is determined during pass_ira */
    HOST_WIDE_INT size = frame->total_size;
    unsigned mask = frame->mask;
    rtx insn;

    if (flag_stack_usage_info) current_function_static_stack_size = size;

    /* Save the registers.  */
    /* NOTE: frame->mask says which registers have to be saved; it is also
     * determined during pass_ira */
    if ((frame->mask | frame->fmask) != 0) {
        /* NOTE: frame->total_size is not used directly here, because a very
         * large total_size cannot be handled by a single addi. So a smaller
         * `step`, just big enough to hold the saved regs, is allocated first */
        HOST_WIDE_INT step1 = MIN(size, riscv_first_stack_step(frame));

        /* NOTE: addi sp, sp, -step1 */
        insn = gen_add3_insn(
            stack_pointer_rtx, stack_pointer_rtx, GEN_INT(-step1));
        size -= step1;
        riscv_for_each_saved_reg(size, riscv_save_reg, false, false);
    }

    frame->mask = mask; /* Undo the above fib.  */

    /* NOTE: the frame pointer has to be set up, e.g. when alloca is used */
    if (frame_pointer_needed) {
        insn = gen_add3_insn(
            hard_frame_pointer_rtx, stack_pointer_rtx,
            GEN_INT(frame->hard_frame_pointer_offset - size));
        RTX_FRAME_RELATED_P(emit_insn(insn)) = 1;

        riscv_emit_stack_tie();
    }

    /* NOTE: the remaining part of frame->total_size */
    if (size > 0) {
        /* NOTE: SMALL_OPERAND means an immediate that fits in 12 bits */
        if (SMALL_OPERAND(-size)) {
            insn = gen_add3_insn(
                stack_pointer_rtx, stack_pointer_rtx, GEN_INT(-size));
            RTX_FRAME_RELATED_P(emit_insn(insn)) = 1;
        } else {
            riscv_emit_move(RISCV_PROLOGUE_TEMP(Pmode), GEN_INT(-size));
            emit_insn(gen_add3_insn(
                stack_pointer_rtx, stack_pointer_rtx,
                RISCV_PROLOGUE_TEMP(Pmode)));

            /* Describe the effect of the previous instructions.  */
            insn = plus_constant(Pmode, stack_pointer_rtx, -size);
            insn = gen_rtx_SET(stack_pointer_rtx, insn);
            riscv_set_frame_expr(insn);
        }
    }
}
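
The two-step stack adjustment above can be modelled without any gcc internals. The sketch below is a simplified stand-in, assuming at least one register has to be saved. `plan_prologue`, `fits_simm12` and `first_step_limit` are hypothetical names, with `first_step_limit` playing the role of the value returned by riscv_first_stack_step.

```c
#include <stdbool.h>

/* How the total frame size is split across the prologue. */
typedef struct {
    long step1;            /* first "addi sp, sp, -step1" */
    long step2;            /* remaining adjustment, may be 0 */
    bool step2_needs_temp; /* true if -step2 does not fit an addi immediate */
} prologue_plan;

/* A signed 12-bit immediate covers -2048..2047. */
static bool fits_simm12(long value) {
    return value >= -2048 && value <= 2047;
}

/* Split total_size into a first step, capped by first_step_limit, and a
 * remainder. A remainder outside the addi range has to go through a
 * temporary register. */
prologue_plan plan_prologue(long total_size, long first_step_limit) {
    prologue_plan p;
    p.step1 = total_size < first_step_limit ? total_size : first_step_limit;
    p.step2 = total_size - p.step1;
    p.step2_needs_temp = p.step2 > 0 && !fits_simm12(-p.step2);
    return p;
}
```

For a small frame the whole adjustment happens in the first step and the remainder is zero. Only a large frame reaches the branch that loads the remainder into a temporary register.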

1.4.5.2. epilogue

1.4.5.3. backtrace

#0  gen_prologue () at ../.././riscv-gcc/gcc/config/riscv/riscv.md:2202
#1  0x000000000153aada in target_gen_prologue () at ../.././riscv-gcc/gcc/config/riscv/sync.md:558
#2  0x0000000000d017d6 in make_prologue_seq () at ../.././riscv-gcc/gcc/function.c:5801
#3  0x0000000000d01d8b in thread_prologue_and_epilogue_insns () at ../.././riscv-gcc/gcc/function.c:6019
#4  0x0000000000d02cbb in rest_of_handle_thread_prologue_and_epilogue () at ../.././riscv-gcc/gcc/function.c:6510
#5  0x0000000000d02ecf in (anonymous namespace)::pass_thread_prologue_and_epilogue::execute (this=0x23416c0) at ../.././riscv-gcc/gcc/function.c:6586
#6  0x0000000000fdd7d1 in execute_one_pass (pass=0x23416c0) at ../.././riscv-gcc/gcc/passes.c:2567
#7  0x0000000000fddb1e in execute_pass_list_1 (pass=0x23416c0) at ../.././riscv-gcc/gcc/passes.c:2656
#8  0x0000000000fddb4f in execute_pass_list_1 (pass=0x2341480) at ../.././riscv-gcc/gcc/passes.c:2657
#9  0x0000000000fddb4f in execute_pass_list_1 (pass=0x2340220) at ../.././riscv-gcc/gcc/passes.c:2657
#10 0x0000000000fddbab in execute_pass_list (fn=0x7ffff777e000, pass=0x233c270) at ../.././riscv-gcc/gcc/passes.c:2667
#11 0x0000000000b6f617 in cgraph_node::expand (this=0x7ffff7780000) at ../.././riscv-gcc/gcc/cgraphunit.c:1830
#12 0x0000000000b6fd15 in cgraph_order_sort::process (this=0x2333788) at ../.././riscv-gcc/gcc/cgraphunit.c:2069
#13 0x0000000000b6ffce in output_in_order () at ../.././riscv-gcc/gcc/cgraphunit.c:2137
#14 0x0000000000b705a6 in symbol_table::compile (this=0x7ffff7670000) at ../.././riscv-gcc/gcc/cgraphunit.c:2355
#15 0x0000000000b709d2 in symbol_table::finalize_compilation_unit (this=0x7ffff7670000) at ../.././riscv-gcc/gcc/cgraphunit.c:2539
#16 0x000000000110d05f in compile_file () at ../.././riscv-gcc/gcc/toplev.c:482
#17 0x00000000011100c6 in do_compile () at ../.././riscv-gcc/gcc/toplev.c:2201
#18 0x00000000011103e8 in toplev::main (this=0x7fffffffc176, argc=14, argv=0x7fffffffc288) at ../.././riscv-gcc/gcc/toplev.c:2340
#19 0x0000000001bf4bf5 in main (argc=14, argv=0x7fffffffc288) at ../.././riscv-gcc/gcc/main.c:39

1.4.6. pass_ira

  1. Register allocation
  2. Computes frame_info, e.g. the frame size and the registers that have to be saved and restored (mask)

1.4.6.1. frame_info

The target has to define the INITIAL_ELIMINATION_OFFSET macro, which is used to compute the frame info.

static void riscv_compute_frame_info(void) {
    struct riscv_frame_info *frame;
    HOST_WIDE_INT offset;
    bool interrupt_save_prologue_temp = false;
    unsigned int regno, i, num_x_saved = 0, num_f_saved = 0;

    frame = &cfun->machine->frame;
    memset(frame, 0, sizeof(*frame));

    /* NOTE: naked function */
    if (!cfun->machine->naked_p) {
        for (regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
            /* NOTE: riscv_save_reg_p decides whether a register has to be
             * saved, e.g. if regno is s0~ */
            if (riscv_save_reg_p(regno))
                frame->mask |= 1 << (regno - GP_REG_FIRST), num_x_saved++;

        /* Find out which FPRs we need to save.  This loop must iterate over
           the same space as its companion in riscv_for_each_saved_reg.  */
        if (TARGET_HARD_FLOAT)
            for (regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
                if (riscv_save_reg_p(regno))
                    frame->fmask |= 1 << (regno - FP_REG_FIRST), num_f_saved++;
    }

    /* NOTE: offsets are computed from low to high stack addresses */
    /* NOTE: 1. the top of the stack holds the outgoing args, i.e. args
     * spilled to the stack */
    offset = RISCV_STACK_ALIGN(crtl->outgoing_args_size);
    /* NOTE: 2. then the local variables */
    offset += RISCV_STACK_ALIGN(get_frame_size());
    /* The virtual frame pointer points above the local variables. */
    frame->frame_pointer_offset = offset;
    /* NOTE: 3. the floating-point registers to save */
    if (frame->fmask)
        offset += RISCV_STACK_ALIGN(num_f_saved * UNITS_PER_FP_REG);
    frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
    /* NOTE: 4. the general-purpose registers to save */
    if (frame->mask) {
        unsigned x_save_size = RISCV_STACK_ALIGN(num_x_saved * UNITS_PER_WORD);
        offset += x_save_size;
    }
    frame->gp_sp_offset = offset - UNITS_PER_WORD;
    frame->hard_frame_pointer_offset = offset;
    frame->total_size = offset;
}
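
The offset arithmetic can be followed with plain numbers. The sketch below mirrors the four steps above, assuming a 16-byte stack alignment for RISCV_STACK_ALIGN. `stack_align`, `struct frame_layout` and `layout_frame` are hypothetical names, and the register sizes are passed in as parameters instead of using UNITS_PER_FP_REG and UNITS_PER_WORD.

```c
/* Stand-in for RISCV_STACK_ALIGN, assuming 16-byte stack alignment. */
static long stack_align(long bytes) {
    return (bytes + 15) & ~15L;
}

/* The offsets that the frame computation produces. */
struct frame_layout {
    long frame_pointer_offset; /* top of the local variables */
    long fp_sp_offset;         /* offset of the topmost saved FPR from sp */
    long gp_sp_offset;         /* offset of the topmost saved GPR from sp */
    long total_size;           /* full frame size */
};

/* Walk the frame from low to high addresses: outgoing args, locals,
 * saved floating-point registers, saved general-purpose registers. */
struct frame_layout layout_frame(long outgoing_args, long locals,
                                 int num_f_saved, int num_x_saved,
                                 long fp_reg_bytes, long word_bytes) {
    struct frame_layout f;
    long offset = stack_align(outgoing_args);
    offset += stack_align(locals);
    f.frame_pointer_offset = offset;
    if (num_f_saved)
        offset += stack_align(num_f_saved * fp_reg_bytes);
    f.fp_sp_offset = offset - fp_reg_bytes;
    if (num_x_saved)
        offset += stack_align(num_x_saved * word_bytes);
    f.gp_sp_offset = offset - word_bytes;
    f.total_size = offset;
    return f;
}
```

For example, with 20 bytes of locals, no outgoing args, no saved FPRs and two saved 8-byte GPRs, the locals round up to 32 bytes, the GPR save area takes another 16, and the frame is 48 bytes in total.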

1.4.6.1.1. naked

A naked function is a function for which gcc does not need to generate a prologue and epilogue, e.g. a function written entirely in inline assembly.

1.4.6.1.2. riscv_save_reg_p

static bool riscv_save_reg_p(unsigned int regno) {
    /* NOTE: global_regs refers to global register variables
     * https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html
     * */
    bool call_saved = !global_regs[regno] && !call_used_or_fixed_reg_p(regno);
    /* NOTE: df_regs_ever_live_p means the reg is modified in the current
     * function. For example, in a leaf function ra is never modified, so it
     * does not need to be saved */
    bool might_clobber =
        crtl->saves_all_registers || df_regs_ever_live_p(regno);

    if (call_saved && might_clobber) return true;

    if (regno == HARD_FRAME_POINTER_REGNUM && frame_pointer_needed) return true;

    if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return) return true;

    /* ... */
    return false;
}

inline bool call_used_or_fixed_reg_p(unsigned int regno) {
    return fixed_regs[regno] || this_target_hard_regs->x_call_used_regs[regno];
}

FIXED_REGISTERS and CALL_USED_REGISTERS are defined by the target, e.g.:

#define FIXED_REGISTERS                                                       \
    { /* General registers.  */                                               \
        1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  \
            0, 0, 0, 0, 0, 0, 0, 0, 0, /* Floating-point registers.  */       \
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* Others.  */                      \
            1, 1                                                              \
    }

/* a0-a7, t0-t6, fa0-fa7, and ft0-ft11 are volatile across calls.
   The call RTLs themselves clobber ra.  */

#define CALL_USED_REGISTERS                                                   \
    { /* General registers.  */                                               \
        1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,  \
            0, 0, 0, 0, 0, 1, 1, 1, 1, /* Floating-point registers.  */       \
            1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, \
            0, 0, 0, 0, 0, 0, 1, 1, 1, 1, /* Others.  */                      \
            1, 1                                                              \
    }

FIXED_REGISTERS are registers with a special meaning that must never be modified arbitrarily, e.g. zero, sp, gp, tp.

CALL_USED_REGISTERS contains FIXED_REGISTERS plus the other registers that may be clobbered across a function call. Roughly, (CALL_USED_REGISTERS - FIXED_REGISTERS) are the caller-saved regs, and ~CALL_USED_REGISTERS are the callee-saved regs.

In addition, the gcc command-line option `-ffixed-reg` controls which regs belong to FIXED_REGISTERS. FIXED_REGISTERS are not allocated by ira and are not saved by the prologue.
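
The relationship between the two tables and the save decision can be sketched in a few lines. The arrays below copy only the general-register part of the tables quoted above. The function names are made up, and `gpr_needs_save` keeps just the `call_saved && might_clobber` condition of riscv_save_reg_p, leaving out the frame pointer, eh_return and global register cases.

```c
#include <stdbool.h>

/* General-register part (x0..x31) of FIXED_REGISTERS. */
static const unsigned char fixed_gpr[32] = {
    1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

/* General-register part (x0..x31) of CALL_USED_REGISTERS. */
static const unsigned char call_used_gpr[32] = {
    1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1,
    1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1};

/* Callee-saved: neither fixed nor call-used. */
bool gpr_is_callee_saved(unsigned regno) {
    return !fixed_gpr[regno] && !call_used_gpr[regno];
}

/* Simplified save decision: a callee-saved register is saved only if
 * the function actually touches it. */
bool gpr_needs_save(unsigned regno, bool ever_live) {
    return gpr_is_callee_saved(regno) && ever_live;
}

/* Number of general registers the prologue may have to save. */
int count_callee_saved_gprs(void) {
    int count = 0;
    for (unsigned regno = 0; regno < 32; regno++)
        if (gpr_is_callee_saved(regno))
            count++;
    return count;
}
```

With these tables, x8 (s0) is saved only when it is live, x10 (a0) is never saved by the callee, and x1 (ra) counts as callee-saved, which matches the note that a leaf function does not need to save ra.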

1.4.6.1.3. backtrace

#0  riscv_save_reg_p (regno=0) at ../.././riscv-gcc/gcc/config/riscv/riscv.c:3636
#1  0x0000000001544432 in riscv_compute_frame_info () at ../.././riscv-gcc/gcc/config/riscv/riscv.c:3766
#2  0x000000000154483f in riscv_initial_elimination_offset (from=64, to=2) at ../.././riscv-gcc/gcc/config/riscv/riscv.c:3850
#3  0x0000000001062641 in set_initial_elim_offsets () at ../.././riscv-gcc/gcc/reload1.c:3769
#4  0x000000000105c432 in calculate_elim_costs_all_insns () at ../.././riscv-gcc/gcc/reload1.c:1559
#5  0x0000000000e8eee5 in ira_costs () at ../.././riscv-gcc/gcc/ira-costs.c:2296
#6  0x0000000000e85524 in ira_build () at ../.././riscv-gcc/gcc/ira-build.c:3426
#7  0x0000000000e7b6ae in ira (f=0x0) at ../.././riscv-gcc/gcc/ira.c:5655
#8  0x0000000000e7bf6b in (anonymous namespace)::pass_ira::execute (this=0x23413c0) at ../.././riscv-gcc/gcc/ira.c:5978
#9  0x0000000000fdd7d1 in execute_one_pass (pass=0x23413c0) at ../.././riscv-gcc/gcc/passes.c:2567
#10 0x0000000000fddb1e in execute_pass_list_1 (pass=0x23413c0) at ../.././riscv-gcc/gcc/passes.c:2656
#11 0x0000000000fddb4f in execute_pass_list_1 (pass=0x2340220) at ../.././riscv-gcc/gcc/passes.c:2657
#12 0x0000000000fddbab in execute_pass_list (fn=0x7ffff777e000, pass=0x233c270) at ../.././riscv-gcc/gcc/passes.c:2667
#13 0x0000000000b6f617 in cgraph_node::expand (this=0x7ffff7780000) at ../.././riscv-gcc/gcc/cgraphunit.c:1830
#14 0x0000000000b6fd15 in cgraph_order_sort::process (this=0x2333788) at ../.././riscv-gcc/gcc/cgraphunit.c:2069
#15 0x0000000000b6ffce in output_in_order () at ../.././riscv-gcc/gcc/cgraphunit.c:2137
#16 0x0000000000b705a6 in symbol_table::compile (this=0x7ffff7670000) at ../.././riscv-gcc/gcc/cgraphunit.c:2355
#17 0x0000000000b709d2 in symbol_table::finalize_compilation_unit (this=0x7ffff7670000) at ../.././riscv-gcc/gcc/cgraphunit.c:2539
#18 0x000000000110d05f in compile_file () at ../.././riscv-gcc/gcc/toplev.c:482
#19 0x00000000011100c6 in do_compile () at ../.././riscv-gcc/gcc/toplev.c:2201
#20 0x00000000011103e8 in toplev::main (this=0x7fffffffc176, argc=14, argv=0x7fffffffc288) at ../.././riscv-gcc/gcc/toplev.c:2340
#21 0x0000000001bf4bf5 in main (argc=14, argv=0x7fffffffc288) at ../.././riscv-gcc/gcc/main.c:39

Backlinks

GCC Backend (GCC Backend > register allocation): pass_ira and pass_reload

GCC Target Hook (GCC Target Hook > register related): pass_ira uses these macros

1.4.7. pass_reload

1.4.8. pass_sched

1.4.10. pass_final

Emits assembly.

Backlinks

GCC Backend (GCC Backend > code emission): rtl is finally turned into assembly by pass_final

1.5. others

Backlinks

GCC (GCC > Pass): Pass

Author: [email protected]
Date: 2022-04-15 Fri 14:04
Last updated: 2023-11-30 Thu 19:37

Creative Commons License