binutils

1. binutils

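This analysis is based on an early version of the RISC-V binutils code, so some details may differ from current sources. In recent code, mips_ip has been renamed to riscv_ip, and riscv_builtin_opcodes to riscv_opcodes.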

1.1. opcodes

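as, objdump, and gdb all need two kinds of information:

the assembly syntax of each opcode
the encoding of each opcode

binutils therefore implements both once, in opcodes. It defines the opcode format and the functions that operate on it, and provides a mechanism that simplifies adding new instructions.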

1.1.1. riscv_opcodes

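When adding a new instruction to as/objdump, most of the work is just editing riscv_builtin_opcodes (unless the instruction introduces a new format, in which case the corresponding encode/decode methods must be implemented as well), because as and objdump drive all of their encode/decode operations from this data structure.

as/objdump/gdb all use opcodes; qemu/spike/gem5 each define a similar mechanism of their own.

Note that the order of the entries in this table is not arbitrary: both the assembler and the disassembler need to look up entries quickly, see op_hash and mips_hash.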

const struct riscv_opcode riscv_builtin_opcodes[] = {
    {"unimp", "I", "", 0, 0xffff, match_opcode, 0},
    {"nop", "I", "", MATCH_ADDI, MASK_ADDI | MASK_RD | MASK_RS1 | MASK_IMM,
     match_opcode, INSN_ALIAS},
    {"li", "I", "d,j", MATCH_ADDI, MASK_ADDI | MASK_RS1, match_opcode,
     INSN_ALIAS | WR_xd}, /* addi */
    {"li", "I", "d,I", 0, (int)M_LI, match_never, INSN_MACRO},
    {"mv", "I", "d,s", MATCH_ADDI, MASK_ADDI | MASK_IMM, match_opcode,
     INSN_ALIAS | WR_xd | RD_xs1},
    {"move", "I", "d,s", MATCH_ADDI, MASK_ADDI | MASK_IMM, match_opcode,
     INSN_ALIAS | WR_xd | RD_xs1},
    {"b", "I", "p", MATCH_BEQ, MASK_BEQ | MASK_RS1 | MASK_RS2, match_opcode,
     0}, /* beq 0,0 */
    {"andi", "I", "d,s,j", MATCH_ANDI, MASK_ANDI, match_opcode, WR_xd | RD_xs1},
    {"and", "I", "d,s,t", MATCH_AND, MASK_AND, match_opcode,
     WR_xd | RD_xs1 | RD_xs2},
    {"and", "I", "d,s,j", MATCH_ANDI, MASK_ANDI, match_opcode,
     INSN_ALIAS | WR_xd | RD_xs1},
    /* ... */
    {"addi", "I", "d,s,j", MATCH_ADDI, MASK_ADDI, match_opcode, WR_xd | RD_xs1},
    {"add", "I", "d,s,t", MATCH_ADD, MASK_ADD, match_opcode,
     WR_xd | RD_xs1 | RD_xs2},
    {"add", "I", "d,s,t,0", MATCH_ADD, MASK_ADD, match_opcode,
     WR_xd | RD_xs1 | RD_xs2},
    {"add", "I", "d,s,j", MATCH_ADDI, MASK_ADDI, match_opcode,
     INSN_ALIAS | WR_xd | RD_xs1},
    // ...
};

1.1.1.1. INSN

{"addi", "I", "d,s,j", MATCH_ADDI, MASK_ADDI, match_opcode, WR_xd | RD_xs1}

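"I" indicates that the instruction belongs to the I (base integer) subset
"d,s,j" means the assembly operands take the form rd, rs1, imm

When assembling, `mips_ip` uses this string to decide how the operands are encoded into the binary instruction. When disassembling, `print_insn_args` uses it to decode the bits of the binary instruction back into operands. Both functions also rely on OP_MASK_XXX (for example OP_MASK_RS1) to map `d`, `s`, etc. in the operand template to bit positions in the instruction
MATCH_ADDI is 0x13, i.e. 0b0010011, the value of the low 7-bit opcode field of addi
MASK_ADDI is 0x707f, i.e. 0b111_00000_1111111; applying this mask strips the variable parts of the instruction: rd, rs1, imm
match_opcode is a function that decides whether a machine instruction matches this template; its implementation is `return (insn & op->mask) == op->match`

The template serves two purposes:

When assembling, `addi` is matched to this template by name (via op_hash). MATCH_ADDI supplies the fixed bits of the binary instruction (those selected by MASK_ADDI), and `d,s,j` together with OP_MASK_RS1/RD and the actual operands fills in the operand bits
When disassembling, MATCH_ADDI, MASK_ADDI and match_opcode select this template (via mips_hash), and `d,s,j` together with OP_MASK produces the operands of the assembly instruction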
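
The two directions described above can be sketched in a few lines of C. This is a minimal illustration, not the binutils code: `struct tmpl`, `tmpl_matches`, `encode_addi` and the `get_*` helpers are hypothetical names, and only the MATCH_ADDI/MASK_ADDI values and the I-type field positions come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the text above. */
#define MATCH_ADDI 0x13u
#define MASK_ADDI  0x707fu

/* Hypothetical, trimmed-down template: only the fields this sketch needs. */
struct tmpl {
    const char *name;
    uint32_t match;
    uint32_t mask;
};

/* Same test as match_opcode: compare only the bits the mask keeps. */
static int tmpl_matches(const struct tmpl *t, uint32_t insn) {
    return (insn & t->mask) == t->match;
}

/* Disassemble direction: pull the operand fields out of a 32-bit I-type. */
static unsigned get_rd(uint32_t insn)  { return (insn >> 7) & 0x1f; }
static unsigned get_rs1(uint32_t insn) { return (insn >> 15) & 0x1f; }
static int32_t get_itype_imm(uint32_t insn) {
    /* imm[11:0] lives in bits 31:20; sign-extend the 12-bit value. */
    int32_t v = (int32_t)((insn >> 20) & 0xfff);
    return (v ^ 0x800) - 0x800;
}

/* Assemble direction: start from the fixed bits, then OR in the operands. */
static uint32_t encode_addi(unsigned rd, unsigned rs1, int32_t imm) {
    assert(rd < 32 && rs1 < 32 && imm >= -2048 && imm <= 2047);
    return MATCH_ADDI | (rd << 7) | (rs1 << 15) | ((uint32_t)imm << 20);
}
```

For instance, `encode_addi(10, 10, 10)` yields the word for `addi a0,a0,10`, and feeding that word to `tmpl_matches` with an addi template succeeds while a branch word does not.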

1.1.1.2. INSN_ALIAS

{"li", "I", "d,j", MATCH_ADDI, MASK_ADDI | MASK_RS1, match_opcode, INSN_ALIAS | WR_xd}

INSN_ALIAS marks li as an alias. The MATCH_ADDI and MASK_ADDI | MASK_RS1 that follow make it match addi rd, zero, imm. Because li appears before addi in the table, objdump prints li rd, imm rather than addi rd, zero, imm for such an instruction. objdump -Mno-aliases skips every template flagged INSN_ALIAS.
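
A minimal sketch of why table order decides between li and addi. The two-entry `table`, `FLAG_ALIAS` and `mnemonic` are hypothetical stand-ins for riscv_opcodes, INSN_ALIAS and the disassembler loop; MASK_RS1 is assumed to be the 5-bit rs1 field at bit 15.

```c
#include <stddef.h>
#include <stdint.h>

#define MATCH_ADDI 0x13u
#define MASK_ADDI  0x707fu
#define MASK_RS1   (0x1fu << 15)
#define FLAG_ALIAS 1u /* stand-in for INSN_ALIAS */

struct tmpl {
    const char *name;
    uint32_t match, mask;
    unsigned flags;
};

/* Order matters: the alias sits before the instruction it abbreviates. */
static const struct tmpl table[] = {
    {"li",   MATCH_ADDI, MASK_ADDI | MASK_RS1, FLAG_ALIAS},
    {"addi", MATCH_ADDI, MASK_ADDI,            0},
};

/* Return the first matching mnemonic, optionally skipping aliases. */
static const char *mnemonic(uint32_t insn, int no_aliases) {
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        const struct tmpl *t = &table[i];
        if (no_aliases && (t->flags & FLAG_ALIAS))
            continue;
        if ((insn & t->mask) == t->match)
            return t->name;
    }
    return NULL;
}
```

With the word for `addi a0,zero,5` (0x00500513), `mnemonic(w, 0)` picks `li` and `mnemonic(w, 1)` falls through to `addi`; a word whose rs1 is not zero never matches the li entry because MASK_RS1 keeps those bits in the comparison.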

1.1.1.3. INSN_MACRO

{"li", "I", "d,I", 0, (int)M_LI, match_never, INSN_MACRO}

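INSN_ALIAS and INSN_MACRO together make up the pseudo-instructions that as supports.

This li entry is the pseudo-instruction for loading a 32-bit immediate, identified by INSN_MACRO. Since it may expand to two instructions (lui, addi), a single machine instruction can never be matched back to this template when disassembling, so match_never skips it. This is why objdump cannot recover some pseudo-instructions.

Although a macro is useless for disassembly, the assembler still uses it.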
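
To make the "may expand to two instructions" point concrete, here is a hedged sketch of how a 32-bit constant can be split into a lui part and an addi part. `split_const` and `li_insn_count` are hypothetical helpers written for this note; they mirror the idea of the 32-bit path of load_const shown later, not its exact code.

```c
#include <stdint.h>

/* Split a 32-bit constant so that (hi20 << 12) + lo12 == value, where lo12
 * is the signed 12-bit immediate that addi can add. */
struct hi_lo {
    uint32_t hi20;
    int32_t lo12;
};

static struct hi_lo split_const(int32_t value) {
    struct hi_lo r;
    uint32_t u = (uint32_t)value;
    /* Sign-extend the low 12 bits: addi adds a value in [-2048, 2047]. */
    r.lo12 = (int32_t)((u & 0xfff) ^ 0x800) - 0x800;
    /* Compensate in the upper part when lo12 came out negative. */
    r.hi20 = ((u - (uint32_t)r.lo12) >> 12) & 0xfffff;
    return r;
}

/* Number of instructions this sketch would emit for li: 1 or 2. */
static int li_insn_count(int32_t value) {
    struct hi_lo r = split_const(value);
    if (r.hi20 == 0)
        return 1;               /* addi rd, zero, lo12 */
    return r.lo12 == 0 ? 1 : 2; /* lui only, or lui followed by addi */
}
```

A small constant such as 5 needs one addi, 0x1000 needs one lui, and 0x12345678 needs both, which is exactly the case a single-instruction matcher cannot undo.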

1.1.2. encode

When as encodes through opcodes it uses the riscv_opcodes table, see md_assemble

1.1.3. decode

objdump uses functions provided by opcodes, such as print_insn_riscv

1.1.3.1. print_insn_riscv

int print_insn_riscv(bfd_vma memaddr, struct disassemble_info *info) {
    uint16_t i2;
    /* insn_t is uint64_t: a RISC-V instruction is at most 64 bits long here */
    insn_t insn = 0;
    bfd_vma n;
    int status;

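    /* The disassembler first reads 2 bytes, then uses riscv_insn_length to
     * decide whether more data is needed. RISC-V instructions are 2 to 8
     * bytes long, but usually 4 bytes.
     *
     * Because RISC-V instructions are little-endian (low-order bytes at the
     * lowest addresses), riscv_insn_length can tell from the low 7 opcode
     * bits whether this is a 16-bit compressed instruction, a 32-bit
     * instruction, or a 48/64-bit extended instruction.
     */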
    for (n = 0; n < sizeof(insn) && n < riscv_insn_length(insn); n += 2) {
        status =
            (*info->read_memory_func)(memaddr + n, (bfd_byte *)&i2, 2, info);
        if (status != 0) {
            if (n > 0) /* Don't fail just because we fell off the end. */
                break;
            (*info->memory_error_func)(status, memaddr, info);
            return status;
        }

        i2 = bfd_getl16(&i2);
        /* little endian */
        insn |= (insn_t)i2 << (8 * n);
    }

    /* Modelled on the MIPS code but adapted for RISC-V, since the two are very similar */
    return print_insn_mips(memaddr, insn, info);
}
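
The length rule that the loop above relies on can be sketched as follows. This follows the instruction-length encoding of the RISC-V base ISA specification; `insn_length` and `fetch_insn` are hypothetical names, and returning 2 for longer or reserved encodings is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes implied by the lowest bits of the first 16-bit parcel. */
static unsigned insn_length(uint64_t insn) {
    if ((insn & 0x3) != 0x3)   return 2; /* compressed */
    if ((insn & 0x1f) != 0x1f) return 4; /* standard 32-bit */
    if ((insn & 0x3f) == 0x1f) return 6; /* 48-bit */
    if ((insn & 0x7f) == 0x3f) return 8; /* 64-bit */
    return 2; /* longer or reserved: give up after one parcel */
}

/* Read one instruction from a little-endian byte buffer, 2 bytes at a time.
 * While insn is still 0 the length reads as 2, so the first parcel is
 * always fetched; after that the real low bits decide how far to go. */
static uint64_t fetch_insn(const uint8_t *bytes, size_t avail, unsigned *len) {
    uint64_t insn = 0;
    size_t n;
    assert(avail >= 2);
    for (n = 0; n < sizeof insn && n < insn_length(insn) && n + 2 <= avail;
         n += 2) {
        uint16_t parcel = (uint16_t)(bytes[n] | (bytes[n + 1] << 8));
        insn |= (uint64_t)parcel << (8 * n);
    }
    *len = insn_length(insn);
    return insn;
}
```

Feeding it the bytes `13 05 a5 00` returns the 32-bit word 0x00a50513 with a length of 4, while `01 00` stops after one parcel with a length of 2.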

1.1.3.2. print_insn_mips

static int print_insn_mips(
    bfd_vma memaddr, insn_t word, disassemble_info *info) {
    const struct riscv_opcode *op;
    static bfd_boolean init = 0;
    static const char *extension = NULL;
    static const struct riscv_opcode *mips_hash[OP_MASK_OP + 1];
    struct riscv_private_data *pd;
    int insnlen;

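    /* mips_hash is not really a hash table; it works as a jump table:
     *
     * riscv_opcodes holds NUMOPCODES templates. Matching an insn would
     * otherwise mean testing the templates one by one from the start. To
     * avoid that, the first position of every low-7-bit opcode value in
     * riscv_opcodes is recorded; each lookup jumps to that position by
     * opcode and searches linearly from there.
     */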
    if (!init) {
        unsigned int i;
        unsigned int e_flags = elf_elfheader(info->section->owner)->e_flags;
        extension = riscv_elf_flag_to_name(EF_GET_RISCV_EXT(e_flags));

        for (i = 0; i <= OP_MASK_OP; i++)
            for (op = riscv_opcodes; op < &riscv_opcodes[NUMOPCODES]; op++)
                if (i == ((op->match >> OP_SH_OP) & OP_MASK_OP)) {
                    mips_hash[i] = op;
                    break;
                }

        init = 1;
    }

    insnlen = riscv_insn_length(word);

    /* ... */

    op = mips_hash[(word >> OP_SH_OP) & OP_MASK_OP];
    if (op != NULL) {
        for (; op < &riscv_opcodes[NUMOPCODES]; op++) {
            if ((op->match_func)(op, word) &&
                /* handle the -Mno-aliases option */
                !(no_aliases && (op->pinfo & INSN_ALIAS)) &&
                !(op->subset[0] == 'X' && strcmp(op->subset, extension))) {
                (*info->fprintf_func)(info->stream, "%s", op->name);
                /* print_insn_args prints the operands that follow, driven by
                 * the template's fmt string such as "d,s" */
                print_insn_args(op->args, word, memaddr, info);
                if (pd->print_addr != (bfd_vma)-1) {
                    info->target = pd->print_addr;
                    (*info->fprintf_func)(info->stream, " # ");
                    (*info->print_address_func)(info->target, info);
                    pd->print_addr = -1;
                }
                return insnlen;
            }
        }
    }

    /* Handle undefined instructions.  */
    info->insn_type = dis_noninsn;
    (*info->fprintf_func)(info->stream, "0x%llx", (unsigned long long)word);
    return insnlen;
}
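
The jump-table idea behind mips_hash can be sketched on a four-entry table. Everything here is a hypothetical miniature: `table` stands in for riscv_opcodes, `first_by_opcode` for mips_hash, and the index is built in a single pass instead of the nested loop used above.

```c
#include <stddef.h>
#include <stdint.h>

#define OPC_MASK 0x7fu /* low 7 bits: the major opcode */

struct tmpl {
    const char *name;
    uint32_t match, mask;
};

/* A tiny stand-in for riscv_opcodes. */
static const struct tmpl table[] = {
    {"addi", 0x0013u, 0x707fu},
    {"andi", 0x7013u, 0x707fu},
    {"blt",  0x4063u, 0x707fu},
    {"bge",  0x5063u, 0x707fu},
};
#define NTMPL (sizeof table / sizeof table[0])

/* For each opcode value, the first template that uses it (or NULL). */
static const struct tmpl *first_by_opcode[OPC_MASK + 1];

static void build_index(void) {
    for (size_t i = 0; i < NTMPL; i++) {
        unsigned opc = table[i].match & OPC_MASK;
        if (first_by_opcode[opc] == NULL)
            first_by_opcode[opc] = &table[i];
    }
}

/* Jump to the first candidate, then scan linearly to the end of the table. */
static const struct tmpl *lookup(uint32_t insn) {
    const struct tmpl *t = first_by_opcode[insn & OPC_MASK];
    if (t == NULL)
        return NULL;
    for (; t < table + NTMPL; t++)
        if ((insn & t->mask) == t->match)
            return t;
    return NULL;
}
```

After `build_index()`, a bge word starts its scan at the blt entry rather than at addi, and a word whose opcode never appears in the table is rejected without scanning at all.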

1.1.3.3. print_insn_args

static void print_insn_args(
    const char *d, insn_t l, bfd_vma pc, disassemble_info *info) {
    struct riscv_private_data *pd = info->private_data;
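    /* Per the RISC-V ISA, the position and width of rs1/rs2/rd are the same in
     * every 32-bit instruction format:
     *
     * OP_SH_RS1 is 15 and OP_MASK_RS1 is 0x1f (0b11111), so rs1 is the 5
     * consecutive bits starting at bit 15, and rd is the 5 consecutive bits
     * starting at bit 7
     */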
    int rs1 = (l >> OP_SH_RS1) & OP_MASK_RS1;
    int rd = (l >> OP_SH_RD) & OP_MASK_RD;

    if (*d != '\0') (*info->fprintf_func)(info->stream, "\t");

    /* parse the args fmt of the template, e.g. "d,s,j" for addi */
    for (; *d != '\0'; d++) {
        switch (*d) {
            case ',':
            case '(':
            case ')':
            case '[':
            case ']':
                (*info->fprintf_func)(info->stream, "%c", *d);
                break;

            case 'b':
            case 's':
                /* s stands for rs1 */
                (*info->fprintf_func)(info->stream, "%s", mips_gpr_names[rs1]);
                break;

            case 't':
                /* t stands for rs2 */
                (*info->fprintf_func)(
                    info->stream, "%s",
                    mips_gpr_names[(l >> OP_SH_RS2) & OP_MASK_RS2]);
                break;

            case 'j':
                if ((l & MASK_ADDI) == MATCH_ADDI)
                    maybe_print_address(pd, rs1, EXTRACT_ITYPE_IMM(l));
                /*
                 * j stands for the immediate of an I-type instruction. RV32I
                 * encodes immediates differently per instruction format, so
                 * EXTRACT_{I,S,SB,U,UJ}TYPE_IMM(l) extracts the immediate
                 * from insn */
                (*info->fprintf_func)(
                    info->stream, "%d", (int)EXTRACT_ITYPE_IMM(l));
                break;

            case 'd':
                if ((l & MASK_AUIPC) == MATCH_AUIPC)
                    pd->hi_addr[rd] =
                        pc + (EXTRACT_UTYPE_IMM(l) << RISCV_IMM_BITS);
                else if ((l & MASK_LUI) == MATCH_LUI)
                    pd->hi_addr[rd] = EXTRACT_UTYPE_IMM(l) << RISCV_IMM_BITS;
                /*
                 * If the rd read from insn is 2, mips_gpr_names[rd] is x2 or
                 * sp; objdump -M gpr-names=numeric or -M gpr-names=32
                 * switches between the two naming schemes */
                (*info->fprintf_func)(info->stream, "%s", mips_gpr_names[rd]);
                break;

            /* ... */

            default:
                /* xgettext:c-format */
                (*info->fprintf_func)(
                    info->stream,
                    _("# internal error, undefined modifier (%c)"), *d);
                return;
        }
    }
}
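
A compact sketch of the same format-string-driven decoding, writing into a buffer instead of calling fprintf_func. `format_args` is a hypothetical helper; it handles only `d`, `s`, `t`, `j` and literal punctuation, and always prints numeric register names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Render the operands of `insn` according to a template such as "d,s,j".
 * Returns the number of characters written (excluding the terminator). */
static size_t format_args(const char *fmt, uint32_t insn, char *out,
                          size_t cap) {
    size_t used = 0;
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (; *fmt != '\0'; fmt++) {
        int n;
        switch (*fmt) {
        case 'd': /* rd: bits 11:7 */
            n = snprintf(out + used, cap - used, "x%u",
                         (unsigned)((insn >> 7) & 0x1f));
            break;
        case 's': /* rs1: bits 19:15 */
            n = snprintf(out + used, cap - used, "x%u",
                         (unsigned)((insn >> 15) & 0x1f));
            break;
        case 't': /* rs2: bits 24:20 */
            n = snprintf(out + used, cap - used, "x%u",
                         (unsigned)((insn >> 20) & 0x1f));
            break;
        case 'j': { /* I-type immediate: bits 31:20, sign-extended */
            int imm = (int)(((insn >> 20) & 0xfff) ^ 0x800) - 0x800;
            n = snprintf(out + used, cap - used, "%d", imm);
            break;
        }
        default: /* ',', '(' and ')' are copied through unchanged */
            n = snprintf(out + used, cap - used, "%c", *fmt);
            break;
        }
        if (n < 0 || (size_t)n >= cap - used)
            break; /* error or truncated output */
        used += (size_t)n;
    }
    return used;
}
```

Calling it with the template `"d,s,j"` and the word 0x00a50513 produces the text `x10,x10,10`; the same word with a different template string would be rendered differently, which is the whole point of keeping the format in the table.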

Backlinks

GDB Target Arch (GDB Target Arch > porting new arch > porting): gdb's disass depends on opcodes; reading core and exec data depends on bfd

Spike (Spike > interpreter > register): spike implements functionality similar to opcodes through riscv.mk.in and encoding.h.

decodetree (QEMU TCG > decodetree): qemu's decodetree is functionally similar to opcodes

1.2. as

1.2.1. md_assemble

void md_assemble(char *str) {
    struct mips_cl_insn insn;

    imm_expr.X_op = O_absent;
    offset_expr.X_op = O_absent;
    imm_reloc = BFD_RELOC_UNUSED;
    offset_reloc = BFD_RELOC_UNUSED;

    mips_ip(str, &insn);

    if (insn.insn_mo->pinfo == INSN_MACRO)
        macro(&insn);
    else {
        if (imm_expr.X_op != O_absent)
            append_insn(&insn, &imm_expr, imm_reloc);
        else if (offset_expr.X_op != O_absent)
            append_insn(&insn, &offset_expr, offset_reloc);
        else
            append_insn(&insn, NULL, BFD_RELOC_UNUSED);
    }
}

1.2.2. mips_ip

static void mips_ip(char *str, struct mips_cl_insn *ip) {
    char *s;
    const char *args;
    char c = 0;
    struct riscv_opcode *insn;
    char *argsStart;
    unsigned int regno;
    char save_c = 0;
    int argnum;
    unsigned int rtype;
    const struct percent_op_match *p;

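    /* op_hash is a hash table keyed by name, used to quickly find the
     * template for a mnemonic (such as addi). These are the same templates
     * objdump uses. Because several templates can share a name but differ in
     * arguments, op_hash stores only the first template for each name; the
     * lookup below then tries every template with a matching name in turn */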
    insn = (struct riscv_opcode *)hash_find(op_hash, str);

    argsStart = s;
    for (;;) {
        bfd_boolean ok = TRUE;

        /* create_insn first stores the insn's opcode (op->match) into ip; the
         * code below then fills in the args */
        create_insn(ip, insn);
        argnum = 1;
        for (args = insn->args;; ++args) {
            s += strspn(s, " \t");
            /* args is the template's operand string, s is the actual operand text */
            switch (*args) {
                case '\0': /* end of args */
                    if (*s == '\0') return;
                    break;

                case ',':
                    ++argnum;
                    if (*s++ == *args) continue;
                    s--;
                    break;

                case '(':
                case ')':
                case '[':
                case ']':
                    if (*s++ == *args) continue;
                    break;

                case 'd': /* destination register */
                case 's': /* source register */
                case 't': /* target register */
                    /* s points at the register name in the source that
                     * corresponds to this template operand, e.g. s0 or x0;
                     * reg_lookup finds its regno, e.g. 8 for s0 */
                    ok = reg_lookup(&s, RTYPE_NUM | RTYPE_GP, &regno);
                    if (ok) {
                        c = *args;
                        if (*s == ' ') ++s;

                        /* Now that we have assembled one operand, we use the
                         * args string to figure out where it goes in the
                         * instruction.  */
                        switch (c) {
                            case 's':
                                /* INSERT_OPERAND writes regno into the right
                                 * bits of ip, using OP_MASK_RS1 and OP_SH_RS1 */
                                INSERT_OPERAND(RS1, *ip, regno);
                                break;
                            case 'd':
                                INSERT_OPERAND(RD, *ip, regno);
                                break;
                            case 't':
                                INSERT_OPERAND(RS2, *ip, regno);
                                break;
                        }
                        continue;
                    }
                    break;

                case 'j': /* sign-extended immediate */
                    offset_reloc = BFD_RELOC_RISCV_LO12_I;
                    p = percent_op_itype;
                    goto alu_op;

                alu_op:
                    if (!my_getSmallExpression(
                            &offset_expr, &offset_reloc, s, p)) {
                        normalize_constant_expr(&offset_expr);
                        if (offset_expr.X_op != O_constant ||
                            (*args == '0' && offset_expr.X_add_number != 0) ||
                            offset_expr.X_add_number >=
                                (signed)RISCV_IMM_REACH / 2 ||
                            offset_expr.X_add_number <
                                -(signed)RISCV_IMM_REACH / 2)
                            break;
                    }
                    /* offset_expr.X_add_number will hold the immediate from
                     * the source. offset_expr is a global; append_insn, called
                     * later from md_assemble, encodes it into the instruction */
                    s = expr_end;
                    continue;
                /* ... */
                default:
                    as_bad(_("bad char = '%c'\n"), *args);
                    internalError();
            }
            break;
        }
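        /* The operands do not match the current template, so move on to the
         * next template with the same name. This is why templates sharing a
         * name must be adjacent in riscv_opcodes */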
        if (insn + 1 < &riscv_opcodes[NUMOPCODES] &&
            !strcmp(insn->name, insn[1].name)) {
            ++insn;
            s = argsStart;
            insn_error = _("illegal operands");
            continue;
        }
        if (save_c) *(--argsStart) = save_c;
        insn_error = _("illegal operands");
        return;
    }
}
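
The "try each same-named template in turn" loop can be reduced to the following sketch. It is a simplified assumption-laden model: `parse_xreg`, `fill_operands` and `assemble_add` are hypothetical, registers are accepted only as `x0`..`x31`, and immediates only as plain numbers (no symbols, no relocations).

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse "x<N>" with N in 0..31; advance *s on success. */
static int parse_xreg(const char **s, unsigned *regno) {
    char *end;
    long v;
    if ((*s)[0] != 'x' || (*s)[1] < '0' || (*s)[1] > '9')
        return 0;
    v = strtol(*s + 1, &end, 10);
    if (v > 31)
        return 0;
    *regno = (unsigned)v;
    *s = end;
    return 1;
}

/* Try to fit operand text `s` to template `args`. On success, OR the
 * operands into *insn, which already holds the template's match bits. */
static int fill_operands(const char *args, const char *s, uint32_t *insn) {
    uint32_t bits = *insn;
    for (; *args != '\0'; args++) {
        unsigned regno;
        char *end;
        long imm;
        while (*s == ' ' || *s == '\t')
            s++;
        switch (*args) {
        case ',':
            if (*s++ != ',') return 0;
            break;
        case 'd':
            if (!parse_xreg(&s, &regno)) return 0;
            bits |= regno << 7;
            break;
        case 's':
            if (!parse_xreg(&s, &regno)) return 0;
            bits |= regno << 15;
            break;
        case 't':
            if (!parse_xreg(&s, &regno)) return 0;
            bits |= regno << 20;
            break;
        case 'j':
            imm = strtol(s, &end, 0);
            if (end == s || imm < -2048 || imm > 2047) return 0;
            bits |= (uint32_t)imm << 20;
            s = end;
            break;
        default:
            return 0;
        }
    }
    while (*s == ' ' || *s == '\t')
        s++;
    if (*s != '\0')
        return 0; /* trailing junk: this template does not fit */
    *insn = bits;
    return 1;
}

/* Two adjacent templates share the name "add", as in the table above. */
static int assemble_add(const char *operands, uint32_t *out) {
    static const struct { const char *args; uint32_t match; } cand[] = {
        {"d,s,t", 0x33u}, /* add */
        {"d,s,j", 0x13u}, /* add with an immediate falls back to addi */
    };
    for (size_t i = 0; i < sizeof cand / sizeof cand[0]; i++) {
        uint32_t insn = cand[i].match;
        if (fill_operands(cand[i].args, operands, &insn)) {
            *out = insn;
            return 1;
        }
    }
    return 0; /* corresponds to "illegal operands" */
}
```

`assemble_add("x10,x11,x12", &w)` succeeds on the first template, while `assemble_add("x10, x10, 10", &w)` fails on it and succeeds on the second, producing an addi encoding.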

1.2.3. macro

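When mips_ip returns, the operand bits of insn have been set and imm_expr and offset_expr have been filled in. If insn is not a macro, append_insn either encodes imm_expr/offset_expr directly into insn or keeps the symbol and reloc_type for the link editor to resolve. If insn is a macro, it has to be handled separately by the macro function, which expands it into several insns.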

static void macro(struct mips_cl_insn *ip) {
    unsigned int rd, rs1, rs2;
    int mask;

    /* rd, rs1, rs2, imm_expr and offset_expr do not need to be parsed from the
     * source again; that was already done earlier */

    rd = (ip->insn_opcode >> OP_SH_RD) & OP_MASK_RD;
    rs1 = (ip->insn_opcode >> OP_SH_RS1) & OP_MASK_RS1;
    rs2 = (ip->insn_opcode >> OP_SH_RS2) & OP_MASK_RS2;
    mask = ip->insn_mo->mask;

    /* Taking li as an example:
     * {"li",        "I",   "d,I",  0,    (int) M_LI,  match_never, INSN_MACRO
     * },
     * */
    switch (mask) {
        case M_LI:
            /* load_const may expand to two instructions: lui & addi */
            load_const(rd, &imm_expr);
            break;

        case M_LA:
        case M_LLA:
            /* Load the address of a symbol into a register. */
            if (!IS_SEXT_32BIT_NUM(offset_expr.X_add_number))
                as_bad(_("offset too large"));

            if (offset_expr.X_op == O_constant)
                load_const(rd, &offset_expr);
            else if (is_pic && mask == M_LA) /* Global PIC symbol */
                pcrel_load(
                    rd, rd, &offset_expr, LOAD_ADDRESS_INSN,
                    BFD_RELOC_RISCV_GOT_HI20, BFD_RELOC_RISCV_GOT_LO12);
            else /* Local PIC symbol, or any non-PIC symbol */
                pcrel_load(
                    rd, rd, &offset_expr, "addi", BFD_RELOC_RISCV_PCREL_HI20,
                    BFD_RELOC_RISCV_PCREL_LO12_I);
            break;

        /* ... */
        case M_JUMP:
            rd = 0;
            goto do_call;
        case M_CALL:
            rd = LINK_REG;
        do_call:
            rs1 = reg_lookup_assert("t0", RTYPE_GP);
            riscv_call(rd, rs1, &offset_expr, offset_reloc);
            break;

        default:
            as_bad(_("Macro %s not implemented"), ip->insn_mo->name);
            break;
    }
}

static void load_const(int reg, expressionS *ep) {
    if (rv64 && !IS_SEXT_32BIT_NUM(ep->X_add_number)) {
        /* 64 bit */
        /* ... */
    } else {
        /* 32 bit */
        int hi_reg = ZERO;
        /* NOTE: compare this with RISCV_CONST_HIGH_PART */
        int32_t hi = ep->X_add_number & (RISCV_IMM_REACH - 1);
        hi = hi << (32 - RISCV_IMM_BITS) >> (32 - RISCV_IMM_BITS);
        hi = (int32_t)ep->X_add_number - hi;
        /* This corresponds to up to two instructions:
         * lui rd, %hi(imm)
         * addi rd, rd, %lo(imm)   (addi rd, zero, imm when the high part is 0)
         * */
        if (hi) {
            /* macro_build is similar to mips_ip */
            macro_build(ep, "lui", "d,u", reg, BFD_RELOC_RISCV_HI20);
            hi_reg = reg;
        }

        if ((ep->X_add_number & (RISCV_IMM_REACH - 1)) || hi_reg == ZERO)
            macro_build(
                ep, ADD32_INSN, "d,s,j", reg, hi_reg, BFD_RELOC_RISCV_LO12_I);
    }
}

static void riscv_call(
    int destreg, int tempreg, expressionS *ep, bfd_reloc_code_real_type reloc) {
    macro_build(ep, "auipc", "d,u", tempreg, reloc);
    macro_build(NULL, "jalr", "d,s", destreg, tempreg);
}

riscv_call emits `auipc+jalr`. The assembler cannot know how far the jump will reach, so auipc+jalr is the safest choice; at link time, linker relaxation may replace it with jal.

1.2.4. append_insn

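The main job of append_insn is to decide, based on reloc_type, how the imm or offset is handled:

An instruction that uses a constant can have the constant encoded directly; the only complication is that each format (I, S, U) encodes imm differently
An instruction that uses a symbol known at assembly time cannot be encoded during riscv_ip (the symbol's address has not been computed yet when the current insn is processed), but as can fix it up later
Some symbols cannot be fixed up by as (for example external symbols or symbols in another section) and are left to the linker
Branch instructions are an exception: if the branch target is a constant that is too far away, or is a symbol, relax branch replaces the branch with other instructions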
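
The first case, "each format encodes imm differently", can be illustrated with three small helpers. The bit layouts follow the RISC-V base ISA; the function names are hypothetical and are not the ENCODE_* macros binutils itself uses.

```c
#include <assert.h>
#include <stdint.h>

/* I-type: imm[11:0] goes to bits 31:20. */
static uint32_t encode_itype_imm(int32_t imm) {
    assert(imm >= -2048 && imm <= 2047);
    return ((uint32_t)imm & 0xfff) << 20;
}

/* S-type: imm[11:5] goes to bits 31:25, imm[4:0] goes to bits 11:7. */
static uint32_t encode_stype_imm(int32_t imm) {
    uint32_t u = (uint32_t)imm;
    assert(imm >= -2048 && imm <= 2047);
    return (((u >> 5) & 0x7f) << 25) | ((u & 0x1f) << 7);
}

/* U-type: the 20-bit value goes to bits 31:12. */
static uint32_t encode_utype_imm(uint32_t hi20) {
    assert(hi20 <= 0xfffff);
    return hi20 << 12;
}
```

The same value 8 therefore lands in different bits depending on the format: `encode_itype_imm(8)` sets bit 23, while `encode_stype_imm(8)` sets bit 10, which is where the rd field would be in an I-type instruction.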

1.2.5. relax branch

When the target of a branch is unknown at assembly time, or known but too far away, as uses relax branch to replace it with other instructions (for example, blt may become bge + jal)

    # riscv64-unknown-elf-as -march=rv64gcv  test.s -o test.o
.extern hello
    .global _start

_start:
    blt a0,a1,hello
    blt a0,a1,1f

    .rept   1024
    .word 1
    .endr
1:

The objdump output is:

0000000000000000 <_start>:
       0:       00b55463                bge     a0,a1,8 <_start+0x8>
       4:       ffdff06f                j       0 <_start>
                        4: R_RISCV_JAL  hello
       8:       00b55463                bge     a0,a1,10 <_start+0x10>
       c:       0040106f                j       1010 <.L1^B1>
                        c: R_RISCV_JAL  .L1^B1
...

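In `append_insn`, a relax_substateT is recorded for BFD_RELOC_12_PCREL (i.e. R_RISCV_BRANCH) and similar relocations. It stores information about the original instruction (for example blt) and about the imm (for example how large it is), and is attached to the frag through `add_relaxed_insn`.

Finally, `md_convert_frag_branch` performs the conversion, for example replacing `blt` with `bge+jal`.

relax branch serves a purpose similar to Linker Relaxation, but linker relaxation does not handle relax branch. One possible way to implement it in linker relaxation: have blt emit `bge+jal` by default, then let linker relaxation try to turn it back into `blt`.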
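
The `blt` to `bge+jal` rewrite can be reproduced with a short sketch. `emit_branch` and the two immediate encoders are hypothetical helpers; the sketch assumes the branch word arrives with a zero offset field, ignores the reach limit of jal itself, and uses the fact that each conditional branch and its inverse differ only in bit 12 (the lowest funct3 bit).

```c
#include <assert.h>
#include <stdint.h>

/* B-type: imm[12|10:5] in bits 31:25, imm[4:1|11] in bits 11:7. */
static uint32_t encode_btype_imm(int32_t off) {
    uint32_t u = (uint32_t)off;
    return (((u >> 12) & 1) << 31) | (((u >> 5) & 0x3f) << 25) |
           (((u >> 1) & 0xf) << 8) | (((u >> 11) & 1) << 7);
}

/* J-type: imm[20|10:1|11|19:12] in bits 31:12. */
static uint32_t encode_jtype_imm(int32_t off) {
    uint32_t u = (uint32_t)off;
    return (((u >> 20) & 1) << 31) | (((u >> 1) & 0x3ff) << 21) |
           (((u >> 11) & 1) << 20) | (((u >> 12) & 0xff) << 12);
}

/* `branch` is a conditional branch with a zero offset field and `off` is the
 * byte distance from the branch to its target. Writes one or two words to
 * `out` and returns how many were written. */
static int emit_branch(uint32_t branch, int32_t off, uint32_t out[2]) {
    assert((off & 1) == 0);
    if (off >= -4096 && off <= 4094) {
        out[0] = branch | encode_btype_imm(off); /* fits: keep the branch */
        return 1;
    }
    /* Out of reach: invert the condition so it skips the following jal. */
    out[0] = (branch ^ 0x1000u) | encode_btype_imm(8);
    out[1] = 0x6fu | encode_jtype_imm(off - 4); /* jal x0, target */
    return 2;
}
```

With the `blt a0,a1` word 0x00b54063 and a distance of 0x1008 bytes (the second branch in the listing above, from address 8 to 0x1010), this produces 0x00b55463 and 0x0040106f, the same two words objdump shows at addresses 8 and c.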

1.2.6. frag

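The encoding produced by riscv_ip (possibly partial) is held in `insn->insn_opcode`. `append_insn` writes it into a frag during `install_insn`, and `write_object_file` finally writes all frags to the object file.

A frag has no fixed size. It can hold several fixed insns and one relax insn. A fixed insn is one whose encoding is fully determined, such as `addi a0,a0,1`; a relax insn is one whose encoding is not yet determined, such as `jal symbol`, as well as pseudo-ops like `.fill` and `.space`.

as processes the relax insn in each frag over several stages to turn it into the final instructions.

Take the following code as an example: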

.extern hello
_start:
    # first frag
    addi a0,a0,0
    addi a0,a0,10
    blt a0,a1,hello

    # second frag
    addi a0,a0,0
    .fill 20, 8, 1

    # third frag
    addi a0,a0,0
    .fill 10, 4, 1

Calling `dmp_frags` in `write_object_file` dumps the frag information:

before relax
--------------------
SEGMENT .text
    FRAGMENT @ 0x6bbe9cd0
    # this frag exists because of blt: its target is a symbol, so it is not a fixed insn
    machine_dep
    # machine_dep is the fr_type; `md_convert_frag_branch` will process it later
    13 05 05 00 13 05 05 00 63 40 b5 00 	addr=0(0x0)
    #~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~
    # two addi encodings     blt encoding
    # The second addi a0,a0,10 is encoded as 13 05 05 00 rather than 13 05 a5 00
    # because its imm (10) is filled in later by `fixup_segment`, since the imm
    # of an addi may be a label
    fr_fix=8
    # fr_fix=8: the first 8 bytes of the frag's data (literal) are fixed insns
    # the difference between fix and var shows up later in `write_contents`
    fr_var=4
    # fr_var=4: the last 4 bytes of the frag's data (literal) are a relax insn (e.g. blt xxx or .fill xxx)
    fr_offset=0
    # fr_offset is unused for machine_dep but matters for rs_fill
    chars @ 0x6bbe9d48
    # 0x6bbe9d48 is the address of the frag literal, where the bytes
    # 13 05 05 00 13 05 05 00 63 40 b5 00 above are stored

    FRAGMENT @ 0x6bbe9d58
    # This frag is caused by fill 20. Its address (0x6bbe9d58) is 16 bytes
    # above the previous frag's fr_literal: although fix+var=12 there, as
    # already knows the relaxed form needs 16 bytes (see relaxed_branch_length)
    rs_fill(20)
    13 05 05 00 01 00 00 00 00 00 00 00 		 repeated 20 times, fixed length if # chars == 0)
    #~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~
   # addi       the `8,1` arguments of fill
    addr=0(0x0)
    fr_fix=4
    fr_var=8
    fr_offset=20
    # fr_offset=20: the fill repeats 20 times
    chars @ 0x6bbe9dd0

    FRAGMENT @ 0x6bbe9de0
    # this frag is caused by `.fill 10`
    rs_fill(10)
    13 05 05 00 01 00 00 00 		 repeated 10 times, fixed length if # chars == 0)
    addr=0(0x0)
    fr_fix=4
    fr_var=4
    fr_offset=10
    chars @ 0x6bbe9e58

    FRAGMENT @ 0x6bbe9ee0
    rs_fill(0)
             repeated 0 times, fixed length if # chars == 0)
    addr=0(0x0)
    fr_fix=0
    fr_var=0
    fr_offset=0
    chars @ 0x6bbe9f58

after relax
--------------------
SEGMENT .text
    FRAGMENT @ 0x6bbe9cd0
    machine_dep
13 05 05 00 13 05 05 00 63 40 b5 00 00 00 00 00 	addr=0(0x0)
# after relax_seg, the `63 40 b5 00` of blt is extended to 8 bytes, because relax branch decided
# that blt must be replaced by bge+jal
    fr_fix=8
    fr_var=8
    fr_offset=0
    chars @ 0x6bbe9d48

    FRAGMENT @ 0x6bbe9d58
    rs_fill(20)
    13 05 05 00 01 00 00 00 00 00 00 00 		 repeated 20 times, fixed length if # chars == 0)
    addr=16(0x10)
    fr_fix=4
    fr_var=8
    fr_offset=20
    chars @ 0x6bbe9dd0

    FRAGMENT @ 0x6bbe9de0
    rs_fill(10)
    13 05 05 00 01 00 00 00 		 repeated 10 times, fixed length if # chars == 0)
    addr=180(0xb4)
    fr_fix=4
    fr_var=4
    fr_offset=10
    chars @ 0x6bbe9e58

    FRAGMENT @ 0x6bbe9e60
    unknown type
    addr=224(0xe0)
    fr_fix=0
    fr_var=1
    fr_offset=2
    chars @ 0x6bbe9ed8

    FRAGMENT @ 0x6bbe9ee0
    rs_fill(0)
             repeated 0 times, fixed length if # chars == 0)
    addr=224(0xe0)
    fr_fix=0
    fr_var=0
    fr_offset=0
    chars @ 0x6bbe9f58

after size_seg
--------------------
SEGMENT .text
    FRAGMENT @ 0x6bbe9cd0
    rs_fill(0)
    13 05 05 00 13 05 05 00 63 54 b5 00 6f 00 00 00 		 repeated 0 times, fixed length if # chars == 0)
    # after size_seg, blt has been replaced by bge+jal, but the address bits are still empty because the object layout is not final yet
    addr=0(0x0)
    fr_fix=16
    fr_var=0
    fr_offset=0
    chars @ 0x6bbe9d48

    FRAGMENT @ 0x6bbe9d58
    rs_fill(20)
    13 05 05 00 01 00 00 00 00 00 00 00 		 repeated 20 times, fixed length if # chars == 0)
    addr=16(0x10)
    fr_fix=4
    fr_var=8
    fr_offset=20
    chars @ 0x6bbe9dd0

    FRAGMENT @ 0x6bbe9de0
    rs_fill(10)
    13 05 05 00 01 00 00 00 		 repeated 10 times, fixed length if # chars == 0)
    addr=180(0xb4)
    fr_fix=4
    fr_var=4
    fr_offset=10
    chars @ 0x6bbe9e58

final:
--------------------
SEGMENT .text
    FRAGMENT @ 0x6bbe9cd0
    rs_fill(0)
    13 05 05 00 13 05 a5 00 63 54 b5 00 6f f0 5f ff 		 repeated 0 times, fixed length if # chars == 0)
    # only at `write_contents` does the frag match the final output exactly
    addr=0(0x0)
    fr_fix=16
    # fr_fix went from 8 to 16 because blt has become bge+jal
    fr_var=0
    fr_offset=0
    chars @ 0x6bbe9d48

    FRAGMENT @ 0x6bbe9d58
    rs_fill(20)
    13 05 05 00 01 00 00 00 00 00 00 00 		 repeated 20 times, fixed length if # chars == 0)
    addr=16(0x10)
    fr_fix=4
    fr_var=8
    fr_offset=20
    chars @ 0x6bbe9dd0

    FRAGMENT @ 0x6bbe9de0
    rs_fill(10)
    13 05 05 00 01 00 00 00 		 repeated 10 times, fixed length if # chars == 0)
    addr=180(0xb4)
    fr_fix=4
    fr_var=4
    fr_offset=10
    chars @ 0x6bbe9e58

`write_contents` walks all frags and writes the fixed data to the object; rs_fill and similar types are handled specially:

write_contents(
    bfd *abfd ATTRIBUTE_UNUSED, asection *sec, void *xxx ATTRIBUTE_UNUSED):
  for (f = seginfo->frchainP->frch_root;
       f;
       f = f->fr_next):
      /* NOTE: the fix part of the frag */
      if (f->fr_fix):
          bfd_set_section_contents (stdoutput, sec,
                    f->fr_literal, (file_ptr) offset,
                    (bfd_size_type) f->fr_fix);
          offset += f->fr_fix;

      fill_size = f->fr_var;
      count = f->fr_offset;
      fill_literal = f->fr_literal + f->fr_fix;

      /* a buf is used to speed up the memcpy from fr_literal to bfd */
      char buf[256];
      char *bufp;
      n_per_buf = sizeof (buf) / fill_size;
      for (i = n_per_buf, bufp = buf; i; i--, bufp += fill_size):
          memcpy (bufp, fill_literal, fill_size);
      for (; count > 0; count -= n_per_buf):
          n_per_buf = n_per_buf > count ? count : n_per_buf;
          bfd_set_section_contents
            (stdoutput, sec, buf, (file_ptr) offset,
             (bfd_size_type) n_per_buf * fill_size);
          offset += n_per_buf * fill_size;
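
The fix/var split can be checked against the dump with a small model. `struct frag`, `frag_size` and `emit_frag` are hypothetical simplifications: they keep only the fields discussed above and copy one fill unit at a time instead of batching through a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified frag: just the fields discussed in the text. */
struct frag {
    const uint8_t *literal; /* fr_literal: fr_fix bytes, then fr_var bytes */
    size_t fix;             /* fr_fix: size of the fixed part */
    size_t var;             /* fr_var: size of one fill unit */
    size_t count;           /* fr_offset: repeat count for rs_fill */
};

/* Bytes this frag occupies in the section once the fill is expanded. */
static size_t frag_size(const struct frag *f) {
    return f->fix + f->var * f->count;
}

/* Copy the fixed part, then the fill unit `count` times.
 * Returns the number of bytes written. */
static size_t emit_frag(const struct frag *f, uint8_t *out, size_t cap) {
    size_t off = 0;
    assert(frag_size(f) <= cap);
    memcpy(out, f->literal, f->fix);
    off += f->fix;
    for (size_t i = 0; i < f->count; i++) {
        memcpy(out + off, f->literal + f->fix, f->var);
        off += f->var;
    }
    return off;
}
```

For the second frag in the dump (fr_fix=4, fr_var=8, fr_offset=20) this gives 4 + 8 * 20 = 164 bytes, so the next frag starts at 16 + 164 = 180; the third frag (4 + 4 * 10 = 44 bytes) ends at 224, matching the addr values printed after relax.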

1.2.7. pseudo-op

Besides the opcodes defined in riscv_opcodes, gas also defines a number of directives, for example

.ascii/.word/.byte
.fill/.rept
.if
.align
.abort
.extern
…

These pseudo-ops are handled by dedicated functions, see `potable`. For example, `s_abort` for abort aborts immediately, while `s_fill` for fill generates a frag of type rs_fill
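
The table-driven dispatch can be pictured with the following miniature. It is a hypothetical model, not the gas `potable`: the handler names, `struct state` and the two-entry `ops` table are invented for illustration, and the handlers only record that they ran.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What the hypothetical handlers record. */
struct state {
    int fills;
    int aborted;
};

typedef void (*handler_fn)(struct state *st, int arg);

static void do_fill(struct state *st, int arg)  { (void)arg; st->fills++; }
static void do_abort(struct state *st, int arg) { (void)arg; st->aborted = 1; }

/* Directive name (without the leading dot), handler, and handler argument. */
static const struct pseudo_op {
    const char *name;
    handler_fn handler;
    int arg;
} ops[] = {
    {"fill",  do_fill,  0},
    {"abort", do_abort, 0},
};

/* Dispatch ".name"; returns 1 if a handler ran, 0 for an unknown directive. */
static int dispatch(struct state *st, const char *directive) {
    assert(st != NULL && directive != NULL);
    if (directive[0] != '.')
        return 0;
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        if (strcmp(directive + 1, ops[i].name) == 0) {
            ops[i].handler(st, ops[i].arg);
            return 1;
        }
    }
    return 0;
}
```

Adding a directive in this model means adding one row to `ops`, which is the same property that makes the opcode table convenient for instructions.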

1.3. objdump

objdump does its dumping through the disassembler interface of opcodes, see print_insn_riscv in opcodes

1.4. BFD

BFD Tutorial

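The bfd sources contain files named elfnn-<arch>.c and elfxx-<arch>.c. nn means elfnn-<arch>.c is a template: at build time, 32-bit and 64-bit code is generated from it so the code can be shared. xx marks the parts that do not depend on 32/64.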

For example:

Makefile.in:

elf32-riscv.c : elfnn-riscv.c
    $(AM_V_at)echo "#line 1 \"elfnn-riscv.c\"" > $@
    $(AM_V_GEN)$(SED) -e s/NN/32/g < $< >> $@

elf64-riscv.c : elfnn-riscv.c
    $(AM_V_at)echo "#line 1 \"elfnn-riscv.c\"" > $@
    $(AM_V_GEN)$(SED) -e s/NN/64/g < $< >> $@

elfnn-riscv.c:

/* ... */
#define ARCH_SIZE NN
/* ... */

1.5. usage

1.5.1. objcopy

1.5.1.1. Generating a bin

// 2023-03-29 17:59
#include <stdio.h>

int foo(){};
int main(int argc, char *argv[]) {}

$> gcc tset.c -O0 -g3
$> objcopy -O binary a.out a.bin
$> du -b
15608   a.bin

objcopy copies only particular ELF sections: `SEC_HAS_CONTENTS | SEC_READONLY | SEC_DATA`, and does not copy debug-related sections

$> objdump -h ./a.out

./a.out:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .interp       0000001c  0000000000000318  0000000000000318  00000318  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.gnu.property 00000020  0000000000000338  0000000000000338  00000338  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .note.gnu.build-id 00000024  0000000000000358  0000000000000358  00000358  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .note.ABI-tag 00000020  000000000000037c  000000000000037c  0000037c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .gnu.hash     00000024  00000000000003a0  00000000000003a0  000003a0  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  ...
 19 .fini_array   00000008  0000000000003df8  0000000000003df8  00002df8  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 20 .dynamic      000001c0  0000000000003e00  0000000000003e00  00002e00  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 21 .got          00000040  0000000000003fc0  0000000000003fc0  00002fc0  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 22 .data         00000010  0000000000004000  0000000000004000  00003000  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 23 .bss          00000008  0000000000004010  0000000000004010  00003010  2**0
                  ALLOC
 24 .comment      0000002b  0000000000000000  0000000000000000  00003010  2**0
                  CONTENTS, READONLY
 25 .debug_aranges 00000030  0000000000000000  0000000000000000  0000303b  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 26 .debug_info   0000034c  0000000000000000  0000000000000000  0000306b  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 ...

Sections 0~22 are copied; bss and the debug-related sections are not. The total size is 0x4000+0x10-0x318=15608
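
The size arithmetic above can be written out as a small calculation over the section list. `struct section` and `binary_image_size` are hypothetical; the `load` flag is an assumption that stands for "has CONTENTS and LOAD in the objdump -h listing", and only a few of the listed sections are reproduced.

```c
#include <stddef.h>
#include <stdint.h>

struct section {
    const char *name;
    uint64_t lma;
    uint64_t size;
    int load; /* assumed: 1 if the listing shows both CONTENTS and LOAD */
};

/* Size of a flat image spanning from the lowest LMA to the highest end
 * address among the loaded sections; gaps in between are included. */
static uint64_t binary_image_size(const struct section *s, size_t n) {
    uint64_t lo = UINT64_MAX, hi = 0;
    for (size_t i = 0; i < n; i++) {
        if (!s[i].load || s[i].size == 0)
            continue;
        if (s[i].lma < lo)
            lo = s[i].lma;
        if (s[i].lma + s[i].size > hi)
            hi = s[i].lma + s[i].size;
    }
    return hi > lo ? hi - lo : 0;
}
```

With `.interp` at 0x318 as the lowest loaded section and `.data` ending at 0x4010 as the highest, the result is 0x4010 - 0x318 = 15608, the size reported for a.bin; `.bss` and `.comment` do not move either bound because they are not marked as loaded.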

Note that this size is not the same as the combined size of the PT_LOAD segments:

$> readelf -a a.out
...
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000002d8 0x00000000000002d8  R      0x8
  INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000005c8 0x00000000000005c8  R      0x1000
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x00000000000001d5 0x00000000000001d5  R E    0x1000
  LOAD           0x0000000000002000 0x0000000000002000 0x0000000000002000
                 0x0000000000000158 0x0000000000000158  R      0x1000
  LOAD           0x0000000000002df0 0x0000000000003df0 0x0000000000003df0
                 0x0000000000000220 0x0000000000000228  RW     0x1000
  DYNAMIC        0x0000000000002e00 0x0000000000003e00 0x0000000000003e00
                 0x00000000000001c0 0x00000000000001c0  RW     0x8
  NOTE           0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000020 0x0000000000000020  R      0x8
  NOTE           0x0000000000000358 0x0000000000000358 0x0000000000000358
                 0x0000000000000044 0x0000000000000044  R      0x4
  GNU_PROPERTY   0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000020 0x0000000000000020  R      0x8
  GNU_EH_FRAME   0x0000000000002004 0x0000000000002004 0x0000000000002004
                 0x0000000000000044 0x0000000000000044  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x0000000000002df0 0x0000000000003df0 0x0000000000003df0
                 0x0000000000000210 0x0000000000000210  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
   03     .init .plt .plt.got .text .fini
   04     .rodata .eh_frame_hdr .eh_frame
   05     .init_array .fini_array .dynamic .got .data .bss
   06     .dynamic
   07     .note.gnu.property
   08     .note.gnu.build-id .note.ABI-tag
   09     .note.gnu.property
   10     .eh_frame_hdr
   11
   12     .init_array .fini_array .dynamic .got
...

The first PT_LOAD starts at 0 (not at 0x318, .interp) because it contains the information at the start of the ELF file, such as the `ELF` magic number and the ELF program headers, see ELF

Also, the fourth PT_LOAD contains the bss section, which is likewise absent from the bin produced by objcopy.

Backlinks

Retargeting GCC To RISC-V (Retargeting GCC To RISC-V > binutils): binutils
