LTO

1. LTO

LTO stands for link-time optimization.

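GCC's ordinary optimization options such as `-O2` optimize one object file at a time. The compiler cannot see information from several object files at once, so optimizations that span object files (for example, inlining a function defined in another file) cannot be performed.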

1.1. Example

1.1.1. Optimization within a single object file

int inc(int a) { return a + 1; }

int main(int argc, char* argv[]) {
    int r = 0;
    for (int i = 0; i < 100; i++) {
        r = inc(r);
    }
    return r;
}
gdb -batch -ex  'disass main' /tmp/a.out

Dump of assembler code for function main:
   0x0000000000001040 <+0>:	endbr64
   0x0000000000001044 <+4>:	mov    $0x64,%eax
   0x0000000000001049 <+9>:	ret
End of assembler dump.
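
The single `mov $0x64,%eax` shows that the compiler inlined `inc`, evaluated the whole loop at compile time, and returns the constant 0x64 (100). A minimal sketch that confirms the value; `run_loop` is a hypothetical helper that holds the loop from `main`, not a name from the original code:

```c
/* Same function as in the example above. */
int inc(int a) { return a + 1; }

/* Hypothetical helper: the loop body of main(), isolated so its
 * result can be compared with the constant in the disassembly. */
int run_loop(void) {
    int r = 0;
    for (int i = 0; i < 100; i++) {
        r = inc(r);
    }
    return r; /* 100 increments starting from 0 */
}
```

Calling `run_loop()` yields 100, which is 0x64 in hexadecimal.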

1.1.2. No cross-file optimization with multiple object files

/* one source file, compiled into /tmp/b.o */
int inc(int a) { return a + 1; }

/* the other source file, compiled into /tmp/a.o */
extern int inc(int);

int main(int argc, char* argv[]) {
    int r = 0;
    for (int i = 0; i < 100; i++) {
        r = inc(r);
    }
    return r;
}
gcc /tmp/a.o /tmp/b.o -o /tmp/a.out
gdb --batch --ex "disass main" /tmp/a.out
gdb --batch --ex "disass inc" /tmp/a.out
Dump of assembler code for function main:
   0x0000000000001040 <+0>:	endbr64
   0x0000000000001044 <+4>:	push   %rbx
   0x0000000000001045 <+5>:	xor    %edi,%edi
   0x0000000000001047 <+7>:	mov    $0x64,%ebx
   0x000000000000104c <+12>:	nopl   0x0(%rax)
   0x0000000000001050 <+16>:	call   0x1150 <inc>
   0x0000000000001055 <+21>:	mov    %eax,%edi
   0x0000000000001057 <+23>:	sub    $0x1,%ebx
   0x000000000000105a <+26>:	jne    0x1050 <main+16>
   0x000000000000105c <+28>:	pop    %rbx
   0x000000000000105d <+29>:	ret
End of assembler dump.

Dump of assembler code for function inc:
   0x0000000000001150 <+0>:	endbr64
   0x0000000000001154 <+4>:	lea    0x1(%rdi),%eax
   0x0000000000001157 <+7>:	ret
End of assembler dump.

1.1.3. LTO optimizes across multiple object files

/* one source file, compiled into /tmp/b.o */
int inc(int a) { return a + 1; }

/* the other source file, compiled into /tmp/a.o */
extern int inc(int);

int main(int argc, char* argv[]) {
    int r = 0;
    for (int i = 0; i < 100; i++) {
        r = inc(r);
    }
    return r;
}
gcc /tmp/a.o /tmp/b.o -o /tmp/a.out
gdb --batch --ex "disass main" /tmp/a.out
Dump of assembler code for function main:
   0x0000000000001040 <+0>:	endbr64
   0x0000000000001044 <+4>:	mov    $0x64,%eax
   0x0000000000001049 <+9>:	ret
End of assembler dump.
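
The compile commands for this case are not shown. A plausible setup, assuming GCC's `-flto` flag was passed when compiling each file, is sketched in the comment below. The file names `a.c` and `b.c` are placeholders, and the exact flags are an assumption rather than the commands actually used:

```c
/* Hypothetical stand-in for the source file behind /tmp/b.o.
 *
 * Assumed build commands (a.c and b.c are placeholder names):
 *
 *   gcc -O2 -flto -c b.c -o /tmp/b.o
 *   gcc -O2 -flto -c a.c -o /tmp/a.o
 *   gcc -O2 -flto /tmp/a.o /tmp/b.o -o /tmp/a.out
 *
 * With -flto the object files carry the compiler's intermediate
 * representation, so the link step can inline inc() into main().
 */
int inc(int a) { return a + 1; }
```

GCC's documentation recommends passing `-flto` and the optimization level at the final link as well as at compile time.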

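1.1.4. Objects compiled with LTO are not ordinary ELF object files

The file is still a valid ELF relocatable object, but `.text` is empty: the code is stored as compiler intermediate representation in the `.gnu.lto_*` sections, and the `__gnu_lto_slim` symbol marks it as a slim LTO object with no machine code.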

objdump -d /tmp/a.o
readelf -a /tmp/a.o

/tmp/a.o:     file format elf64-x86-64

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          2656 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         25
  Section header string table index: 24

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       0000000000000000  0000000000000000  AX       0     0     1
  [ 2] .data             PROGBITS         0000000000000000  00000040
       0000000000000000  0000000000000000  WA       0     0     1
  [ 3] .bss              NOBITS           0000000000000000  00000040
       0000000000000000  0000000000000000  WA       0     0     1
  [ 4] .gnu.lto_.pr[...] PROGBITS         0000000000000000  00000040
       000000000000000f  0000000000000000   E       0     0     1
  [ 5] .gnu.lto_.ic[...] PROGBITS         0000000000000000  0000004f
       000000000000001a  0000000000000000   E       0     0     1
  [ 6] .gnu.lto_.ip[...] PROGBITS         0000000000000000  00000069
       0000000000000011  0000000000000000   E       0     0     1
  [ 7] .gnu.lto_.in[...] PROGBITS         0000000000000000  0000007a
       000000000000003f  0000000000000000   E       0     0     1
  [ 8] .gnu.lto_.jm[...] PROGBITS         0000000000000000  000000b9
       000000000000002b  0000000000000000   E       0     0     1
  [ 9] .gnu.lto_.pu[...] PROGBITS         0000000000000000  000000e4
       0000000000000011  0000000000000000   E       0     0     1
  [10] .gnu.lto_.ip[...] PROGBITS         0000000000000000  000000f5
       0000000000000022  0000000000000000   E       0     0     1
  [11] .gnu.lto_.lt[...] PROGBITS         0000000000000000  00000117
       0000000000000008  0000000000000000   E       0     0     1
  [12] .gnu.lto_mai[...] PROGBITS         0000000000000000  0000011f
       00000000000001e5  0000000000000000   E       0     0     1
  [13] .gnu.lto_.sy[...] PROGBITS         0000000000000000  00000304
       000000000000004b  0000000000000000   E       0     0     1
  [14] .gnu.lto_.re[...] PROGBITS         0000000000000000  0000034f
       000000000000000e  0000000000000000   E       0     0     1
  [15] .gnu.lto_.de[...] PROGBITS         0000000000000000  0000035d
       000000000000030d  0000000000000000   E       0     0     1
  [16] .gnu.lto_.sy[...] PROGBITS         0000000000000000  0000066a
       0000000000000027  0000000000000000   E       0     0     1
  [17] .gnu.lto_.ex[...] PROGBITS         0000000000000000  00000691
       0000000000000005  0000000000000000   E       0     0     1
  [18] .gnu.lto_.opts    PROGBITS         0000000000000000  00000696
       00000000000000c0  0000000000000000   E       0     0     1
  [19] .comment          PROGBITS         0000000000000000  00000756
       000000000000002c  0000000000000001  MS       0     0     1
  [20] .note.GNU-stack   PROGBITS         0000000000000000  00000782
       0000000000000000  0000000000000000           0     0     1
  [21] .note.gnu.pr[...] NOTE             0000000000000000  00000788
       0000000000000020  0000000000000000   A       0     0     8
  [22] .symtab           SYMTAB           0000000000000000  000007a8
       0000000000000048  0000000000000018          23     2     8
  [23] .strtab           STRTAB           0000000000000000  000007f0
       000000000000001f  0000000000000000           0     0     1
  [24] .shstrtab         STRTAB           0000000000000000  0000080f
       000000000000024e  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), l (large), p (processor specific)

There are no section groups in this file.

There are no program headers in this file.

There is no dynamic section in this file.

There are no relocations in this file.
No processor specific unwind information to decode

Symbol table '.symtab' contains 3 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS C-src-hsEo7K.c
     2: 0000000000000001     1 OBJECT  GLOBAL DEFAULT  COM __gnu_lto_slim

No version information found in this file.

Displaying notes found in: .note.gnu.property
  Owner                Data size 	Description
  GNU                  0x00000010	NT_GNU_PROPERTY_TYPE_0
      Properties: x86 feature: IBT, SHSTK
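
One way to tell such an object apart is to look for section names that start with `.gnu.lto_`, as in the dump above. A minimal sketch of the prefix test only; `is_lto_section` is a hypothetical helper, and reading the section names out of the ELF file is not shown:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if a section name carries the
 * .gnu.lto_ prefix seen in the readelf output. */
bool is_lto_section(const char *name) {
    static const char prefix[] = ".gnu.lto_";
    /* sizeof includes the trailing NUL, so subtract 1 for the length. */
    return strncmp(name, prefix, sizeof prefix - 1) == 0;
}
```

For the names listed above, `.gnu.lto_.opts` matches while `.text` and `.comment` do not.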

Author: [email protected]
Date: 2021-09-16 Thu 00:00
Last updated: 2024-08-17 Sat 14:06

Creative Commons License