LTO
1. LTO
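LTO stands for link-time optimization.
GCC's ordinary optimization options, such as `-O2`, optimize one object file at a time. The compiler cannot see information from several object files at once, so optimizations that span object files (for example, inlining a function defined in another file) cannot be performed.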
1.1. Example
1.1.1. A single object file is optimized
int inc(int a) { return a + 1; }

int main(int argc, char* argv[]) {
    int r = 0;
    for (int i = 0; i < 100; i++) {
        r = inc(r);
    }
    return r;
}
gdb -batch -ex 'disass main' /tmp/a.out
Warning: 'set logging on', an alias for the command 'set logging enabled', is deprecated.
Use 'set logging enabled on'.

Dump of assembler code for function main:
   0x0000000000001040 <+0>:     endbr64
   0x0000000000001044 <+4>:     mov    $0x64,%eax
   0x0000000000001049 <+9>:     ret
End of assembler dump.
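1.1.2. No optimization across multiple object files
/* first source file: defines inc */
int inc(int a) { return a + 1; }

/* second source file: only sees the declaration of inc */
extern int inc(int);

int main(int argc, char* argv[]) {
    int r = 0;
    for (int i = 0; i < 100; i++) {
        r = inc(r);
    }
    return r;
}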
gcc /tmp/a.o /tmp/b.o -o /tmp/a.out
gdb --batch --ex "disass main" /tmp/a.out
gdb --batch --ex "disass inc" /tmp/a.out

Dump of assembler code for function main:
   0x0000000000001040 <+0>:     endbr64
   0x0000000000001044 <+4>:     push   %rbx
   0x0000000000001045 <+5>:     xor    %edi,%edi
   0x0000000000001047 <+7>:     mov    $0x64,%ebx
   0x000000000000104c <+12>:    nopl   0x0(%rax)
   0x0000000000001050 <+16>:    call   0x1150 <inc>
   0x0000000000001055 <+21>:    mov    %eax,%edi
   0x0000000000001057 <+23>:    sub    $0x1,%ebx
   0x000000000000105a <+26>:    jne    0x1050 <main+16>
   0x000000000000105c <+28>:    pop    %rbx
   0x000000000000105d <+29>:    ret
End of assembler dump.

Dump of assembler code for function inc:
   0x0000000000001150 <+0>:     endbr64
   0x0000000000001154 <+4>:     lea    0x1(%rdi),%eax
   0x0000000000001157 <+7>:     ret
End of assembler dump.
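1.1.3. LTO optimizes across multiple object files
/* first source file: defines inc */
int inc(int a) { return a + 1; }

/* second source file: only sees the declaration of inc */
extern int inc(int);

int main(int argc, char* argv[]) {
    int r = 0;
    for (int i = 0; i < 100; i++) {
        r = inc(r);
    }
    return r;
}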
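The difference from the previous case is that both objects are compiled with `-flto` (that compile step is not shown here); the link command itself is the same:
gcc /tmp/a.o /tmp/b.o -o /tmp/a.out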
gdb --batch --ex "disass main" /tmp/a.out
Dump of assembler code for function main:
   0x0000000000001040 <+0>:     endbr64
   0x0000000000001044 <+4>:     mov    $0x64,%eax
   0x0000000000001049 <+9>:     ret
End of assembler dump.
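1.1.4. Objects compiled with LTO are not ordinary ELF object files
The file is still a valid ELF relocatable, but in the dump below `.text` has size 0, so `objdump -d` has nothing to disassemble. The content lives in the `.gnu.lto_*` sections, and the `__gnu_lto_slim` symbol marks the object as LTO-only.
objdump -d /tmp/a.o
readelf -a /tmp/a.o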
/tmp/a.o:     file format elf64-x86-64

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          2656 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         25
  Section header string table index: 24

Section Headers:
  [Nr] Name               Type      Address           Offset    Size              EntSize           Flags Link Info Align
  [ 0]                    NULL      0000000000000000  00000000  0000000000000000  0000000000000000           0    0     0
  [ 1] .text              PROGBITS  0000000000000000  00000040  0000000000000000  0000000000000000  AX       0    0     1
  [ 2] .data              PROGBITS  0000000000000000  00000040  0000000000000000  0000000000000000  WA       0    0     1
  [ 3] .bss               NOBITS    0000000000000000  00000040  0000000000000000  0000000000000000  WA       0    0     1
  [ 4] .gnu.lto_.pr[...]  PROGBITS  0000000000000000  00000040  000000000000000f  0000000000000000  E        0    0     1
  [ 5] .gnu.lto_.ic[...]  PROGBITS  0000000000000000  0000004f  000000000000001a  0000000000000000  E        0    0     1
  [ 6] .gnu.lto_.ip[...]  PROGBITS  0000000000000000  00000069  0000000000000011  0000000000000000  E        0    0     1
  [ 7] .gnu.lto_.in[...]  PROGBITS  0000000000000000  0000007a  000000000000003f  0000000000000000  E        0    0     1
  [ 8] .gnu.lto_.jm[...]  PROGBITS  0000000000000000  000000b9  000000000000002b  0000000000000000  E        0    0     1
  [ 9] .gnu.lto_.pu[...]  PROGBITS  0000000000000000  000000e4  0000000000000011  0000000000000000  E        0    0     1
  [10] .gnu.lto_.ip[...]  PROGBITS  0000000000000000  000000f5  0000000000000022  0000000000000000  E        0    0     1
  [11] .gnu.lto_.lt[...]  PROGBITS  0000000000000000  00000117  0000000000000008  0000000000000000  E        0    0     1
  [12] .gnu.lto_mai[...]  PROGBITS  0000000000000000  0000011f  00000000000001e5  0000000000000000  E        0    0     1
  [13] .gnu.lto_.sy[...]  PROGBITS  0000000000000000  00000304  000000000000004b  0000000000000000  E        0    0     1
  [14] .gnu.lto_.re[...]  PROGBITS  0000000000000000  0000034f  000000000000000e  0000000000000000  E        0    0     1
  [15] .gnu.lto_.de[...]  PROGBITS  0000000000000000  0000035d  000000000000030d  0000000000000000  E        0    0     1
  [16] .gnu.lto_.sy[...]  PROGBITS  0000000000000000  0000066a  0000000000000027  0000000000000000  E        0    0     1
  [17] .gnu.lto_.ex[...]  PROGBITS  0000000000000000  00000691  0000000000000005  0000000000000000  E        0    0     1
  [18] .gnu.lto_.opts     PROGBITS  0000000000000000  00000696  00000000000000c0  0000000000000000  E        0    0     1
  [19] .comment           PROGBITS  0000000000000000  00000756  000000000000002c  0000000000000001  MS       0    0     1
  [20] .note.GNU-stack    PROGBITS  0000000000000000  00000782  0000000000000000  0000000000000000           0    0     1
  [21] .note.gnu.pr[...]  NOTE      0000000000000000  00000788  0000000000000020  0000000000000000  A        0    0     8
  [22] .symtab            SYMTAB    0000000000000000  000007a8  0000000000000048  0000000000000018          23    2     8
  [23] .strtab            STRTAB    0000000000000000  000007f0  000000000000001f  0000000000000000           0    0     1
  [24] .shstrtab          STRTAB    0000000000000000  0000080f  000000000000024e  0000000000000000           0    0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), l (large), p (processor specific)

There are no section groups in this file.

There are no program headers in this file.

There is no dynamic section in this file.

There are no relocations in this file.
No processor specific unwind information to decode

Symbol table '.symtab' contains 3 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS C-src-hsEo7K.c
     2: 0000000000000001     1 OBJECT  GLOBAL DEFAULT  COM __gnu_lto_slim

No version information found in this file.

Displaying notes found in: .note.gnu.property
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
      Properties: x86 feature: IBT, SHSTK