----------------------------------------------
Real-World Compile, Assemble, and Link Example
----------------------------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

A real-world (Linux) example of a compile-assemble-link sequence.

Index:
1. two small C-language source files
2. main.c turned into Intel assembly language
3. main.c turned into PowerPC assembly language
4. foo.c turned into Intel assembly language
5. foo.c turned into PowerPC assembly language
6. main.c and foo.c turned into Intel object files
7. main.o and foo.o linked into an executable program

1. Two small C language source files:

==> main.c source file <================================================
static bar(){
        return 2;
}
static int i;
main(){
        i = 10;
        foo();
        bar();
}
=======================================================================

==> foo.c source file <================================================
foo(){
        return 1;
}
=======================================================================

2. Here is the main.c source file turned into Intel assembly language
by the GCC compiler:

        .file   "main.c"
        .text
        .type   bar, @function
bar:
        pushl   %ebp
        movl    %esp, %ebp
        movl    $2, %eax
        popl    %ebp
        ret
        .size   bar, .-bar
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $4, %esp
        movl    $10, i
        call    foo
        call    bar
        addl    $4, %esp
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .local  i
        .comm   i,4,4
        .ident  "GCC: (Ubuntu 4.3.2-1ubuntu12) 4.3.2"
        .section        .note.GNU-stack,"",@progbits

Note, above, the flagging of the one public "global" symbol "main".
=======================================================================

3.
Here is the main.c source file turned into PowerPC assembly language
by the GCC compiler:

        .file   "main.c"
        .csect .text[PR]
        .toc
        .csect .text[PR]
        .align 2
        .lglobl .bar
        .csect bar[DS]
bar:
        .long .bar, TOC[tc0], 0
        .csect .text[PR]
.bar:
        stw 31,-4(1)
        stwu 1,-32(1)
        mr 31,1
        li 0,2
        mr 3,0
        lwz 1,0(1)
        lwz 31,-4(1)
        blr
LT..bar:
        .long 0
        .byte 0,0,32,96,128,1,0,1
        .long LT..bar-.bar
        .short 3
        .byte "bar"
        .byte 31
        .align 2
        .toc
LC..0:
        .tc i[TC],i
        .csect .text[PR]
        .align 2
        .globl main
        .globl .main
        .csect main[DS]
main:
        .long .main, TOC[tc0], 0
        .csect .text[PR]
.main:
        mflr 0
        stw 31,-4(1)
        stw 0,8(1)
        stwu 1,-64(1)
        mr 31,1
        lwz 9,LC..0(2)
        li 0,10
        stw 0,0(9)
        bl .foo
        nop
        bl .bar
        lwz 1,0(1)
        lwz 0,8(1)
        mtlr 0
        lwz 31,-4(1)
        blr
LT..main:
        .long 0
        .byte 0,0,32,97,128,1,0,1
        .long LT..main-.main
        .short 4
        .byte "main"
        .byte 31
        .align 2
        .lcomm i,4,_main.bss_
_section_.text:
        .csect .data[RW],3
        .long _section_.text

Note, above, the flagging of the one public "global" symbol "main".
=======================================================================

4. Here is the foo.c source file turned into Intel assembly language
by the GCC compiler:

        .file   "foo.c"
        .text
.globl foo
        .type   foo, @function
foo:
        pushl   %ebp
        movl    %esp, %ebp
        movl    $1, %eax
        popl    %ebp
        ret
        .size   foo, .-foo
        .ident  "GCC: (Ubuntu 4.3.2-1ubuntu12) 4.3.2"
        .section        .note.GNU-stack,"",@progbits

Note, above, the flagging of the one public "global" symbol "foo".
=======================================================================

5.
Here is the foo.c source file turned into PowerPC assembly language
by the GCC compiler:

        .file   "foo.c"
        .csect .text[PR]
        .toc
        .csect .text[PR]
        .align 2
        .globl foo
        .globl .foo
        .csect foo[DS]
foo:
        .long .foo, TOC[tc0], 0
        .csect .text[PR]
.foo:
        stw 31,-4(1)
        stwu 1,-32(1)
        mr 31,1
        li 0,1
        mr 3,0
        lwz 1,0(1)
        lwz 31,-4(1)
        blr
LT..foo:
        .long 0
        .byte 0,0,32,96,128,1,0,1
        .long LT..foo-.foo
        .short 3
        .byte "foo"
        .byte 31
        .align 2
_section_.text:
        .csect .data[RW],3
        .long _section_.text

Note, above, the flagging of the one public "global" symbol "foo".
=======================================================================

6. Here are the two source files turned into Intel object files with
tables by the GNU assembler:

GAS LISTING main.s

   1                            .file   "main.c"
   2                            .text
   3                            .type   bar, @function
   4                    bar:
   5 0000 55                    pushl   %ebp
   6 0001 89E5                  movl    %esp, %ebp
   7 0003 B8020000              movl    $2, %eax
   7      00
   8 0008 5D                    popl    %ebp
   9 0009 C3                    ret
  10                            .size   bar, .-bar
  11                    .globl main
  12                            .type   main, @function
  13                    main:
  14 000a 8D4C2404              leal    4(%esp), %ecx
  15 000e 83E4F0                andl    $-16, %esp
  16 0011 FF71FC                pushl   -4(%ecx)
  17 0014 55                    pushl   %ebp
  18 0015 89E5                  movl    %esp, %ebp
  19 0017 51                    pushl   %ecx
  20 0018 83EC04                subl    $4, %esp
  21 001b C7050000              movl    $10, i
  21      00000A00
  21      0000
  22 0025 E8FCFFFF              call    foo
  22      FF
  23 002a E8D1FFFF              call    bar
  23      FF
  24 002f 83C404                addl    $4, %esp
  25 0032 59                    popl    %ecx
  26 0033 5D                    popl    %ebp
  27 0034 8D61FC                leal    -4(%ecx), %esp
  28 0037 C3                    ret
  29                            .size   main, .-main
  30                            .local  i
  31                            .comm   i,4,4
  32                            .ident  "GCC: (Ubuntu 4.3.2-1ubuntu12) 4.3.2"
  33                            .section        .note.GNU-stack,"",@progbits

DEFINED SYMBOLS
                            *ABS*:0000000000000000 main.c
              main.s:4      .text:0000000000000000 bar
              main.s:13     .text:000000000000000a main
              main.s:31     .bss:0000000000000000 i

UNDEFINED SYMBOLS
foo

Tables:

main.o:     file format elf32-i386
architecture: i386, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x00000000

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 main.c
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l     F .text  0000000a bar
00000000 l     O .bss   00000004 i
00000000 l    d  .note.GNU-stack        00000000 .note.GNU-stack
00000000 l    d  .comment       00000000 .comment
0000000a g     F .text  0000002e main
00000000         *UND*  00000000 foo

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
0000001d R_386_32          .bss
00000026 R_386_PC32        foo

Note, above, the global ("g") symbol "main" and the undefined external
reference to "foo".

GAS LISTING foo.s

   1                            .file   "foo.c"
   2                            .text
   3                    .globl foo
   4                            .type   foo, @function
   5                    foo:
   6 0000 55                    pushl   %ebp
   7 0001 89E5                  movl    %esp, %ebp
   8 0003 B8010000              movl    $1, %eax
   8      00
   9 0008 5D                    popl    %ebp
  10 0009 C3                    ret
  11                            .size   foo, .-foo
  12                            .ident  "GCC: (Ubuntu 4.3.2-1ubuntu12) 4.3.2"
  13                            .section        .note.GNU-stack,"",@progbits

DEFINED SYMBOLS
                            *ABS*:0000000000000000 foo.c
               foo.s:5      .text:0000000000000000 foo

NO UNDEFINED SYMBOLS

Tables:

foo.o:     file format elf32-i386
architecture: i386, flags 0x00000010:
HAS_SYMS
start address 0x00000000

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 foo.c
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l    d  .note.GNU-stack        00000000 .note.GNU-stack
00000000 l    d  .comment       00000000 .comment
00000000 g     F .text  0000000a foo

Note, above, the global ("g") symbol "foo".
=======================================================================

7.
Link the two object files together, with all the libraries, and you get
this memory layout:

08049f20 d _DYNAMIC
08049ff4 d _GLOBAL_OFFSET_TABLE_
0804849c R _IO_stdin_used
         w _Jv_RegisterClasses
08049f10 d __CTOR_END__
08049f0c d __CTOR_LIST__
08049f18 D __DTOR_END__
08049f14 d __DTOR_LIST__
080484a0 r __FRAME_END__
08049f1c d __JCR_END__
08049f1c d __JCR_LIST__
0804a010 A __bss_start
0804a008 D __data_start
08048450 t __do_global_ctors_aux
08048310 t __do_global_dtors_aux
0804a00c D __dso_handle
         w __gmon_start__
0804844a T __i686.get_pc_thunk.bx
08049f0c d __init_array_end
08049f0c d __init_array_start
080483e0 T __libc_csu_fini
080483f0 T __libc_csu_init
         U __libc_start_main@@GLIBC_2.0
0804a010 A _edata
0804a01c A _end
0804847c T _fini
08048498 R _fp_hw
08048274 T _init
080482e0 T _start
080483a0 t bar
0804a010 b completed.6625
0804a008 W data_start
0804a014 b dtor_idx.6627
08048394 T foo
08048370 t frame_dummy
0804a018 b i
080483aa T main
=======================================================================

Note, above, the memory locations for "bar", "foo", and "main", mixed in
with all the compiler and library routines.  Note how "foo" and "main"
are global text (code) symbols (upper-case-T) and "i" and "bar" are
local symbols (lower-case-b [BSS=data] and lower-case-t [text]).

-- 
| Ian! D. Allen  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/