Loaders and Linkers CS 230

2 Overview
assembler
  – generates object code in a predefined format
    » COFF (Common Object File Format)
    » ELF (Executable and Linking Format)
  – assigns addresses to instructions and data
    » assuming the program starts at location 0
loader
  – loads a program into memory (or an address space, AS) for execution
  – also performs relocation and linking
relocation
  – modifies the absolute addresses in the object program
    » the assembler provides the necessary information
linking
  – combines several object programs
  – these programs are developed independently
  – a program may use a symbol defined in another program

3 Bootstrap Loader
bootstrapping
  – actions taken when a computer is first powered on
  – the hardware logic reads a program from address 0 of ROM (Read Only Memory)
    » ROM is installed by the manufacturer
    » ROM contains the bootstrapping program and some other routines that control hardware (e.g. BIOS)
the bootstrap program is a loader
  – loads the OS from disk into memory and makes it run
  – the OS image on disk (or floppy) usually starts at the first sector
  – the starting address in memory is usually fixed at 0
    » no need for relocation
  – this kind of loader is simple
    » no relocation
    » no linking
    » called an absolute loader
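
The absolute loader described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 16-byte memory, the three-byte image, and the function name absolute_load are hypothetical, and real bootstrap code runs on bare hardware rather than in Python.

```python
def absolute_load(memory, image, start=0):
    """Copy the image verbatim to a fixed address: no relocation, no linking."""
    if start + len(image) > len(memory):
        raise ValueError("image does not fit in memory")
    memory[start:start + len(image)] = image
    return start  # address at which execution would begin


memory = bytearray(16)              # hypothetical 16-byte machine memory
image = bytes([0x01, 0x02, 0x03])   # hypothetical OS image read from the first disk sector
entry = absolute_load(memory, image)
```

Because the load address is fixed, the image is used exactly as it was assembled; nothing inside it has to be patched.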

4 Relocation
assembler review
  – the assembler generates object code assuming that the program starts at memory address 0
  – the loader decides the starting address of a program
  – the assembler generates modification records
limits of modification records
  – record format
    » (address, length)
  – the list of records can be huge when direct addressing is used frequently
  – if the instruction format is fixed for absolute addressing, the length part can be removed
  – instead of an address field, a bit-vector can be used
    » a set bit means the corresponding instruction needs to be modified
hardware support for relocation
  – base register
    » the assembler and loader do not need to worry about relocation
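
A minimal sketch of how a loader might apply (address, length) modification records, and how a bit-vector can stand in for them when every word has the same fixed format. The byte layout (big-endian fields), the sample object code, and the names relocate and records_from_bitvector are assumptions made for illustration, not part of any particular object format.

```python
def relocate(code, records, load_address):
    """Add load_address to every address field named by a modification record."""
    out = bytearray(code)
    for offset, length in records:
        value = int.from_bytes(out[offset:offset + length], "big")
        value = (value + load_address) % (1 << (8 * length))  # wrap to the field width
        out[offset:offset + length] = value.to_bytes(length, "big")
    return bytes(out)


def records_from_bitvector(bits, word_size):
    """Fixed-format variant: bit i set means word i holds an address to relocate."""
    return [(i * word_size, word_size) for i, bit in enumerate(bits) if bit]


# Hypothetical object code: three 2-byte words assembled as if loaded at address 0.
code = bytes([0x00, 0x10, 0x00, 0x07, 0x00, 0x20])
records = [(0, 2), (4, 2)]               # words 0 and 2 hold absolute addresses
loaded = relocate(code, records, 0x1000)
```

With the bit-vector form, the same records would be written as [1, 0, 1], which costs one bit per word instead of an (address, length) pair per relocated field.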

5 Linking
background
  – a large problem is better broken into several small pieces
  – each piece is better implemented independently
    » assembled (compiled) independently
  – many data structures are shared among those pieces
    » variables and procedures
  – some programs are used by many different programs
    » print(), file operations, exp(), sin(), ...
    » these are usually provided as library functions

6 Linking (contd)
requirements for linking
  – each module declares
    » which of its symbols are used by other modules
    » symbols undefined in a module are assumed to be defined in other modules
      declaring these symbols explicitly helps the linker resolve them
principles
  – the assembler evaluates as much as possible
    » expressions
  – if some cannot be resolved,
    » it provides modification records

7 Linking Example
modification records from module A
  – for statement 20, the assembler
    » leaves 4 in the operand2 field
    » prepares a modification record: add LISTB to the operand2 field of statement 20
  – for statement 24, the modification record will be
    » put LISTB-ENDB in the operand2 field of statement 24
in module B
  – the assembler prepares a symbol table containing the name and value of each external symbol
    (LISTB,60) (ENDB,80)

00  PROGA  START   0
           EXTRN   LISTA, ENDA
           EXTREF  LISTB, ENDB
20  REF1   LD      A, LISTB+4
24  REF2   LD      B, LISTB-ENDB
48  LISTA  EQU     *
    ENDA   EQU     *

00  PROGB  START   0
           EXTRN   LISTB, ENDB
60  LISTB  EQU     *
    ENDB   EQU     *

8 Linking Loader
linking
  – combines all the programs assembled independently
  – resolves external symbols
    » the actual value of a symbol is known only at loading time
linking loader: pass 1
  – decides where each module will be located
    » this information may be given by the OS
  – modifies absolute addresses in the code itself, as defined in the modification records
  – prepares an external symbol table
    » (control section, symbol name, value)
linking loader: pass 2
  – for a modification record (say the first one of A)
    » search all the external symbol tables for LISTB
    » add its value (60 in the example, offset by module B's load address) to the operand2 field of statement 20
  – transfers control to the loaded program
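
The two passes can be sketched in Python using the PROGA/PROGB example. This is a simplified model, not a real loader: the module lengths (100), the base address (1000), the decimal reading of the statement numbers, and the dictionary layout are all assumptions, and relocation of addresses inside a single module is left out. The record for statement 24 is written as two entries, +LISTB and -ENDB.

```python
def pass1(modules, base):
    """Assign a load address to each module and build the external symbol table."""
    load_addr, estab = {}, {}
    for m in modules:
        load_addr[m["name"]] = base
        for symbol, offset in m["defs"].items():
            if symbol in estab:
                raise ValueError(f"duplicate external symbol: {symbol}")
            estab[symbol] = base + offset
        base += m["length"]
    return load_addr, estab


def pass2(modules, load_addr, estab):
    """Place operand fields in memory and apply each modification record."""
    memory = {}
    for m in modules:
        start = load_addr[m["name"]]
        for addr, value in m["operands"].items():
            memory[start + addr] = value
        for addr, sign, symbol in m["mods"]:
            if symbol not in estab:
                raise KeyError(f"undefined external symbol: {symbol}")
            memory[start + addr] += sign * estab[symbol]
    return memory


# Lengths are hypothetical; symbol values and operand contents follow the example.
prog_a = {"name": "PROGA", "length": 100, "defs": {"LISTA": 48},
          "operands": {20: 4, 24: 0},
          "mods": [(20, +1, "LISTB"), (24, +1, "LISTB"), (24, -1, "ENDB")]}
prog_b = {"name": "PROGB", "length": 100, "defs": {"LISTB": 60, "ENDB": 80},
          "operands": {}, "mods": []}

modules = [prog_a, prog_b]
load_addr, estab = pass1(modules, base=1000)
memory = pass2(modules, load_addr, estab)
```

Under these assumptions PROGB lands at 1100, so LISTB resolves to 1160 and the operand of statement 20 becomes 1164. The operand of statement 24 is the difference LISTB-ENDB, which does not depend on where PROGB is loaded.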

9 Library
a collection of popular routines
  – provided as object modules
  – linked together with the user program
  – programmers use a library function as if it were a language feature
  – contains a table of exported symbols
when a user program is compiled (assembled)
  – all the undefined symbols are kept in the external symbol table (EST)
linking loader
  – executes modification records
  – resolves symbols in the EST with standard libraries and user-specified libraries
  – if unresolved symbols remain in the EST, reports an ERROR

10 Library (contd)
there may be thousands of routines (functions) in a library
  – a directory is needed
  – (routine name, address)
  – hashing locates a routine name quickly
searching all the symbol definitions of every library takes a long time
  – each program declares which libraries it uses
    » or this is given as an option at compile (assemble) time or link time
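
A small sketch of directory-based library search, assuming each library directory is a hash table from routine name to address (a Python dict here) and that only the libraries named by the program are searched, in order. The library names, routine addresses, and the function resolve_from_libraries are hypothetical.

```python
def resolve_from_libraries(unresolved, libraries):
    """Look up each undefined symbol in the listed library directories, in order."""
    resolved, missing = {}, []
    for symbol in unresolved:
        for lib_name, directory in libraries:
            if symbol in directory:          # dict membership is a hash lookup
                resolved[symbol] = (lib_name, directory[symbol])
                break
        else:
            missing.append(symbol)           # no listed library defines it
    if missing:
        raise LookupError("unresolved external symbols: " + ", ".join(missing))
    return resolved


# Hypothetical directories: (routine name -> address within the library).
libraries = [
    ("libio", {"print": 0x40}),
    ("libm", {"sin": 0x200, "exp": 0x260}),
]
resolved = resolve_from_libraries(["sin", "print"], libraries)
```

Each lookup touches only the directories, never the routine bodies, which is why the directory keeps resolution fast even for libraries with thousands of routines.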

11 Linkage Editor
linking loader
  – links object modules into a single module, loads it into memory, and starts the program
linkage editor
  – only links and does some relocation
  – stores the result on disk instead of running it
  – finished programs are stored in this form
  – these programs are run later by a relocating loader

12 Linkage Editor (illustration)
[Figure: (a) object program(s) and library -> linking loader -> memory; (b) object program(s) and library -> linkage editor -> linked program -> relocating loader -> memory]

13 Comparison of Linking Loader and Linkage Editor
what it does
  – linking loader
    » gets a starting address from the OS (zero for VM)
    » processes modification records
    » resolves external symbols
    » produces output in memory (or on disk for VM)
    » transfers control to the program
  – linkage editor
    » resolves external symbols
      absolute addresses are not processed yet
    » produces output on disk

14 Comparison of Linking Loader and Linkage Editor
when to use
  – linking loader
    » when a program is in a development cycle
      modify the program, assemble, and run
  – linkage editor
    » when program development is finished
    » when a library is built
      some linking is done here

15 More about Linking
Static Linking
  – all code modules are copied into a single executable file
    » the same shared module may exist in many files
  – a location in this file implies the location in the memory image
  – the target address of a cross-module reference can be determined before run time
Dynamic Linking
  – needs help from the OS (so the scheme varies depending on the OS)
  – library modules are usually linked dynamically
  – inserts symbolic references in the executable file
  – these symbols are resolved at run time

16 Dynamic Linking in Unix
GOT (Global Offset Table)
  – a linker allocates a GOT for each library module
  – contains the addresses of all dynamically linked external symbols (functions and variables) referenced by the module
Steps
  – a program referencing a library module is loaded for execution
  – the imported modules are loaded (unless already resident)
  – a region in the AS is allocated to map the module
  – the loader initializes the module's GOT (which may require loading other library modules)
lazy loading
  – load a module only when it is accessed
  – the entry in the GOT points to stub code
  – the stub invokes the dynamic loader, which loads the referenced module and replaces the respective entries in the GOT
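
The lazy-loading scheme can be modeled in Python, with callables standing in for machine addresses. This is a toy model under stated assumptions, not how a real Unix dynamic loader is written: the class LazyLinker, the library name libmath, and its routines are hypothetical.

```python
class LazyLinker:
    def __init__(self, libraries):
        self.libraries = libraries   # module name -> {symbol: callable}; stands in for files on disk
        self.loaded = []             # modules mapped so far, in load order
        self.got = {}                # symbol -> callable (stub or real routine)

    def import_symbol(self, symbol, module):
        """Point the GOT entry at a stub instead of loading the module now."""
        def stub(*args):
            self._load(module)                   # the dynamic loader patches the GOT
            if self.got[symbol] is stub:         # module did not define the symbol
                raise LookupError(f"{module} does not define {symbol}")
            return self.got[symbol](*args)       # retry through the patched entry
        self.got[symbol] = stub

    def _load(self, module):
        if module not in self.loaded:
            self.loaded.append(module)
        # Replace every imported GOT entry that this module defines.
        for name, routine in self.libraries[module].items():
            if name in self.got:
                self.got[name] = routine

    def call(self, symbol, *args):
        return self.got[symbol](*args)           # indirect call via the GOT


linker = LazyLinker({"libmath": {"square": lambda x: x * x,
                                 "double": lambda x: 2 * x}})
linker.import_symbol("square", "libmath")
linker.import_symbol("double", "libmath")
before = list(linker.loaded)          # nothing is loaded until the first call
result = linker.call("square", 7)     # first call goes through the stub
```

After the first call, both GOT entries for libmath point at the real routines, so later calls skip the stub entirely.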

17 Dynamic Linking
advantages over static linking
  – saves disk space
    » shared modules are not duplicated in each executable file
  – saves memory space
    » by sharing binary code
  – new versions of library code are immediately usable
  – a module already in memory can be linked immediately
drawbacks
  – extra work needs to be done to set up the GOT (but far fewer modules need to be loaded since some are already in memory)
  – the location of a module in the AS is not determined until run time
    » the code of the module should be position-independent
      relative addresses only
    » indirect calls via the GOT

18 MS DOS Linker
object file (.OBJ)
  – generated by assembler (or compiler)
  – format
    THEADR   name of this object module
    PUBDEF   external symbols defined in this module
    EXTDEF   external symbols used here
    TYPDEF   data types for PUBDEF and EXTDEF
    SEGDEF   describes segments in this module
    GRPDEF   segment grouping
    LNAMES   name list indexed by SEGDEF and GRPDEF
    LEDATA   binary image of code
    LIDATA   repeated data
    FIXUPP   modification record
    MODEND   end

19 MS DOS Linker (contd)
LINK
  – pass 1:
    » allocates segments defined in SEGDEF
    » resolves external symbols
  – pass 2:
    » prepares the memory image
      if needed, disk space is also used
    » expands LIDATA
    » performs relocations within segments
    » writes the .EXE file

20 SunOS Linkers
SunOS is UNIX
two different linkers
link-editor (ld) produces several types of output
  – relocatable object module
    » needs to be link-edited later
  – static executable
  – dynamic executable
  – shared object
    » can be bound at run time
run-time linker
  – binds dynamic executables and shared objects at execution time