======================================= Assignment #06 - Relocation and Linking ======================================= - Ian! D. Allen - idallen@idallen.ca - www.idallen.com On the "LMCLink.htm" page in the course notes you will find an LMC main program and two subroutines "Pause" and "Dble". Questions based on this web page: 1) Provide a list of the contents of each of the (relocated) three external tables that would be associated with the object code for each of the three modules Main, Pause, and Dble. (That means 3 tables for each of the three modules: 9 tables in all; some of the tables may be empty.) The tables should reflect the relocated locations of the modules loaded in the given order: Main, Pause, Dble. Hint: All this information is already worked out in the web page. ANSWER: The whole link process involves three steps: (1) copying module code into memory (2) relocating the code and tables for the new memory locations (3) resolving external references using public symbols Looking at each of the modules, we can build the nine tables: MAIN (loads starting at 00, length: 16): Public Symbol Table: no public symbols given (an omission - MAIN should be defined) (if there were a "MAIN" public symbol, it would be at 00) External Reference Table: PAUSE 02 ARG 04 DBLE 05 Relocatable Instruction Table: 00, 03, 06, 07, 08, 09, 11 PAUSE (loads starting at 16, length: 9): Public Symbol Table: PAUSE (01 + 16 =) 17 External Reference Table: no external references Relocatable Instruction Table: ([01, 02, 03, 05, 06] + 16 =) 17, 18, 19, 21, 22 DBLE (loads starting at 25, length: 8): Public Symbol Table: DBLE (01 + 25 =) 26 ARG (02 + 25 =) 27 External Reference Table: PAUSE (06 + 25 =) 31 Relocatable Instruction Table: ([01, 03, 04, 07] + 25 =) 26, 28, 29, 32 ============================================================================ 2) Assuming the modules were loaded in the new order Dble, Pause, Main (with the Dble module loading starting at location zero, followed by Pause and then by the Main module), generate the nine new relocated tables that would be associated with the object code for each of the three modules Dble, Pause, and Main. (That means 3 tables for each of the three modules: 9 tables in all; some of the tables may be empty.) The tables should reflect the relocated locations of the modules loaded in the given order: Dble, Pause, Main. ANSWER: We load in the order DBLE, PAUSE, MAIN: DBLE starts at 00 and is 8 instructions, which means Pause loads at 08. PAUSE starts at 08 and is 9 instructions, which means Main loads at 17. Having loaded the modules in memory, the linker notes the actual memory address where each module was loaded and compares it with the memory address where the module was originally built to start. Here's a summary of where the modules were originally built to load, and where the linker actually loaded them in memory: MODULE NAME BUILT TO LOAD AT ACTUALLY LOADED STARTING AT ADDRESS ----------- ---------------- ----------------------------------- DBLE 00 00 PAUSE 00 08 MAIN 00 17 The difference is the amount by which the addresses in the module need to be relocated: Relocation Offsets for each module: DBLE: built to start at "00", actually loaded at 00 - relocated by 00 - 00 = +00 PAUSE: built to start at "00", actually loaded at 08 - relocated by 08 - 00 = +08 MAIN: built to start at "00", actually loaded at 17 - relocated by 17 - 00 = +17 Now that we have the relocation offset for each of the three modules, we can relocate all the addresses that need relocation for each module. Looking at each of the modules, we can build the nine tables. Since all the tables for each module contain addresses inside the modules, all the addresses inside all the tables must be relocated when a module loads at a non-zero memory address. Here are the nine tables in relocated form: DBLE (loads starting at 00, length: 8): Public Symbol Table (after relocation): DBLE (01 + 00 =) 01 ARG (02 + 00 =) 02 External Reference Table (after relocation): PAUSE (06 + 00 =) 06 Relocatable Instruction Table (after relocation): ([01, 03, 04, 07] + 00 =) 01, 03, 04, 07 PAUSE (loads starting at 08, length: 9): Public Symbol Table (after relocation): PAUSE (01 + 08 =) 09 External Reference Table (after relocation): no external references Relocatable Instruction Table (after relocation): ([01, 02, 03, 05, 06] + 08 =) 09, 10, 11, 13, 14 MAIN (loads starting at 17, length: 16): Public Symbol Table (after relocation): no public symbols given (an omission - MAIN should be defined) (if there were a "MAIN" public symbol, it would be at 00+17=17) External Reference Table (after relocation): PAUSE (02 + 17 =) 19 ARG (04 + 17 =) 21 DBLE (05 + 17 =) 22 Relocatable Instruction Table (after relocation): ([00, 03, 06, 07, 08, 09, 11] + 17 =) 17, 20, 23, 24, 25, 26, 28 The above nine tables are now completely relocated and correctly refer to the three modules where the linker loaded them into memory. ============================================================================ 3) Using the tables from the previous question, generate the relocated and linked object code for the Main module loaded at its new address. Show where the Main module actually loads into memory, and show the original object code and the modified (relocated and linked) code for the Main module in its new location. ANSWER: We'll do the relocation and linking for all three modules, since it isn't much more work. From the previous questions, we already have the nine tables we need to do the actual relocation and linking of all three modules. We have to imitate the operation of the Linker, which means we have three basic steps: 1) Copy the modules into memory and observe where each gets loaded. 2) Relocate the tables and the instruction codes in memory. 3) Resolve external addresses for the code in memory. ----------------------------------- STEP ONE - COPY MODULES INTO MEMORY ----------------------------------- First the linker copies all the object code into memory in the given order - DBLE, PAUSE, MAIN - without any modifications: LOCATION Contents --------- -------- ------------------------ start of DBLE module (00) 00: 000 01: 903 <-- this is where the DBLE public label is 02: 000 03: 102 04: 302 05: 600 06: 000 <-- this is "CALL PAUSE" 07: 900 ------------------------ end of DBLE, start of PAUSE module (08) 08: 000 09: 107 <-- this is where the PAUSE public label is 10: 408 11: 207 12: 801 13: 901 14: 900 15: 000 16: 001 ------------------------ end of PAUSE, start of MAIN module (17) 17: 113 <-- if MAIN had a label "MAIN", it would be here 18: 600 19: 000 <-- this is "CALL PAUSE" 20: 113 21: 200 <-- this is "STO ARG" 22: 000 <-- this is "CALL DBLE" 23: 113 24: 314 25: 213 26: 415 27: 802 28: 900 29: 700 30: 000 31: 001 32: 010 The above list shows the machine instructions loaded sequentially in memory, one after the other, before any linking or relocation is done. I've added some comments to show where some key instructions and locations are. MAIN didn't have a public label "MAIN"; if it had, the label would be pointing to the first executable instruction of MAIN. Be careful to distinguish the different uses of the names DBLE and PAUSE: The *module* DBLE is 8 instructions long and is loaded starting at address 00. The *label* DBLE is a symbolic public name for a single address inside the DBLE module. They are not the same. The *module* PAUSE is 9 instructions long and is loaded starting at address 08. The *label* PAUSE is a symbolic public name for a single address inside the PAUSE module. They are not the same. Next: Relocating the code we copied into memory and fixing the tables. --------------------------------------- STEP TWO - RELOCATE THE CODE AND TABLES --------------------------------------- We have two places where this relocation work must be done: (1) the three tables associated with each module and (2) the code for the module that we just loaded into memory. (Recall: The tables contain relocatable addresses because the addresses in the tables refer to addresses inside the modules. When the modules move in memory, the values in the tables must be relocated to refer to the new locations.) We've already relocated the addresses in the tables in the previous question. We'll need the tables to help us find the relocatable addresses in the code in memory. Again, be sure about the difference between the name of a module and the name of a label inside that module: The Public Symbol "DBLE" is now located at mailbox 01; the whole *routine* that makes up DBLE was loaded starting at mailbox 00. Module DBLE is loaded at mailbox 00, with public label "DBLE" at mailbox 01. The Public Symbol "PAUSE" is now located at mailbox 09, the whole *routine* that makes up PAUSE was loaded starting at mailbox 08. Module PAUSE is loaded at mailbox 08, with public label "PAUSE" at mailbox 09. Module MAIN loaded at mailbox 17, with no public label. (If there were such a label, "MAIN", it would be at mailbox 17, the start of the code.) Having relocated all the tables, we must now relocate the addresses for the actual code for each module that is in memory, using the Relocatable Instruction Table for each module to tell us which instructions in memory to relocate. Using those three Relocatable Instruction tables, we generate the following 16 changes to the object code in memory: Using the addresses in three Relocatable Instruction Tables: LOC BEFORE AFTER REASON FOR CHANGE --- ------ ----- ------------------------------------------- 01: 903 903 ; added 00 for relocation of DBLE from 00 to 00 03: 102 102 ; same 04: 302 302 ; same 07: 900 900 ; same -------------------------- 09: 107 115 ; added 08 for relocation of PAUSE from 00 to 08 10: 408 416 ; same 11: 207 215 ; same 13: 901 909 ; same 14: 900 908 ; same -------------------------- 17: 113 130 ; added 17 for relocation MAIN from 00 to 17 20: 113 130 ; same 23: 113 130 ; same 24: 314 331 ; same 25: 213 230 ; same 26: 415 432 ; same 28: 900 917 ; same Those are the 16 instructions that needed relocating, as given by the three Relocatable Instruction tables. Next: Resolving external references. ------------------------------------------ STEP THREE - RESOLVING EXTERNAL REFERENCES ------------------------------------------ Only MAIN and DBLE have an External References table - module PAUSE does not refer to any labels outside the module. Let's resolve the external references for DBLE first: The (relocated) table of External References for DBLE shows "PAUSE 06". The linker must fix location 06 in memory with the real address of the PAUSE Public Symbol at link time. The linker looks in all the Public Symbol tables for some symbol named "PAUSE". It finds a Public Symbol named "PAUSE" in the Public Symbol Table for the module PAUSE; in that table the label "PAUSE" is shown at location 09. When the linker runs, it will use the relocated External Reference information "PAUSE 06" to go to memory location 06 (inside DBLE) and it will change the "000" at location 06 to "009" ("CALL PAUSE"), because it now knows that the Public Symbol "PAUSE" is located at memory address 09. (Note that the linker is resolving a reference to the *label* PAUSE at location 09, not the location of the whole PAUSE module at location 08.) This is the change the linker makes to the machine code for DBLE loaded in memory: LOC BEFORE AFTER REASON FOR CHANGE --- ------ ----- ---------------------------------- 06: 000 009 ; address of Public Symbol "PAUSE" linked in (added) The Pause module has no external references to resolve. Let's resolve the external references for MAIN next: The (relocated) table of External References for MAIN has three entries: PAUSE 19 ARG 21 DBLE 22 The linker must fix location 19 in memory with the real address of the "PAUSE" Public Symbol at link time. It must fix location 21 by adding in the address of the "ARG" Public Symbol. It must fix location 22 by adding in the address of the "DBLE" Public Symbol. It goes looking for these symbols in all the Public Symbol tables: The linker looks in all the Public Symbol tables for some symbol named "PAUSE". It finds a Public Symbol named "PAUSE" in the Public Symbol Table for the module PAUSE; in that table the label "PAUSE" is shown at location 09. The linker looks in all the Public Symbol tables for some symbol named "ARG". It finds a Public Symbol named "ARG" in the Public Symbol Table for module DBLE; in that table the label "ARG" is shown at location 02. The linker looks in all the Public Symbol tables for some symbol named "DBLE". It finds a Public Symbol named "DBLE" in the Public Symbol Table for module DBLE; in that table the label "DBLE" is shown at location 01. When the linker runs, it will use the (relocated) External Reference information "PAUSE 19" to go to memory location 19 (inside MAIN) and it will change the "000" at that location to "009" ("CALL PAUSE"), because it now knows that the label "PAUSE" is located at memory address 09. The linker will also go to memory location 21 and add the address where it found the Public Symbol "ARG" (location 02). The linker will also go to memory location 22 and add the address where it found the Public Symbol "DBLE" (location 01). Using that new, (relocated) External References table for MAIN, we make these next three changes to the machine codes loaded in memory: LOC BEFORE AFTER REASON FOR CHANGE --- ------ ----- ---------------------------------- 19: 000 009 ; address of "PAUSE" (09) linked in (added) 21: 200 202 ; address of "ARG" (02) linked in (added) 22: 000 001 ; address of "DBLE" (01) linked in (added) Note how the addresses of PAUSE and DBLE are simply *added* to the address fields of the instructions. In most cases, the existing address field is zero (a direct reference to the external symbol). We have made four more changes to the in-memory codes to resolve the four external references. That's it for resolving External References. We're done. We have relocated nine tables and 16 instructions in memory, and we have made 4 more changes to instructions in memory to resolve external symbols. The final result in memory, all three modules fully Linked (4 changes) and Relocated (16 changes) looks like this (20 changed instructions total): R = Relocated Instruction E = External Symbol resolved (linked) Original New Reason for change ------- ------- ------------------------------------------- ------- ------- ; start of DBLE 00: 000 00: 000 01: 903 01: 903 ; R added 00 for relocation from 00 to 00 02: 000 02: 000 03: 102 03: 102 ; R added 00 for relocation from 00 to 00 04: 302 04: 302 ; R added 00 for relocation from 00 to 00 05: 600 05: 600 06: 000 06: 009 ; E address of "PAUSE" (09) linked in (added) 07: 900 07: 900 ; R added 00 for relocation from 00 to 00 ------- ------- ; start of PAUSE 00: 000 08: 000 01: 107 09: 115 ; R added 08 for relocation from 00 to 08 02: 408 10: 416 ; R added 08 for relocation from 00 to 08 03: 207 11: 215 ; R added 08 for relocation from 00 to 08 04: 801 12: 801 05: 901 13: 909 ; R added 08 for relocation from 00 to 08 06: 900 14: 908 ; R added 08 for relocation from 00 to 08 07: 000 15: 000 08: 001 16: 001 ------- ------- ; start of MAIN 00: 113 17: 130 ; R added 17 for relocation from 00 to 17 01: 600 18: 600 02: 000 19: 009 ; E address of "PAUSE" (09) linked in (added) 03: 113 20: 130 ; R added 17 for relocation from 00 to 17 04: 200 21: 202 ; E address of "ARG" (02) linked in (added) 05: 000 22: 001 ; E address of "DBLE" (01) linked in (added) 06: 113 23: 130 ; R added 17 for relocation from 00 to 17 07: 314 24: 331 ; R added 17 for relocation from 00 to 17 08: 213 25: 230 ; R added 17 for relocation from 00 to 17 09: 415 26: 432 ; R added 17 for relocation from 00 to 17 10: 802 27: 802 11: 900 28: 917 ; R added 17 for relocation from 00 to 17 12: 700 29: 700 13: 000 30: 000 14: 001 31: 001 15: 010 32: 010 That's all. Assuming we start running this code at the beginning of MAIN (loction 17), the above code will now run as a coherhent unit, with all the address references correct.