mmo.texi revision 1.1.1.1
@section mmo backend
The mmo object format is used exclusively together with Professor
Donald E.@: Knuth's educational 64-bit processor MMIX.  The simulator
@command{mmix}, which is available at
@url{http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz},
understands this format.  That package also includes a combined
assembler and linker called @command{mmixal}.  The mmo format has
no advantages feature-wise compared to e.g.@: ELF.  It is a simple
non-relocatable object format with no support for archives or
debugging information, except for symbol value information and
line numbers (which are not yet implemented in BFD).  See
@url{http://www-cs-faculty.stanford.edu/~knuth/mmix.html} for more
information about MMIX.  The ELF format is used for intermediate
object files in the BFD implementation.

@c We want to xref the symbol table node.  A feature in "chew"
@c requires that "commands" do not contain spaces in the
@c arguments.  Hence the hyphen in "Symbol-table".
@menu
* File layout::
* Symbol-table::
* mmo section mapping::
@end menu

@node File layout, Symbol-table, mmo, mmo
@subsection File layout
The contents of an mmo file are not partitioned into named sections
as with e.g.@: ELF.  Memory areas are formed by specifying the
location of the data that follows.  Only the memory area
@samp{0x0000@dots{}00} to @samp{0x01ff@dots{}ff} is executable, so
it is used for code (and constants) and the area
@samp{0x2000@dots{}00} to @samp{0x20ff@dots{}ff} is used for
writable data.  @xref{mmo section mapping}.

There is provision for specifying ``special data'' of 65536
different types.  We use type 80 (decimal), arbitrarily chosen to be
the same as the ELF @code{e_machine} number for MMIX, filling it with
section information normally found in ELF objects.
@xref{mmo section mapping}.

Contents are entered as 32-bit words, xor:ed over the previous
contents, which are always zero-initialized.  A word that starts with
the byte @samp{0x98} forms a command called a @samp{lopcode}, where
the next byte distinguishes between the thirteen lopcodes.  The
two remaining bytes, called the @samp{Y} and @samp{Z} fields, or
the @samp{YZ} field (a 16-bit big-endian number), are used for
various purposes different for each lopcode.  As documented in
@url{http://www-cs-faculty.stanford.edu/~knuth/mmixal-intro.ps.gz},
the lopcodes are:

@table @code
@item lop_quote
0x98000001.  The next word is contents, regardless of whether it
starts with 0x98 or not.

@item lop_loc
0x9801YYZZ, where @samp{Z} is 1 or 2.  This is a location
directive, setting the location for the next data to the next
32-bit word (for @math{Z = 1}) or 64-bit word (for @math{Z = 2}),
plus @math{Y * 2^56}.  Normally @samp{Y} is 0 for the text segment
and 0x20 for the data segment.

@item lop_skip
0x9802YYZZ.  Increase the current location by @samp{YZ} bytes.

@item lop_fixo
0x9803YYZZ, where @samp{Z} is 1 or 2.  Store the current location
as 64 bits into the location pointed to by the next 32-bit
(@math{Z = 1}) or 64-bit (@math{Z = 2}) word, plus @math{Y *
2^56}.

@item lop_fixr
0x9804YYZZ.  @samp{YZ} is stored into the current location plus
@math{2 - 4 * YZ}.

@item lop_fixrx
0x980500ZZ.  @samp{Z} is 16 or 24.  A value @samp{L} derived from
the following 32-bit word is used in a manner similar to
@samp{YZ} in lop_fixr: it is xor:ed into the current location
minus @math{4 * L}.  The first byte of the word is 0 or 1.  If it
is 1, then @math{L = (@var{lowest 24 bits of word}) - 2^Z}, if 0,
then @math{L = (@var{lowest 24 bits of word})}.

@item lop_file
0x9806YYZZ.  @samp{Y} is the file number, @samp{Z} is the count of
32-bit words.  Set the file number to @samp{Y} and the line
counter to 0.  The next @math{Z * 4} bytes contain the file name,
padded with zeros if its length is not a multiple of four.  The
same @samp{Y} may occur multiple times, but @samp{Z} must be 0 for
all but the first occurrence.

@item lop_line
0x9807YYZZ.  @samp{YZ} is the line number.  Together with
lop_file, it forms the source location for the next 32-bit word.
Note that for each non-lopcode 32-bit word, the line number is
assumed to be incremented by one.

@item lop_spec
0x9808YYZZ.  @samp{YZ} is the type number.  Data until the next
lopcode other than lop_quote forms special data of type @samp{YZ}.
@xref{mmo section mapping}.

Special data of types other than 80 (or of type 80 with contents
that do not parse) are stored in sections named
@code{.MMIX.spec_data.@var{n}}, where @var{n} is the @samp{YZ}-type.
The flags for such sections say not to allocate or load the data.
The vma is 0.  The contents of multiple occurrences of special data
@var{n} are concatenated to the data of the previous lop_spec
@var{n}s.  The location in data or code at which the lop_spec
occurred is lost.

@item lop_pre
0x980901ZZ.  The first lopcode in a file.  The @samp{Z} field forms the
length of header information in 32-bit words, where the first word
tells the time in seconds since @samp{00:00:00 GMT Jan 1 1970}.

@item lop_post
0x980a00ZZ.  @math{Z >= 32}.  This lopcode follows after all
content-generating lopcodes in a program.  The @samp{Z} field
denotes the value of @samp{rG} at the beginning of the program.
The following @math{256 - Z} big-endian 64-bit words are loaded
into global registers @samp{$G} @dots{} @samp{$255}.

@item lop_stab
0x980b0000.  The next-to-last lopcode in a program.  Must follow
immediately after the lop_post lopcode and its data.  After this
lopcode follows all symbols in a compressed format
(@pxref{Symbol-table}).

@item lop_end
0x980cYYZZ.  The last lopcode in a program.  It must follow the
lop_stab lopcode and its data.  The @samp{YZ} field contains the
number of 32-bit words of symbol table information after the
preceding lop_stab lopcode.
@end table
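The field layout in the table above can be illustrated with a small Python sketch.  It is not taken from BFD or mmixal; the function names are hypothetical and the lopcode numbering is simply the order of the table.  It splits a 32-bit word into its lopcode name and Y and Z fields, and computes the lop_loc address and the lop_fixrx value L as described.  The caller is assumed to handle lop_quote itself, since a quoted contents word may also start with 0x98.

```python
# Lopcode names in numeric order (0x00 to 0x0c), following the table.
LOPCODE_NAMES = (
    "lop_quote", "lop_loc", "lop_skip", "lop_fixo", "lop_fixr", "lop_fixrx",
    "lop_file", "lop_line", "lop_spec", "lop_pre", "lop_post", "lop_stab",
    "lop_end",
)


def split_lopcode(word):
    """Return (name, Y, Z) for a lopcode word, or None for a contents word."""
    if (word >> 24) & 0xFF != 0x98:
        return None
    number = (word >> 16) & 0xFF
    if number >= len(LOPCODE_NAMES):
        raise ValueError("unknown lopcode number %d" % number)
    return LOPCODE_NAMES[number], (word >> 8) & 0xFF, word & 0xFF


def loc_address(y, operand_words):
    """Address selected by lop_loc: the 32- or 64-bit operand plus Y * 2^56."""
    value = 0
    for tetra in operand_words:  # one word for Z = 1, two words for Z = 2
        value = (value << 32) | tetra
    return value + (y << 56)


def fixrx_offset(z, word):
    """The value L used by lop_fixrx, derived from the word that follows it."""
    low = word & 0xFFFFFF  # lowest 24 bits of the word
    if word >> 24 == 1:    # first byte is 1: subtract 2^Z
        return low - (1 << z)
    return low
```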

Note that the lopcode ``fixups'', @code{lop_fixr}, @code{lop_fixrx} and
@code{lop_fixo}, are not generated by BFD, but are handled.  They are
generated by @command{mmixal}.

This trivial one-label, one-instruction file:

@example
 :Main TRAP 1,2,3
@end example

can be represented this way in mmo:

@example
 0x98090101 - lop_pre, one 32-bit word with timestamp.
 <timestamp>
 0x98010002 - lop_loc, text segment, using a 64-bit address.
              Note that mmixal does not emit this for the file above.
 0x00000000 - Address, high 32 bits.
 0x00000000 - Address, low 32 bits.
 0x98060002 - lop_file, 2 32-bit words for file-name.
 0x74657374 - "test"
 0x2e730000 - ".s\0\0"
 0x98070001 - lop_line, line 1.
 0x00010203 - TRAP 1,2,3
 0x980a00ff - lop_post, setting $255 to 0.
 0x00000000
 0x00000000
 0x980b0000 - lop_stab for ":Main" = 0, serial 1.
 0x203a4040   @xref{Symbol-table}.
 0x10404020
 0x4d206120
 0x69016e00
 0x81000000
 0x980c0005 - lop_end; symbol table contained five 32-bit words.
@end example
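As a rough illustration of how such a word stream is structured, the following Python sketch groups the words of an mmo file into lopcodes with their operand words.  It is a reading of the lopcode descriptions above, not code from BFD; the names are hypothetical, no contents are placed in memory, and the symbol table is taken to be everything between lop_stab and the final lop_end.  Run on the example above (with any value for the timestamp), it yields lop_pre, lop_loc, lop_file, lop_line, one contents word, lop_post, lop_stab and lop_end.

```python
LOPCODES = (
    "lop_quote", "lop_loc", "lop_skip", "lop_fixo", "lop_fixr", "lop_fixrx",
    "lop_file", "lop_line", "lop_spec", "lop_pre", "lop_post", "lop_stab",
    "lop_end",
)


def operand_count(number, z):
    """Number of 32-bit words that belong to a lopcode, per the table."""
    if number in (0x00, 0x05):              # lop_quote, lop_fixrx: one word
        return 1
    if number in (0x01, 0x03, 0x06, 0x09):  # lop_loc, lop_fixo, lop_file, lop_pre
        return z
    if number == 0x0A:                      # lop_post: 256 - Z 64-bit words
        return 2 * (256 - z)
    return 0                                # lop_skip, lop_fixr, lop_line, lop_spec


def walk_mmo(words):
    """Return a list of (kind, operand_words) pairs for a list of 32-bit words."""
    events = []
    i = 0
    while i < len(words):
        word = words[i]
        i += 1
        if word >> 24 != 0x98:
            events.append(("contents", [word]))
            continue
        number, z = (word >> 16) & 0xFF, word & 0xFF
        if number >= len(LOPCODES):
            raise ValueError("unknown lopcode number %d" % number)
        if number == 0x0B:
            # lop_stab: everything up to the closing lop_end is symbol data.
            table, end = words[i:-1], words[-1]
            if end >> 16 != 0x980C or (end & 0xFFFF) != len(table):
                raise ValueError("lop_end does not match the symbol table")
            events.append(("lop_stab", table))
            events.append(("lop_end", []))
            break
        count = operand_count(number, z)
        events.append((LOPCODES[number], words[i:i + count]))
        i += count
    return events
```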
@node Symbol-table, mmo section mapping, File layout, mmo
@subsection Symbol table format
From mmixal.w (or really, the generated mmixal.tex) in
@url{http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz}:
``Symbols are stored and retrieved by means of a @samp{ternary
search trie}, following ideas of Bentley and Sedgewick. (See
ACM--SIAM Symp.@: on Discrete Algorithms @samp{8} (1997), 360--369;
R.@: Sedgewick, @samp{Algorithms in C} (Reading, Mass.@:
Addison--Wesley, 1998), @samp{15.4}.)  Each trie node stores a
character, and there are branches to subtries for the cases where
a given character is less than, equal to, or greater than the
character in the trie.  There also is a pointer to a symbol table
entry if a symbol ends at the current node.''

So it's a tree encoded as a stream of bytes.  The stream of bytes
acts on a single virtual global symbol, adding and removing
characters and signalling complete symbol points.  Here, we read
the stream and create symbols at the completion points.

First, there's a control byte @code{m}.  If any of the listed bits
in @code{m} is nonzero, we execute what stands at the right, in
the listed order:

@example
 (MMO3_LEFT)
 0x40 - Traverse left trie.
        (Read a new command byte and recurse.)

 (MMO3_SYMBITS)
 0x2f - Read the next byte as a character and store it in the
        current character position; increment character position.
        Test the bits of @code{m}:

        (MMO3_WCHAR)
        0x80 - The character is 16-bit (so read another byte,
               merge into current character).

        (MMO3_TYPEBITS)
        0xf  - We have a complete symbol; parse the type, value
               and serial number and do what should be done
               with a symbol.  The type and length information
               is in j = (m & 0xf).

               (MMO3_REGQUAL_BITS)
               j == 0xf: A register variable.  The following
                         byte tells which register.
               j <= 8:   An absolute symbol.  Read j bytes as the
                         big-endian number the symbol equals.
                         A j = 2 with two zero bytes denotes an
                         unknown symbol.
               j > 8:    As with j <= 8, but add (0x20 << 56)
                         to the value in the following j - 8
                         bytes.

               Then comes the serial number, as a variant of
               uleb128, but better named ubeb128:
               Read bytes and shift the previous value left 7
               (multiply by 128).  Add in the new byte, repeat
               until a byte has bit 7 set.  The serial number
               is the computed value minus 128.

        (MMO3_MIDDLE)
        0x20 - Traverse middle trie.  (Read a new command byte
               and recurse.)  Decrement character position.

 (MMO3_RIGHT)
 0x10 - Traverse right trie.  (Read a new command byte and
        recurse.)
@end example
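The serial number encoding described in the listing above can be sketched in Python as follows.  The function names are hypothetical, and the encoder is not described in the text; it is included only as the assumed inverse, so that the decoder can be checked by round-tripping.

```python
def read_serial(data, pos=0):
    """Decode one serial number starting at data[pos].

    Returns (serial, next_pos).
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) + byte  # shift left 7, add in the new byte
        if byte & 0x80:              # bit 7 set marks the final byte
            return value - 128, pos


def write_serial(serial):
    """Assumed inverse of read_serial, used here only for round-tripping."""
    out = [0x80 | (serial & 0x7F)]   # last byte carries bit 7
    serial >>= 7
    while serial:
        out.append(serial & 0x7F)
        serial >>= 7
    return bytes(reversed(out))      # most significant group first
```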

Let's look again at the @code{lop_stab} for the trivial file
(@pxref{File layout}).

@example
 0x980b0000 - lop_stab for ":Main" = 0, serial 1.
 0x203a4040
 0x10404020
 0x4d206120
 0x69016e00
 0x81000000
@end example

This forms the trivial trie (note that the path between ``:'' and
``M'' is redundant):

@example
 203a     ":"
 40       /
 40      /
 10      \
 40      /
 40     /
 204d  "M"
 2061  "a"
 2069  "i"
 016e  "n" is the last character in a full symbol, and
       with a value represented in one byte.
 00    The value is 0.
 81    The serial number is 1.
@end example
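The walk through the trie can be sketched as a short recursive decoder in Python.  This follows the control byte description earlier in this subsection and is not the BFD code; the function name and the kind labels are hypothetical, and merging a 16-bit character as a big-endian pair is an assumption.  Applied to the twenty symbol table bytes above, it produces the single symbol ":Main" with value 0 and serial number 1.

```python
def decode_symbol_table(data):
    """Return a list of (name, kind, value, serial) tuples."""
    symbols = []
    name = []  # characters of the single virtual global symbol
    pos = 0

    def next_byte():
        nonlocal pos
        byte = data[pos]
        pos += 1
        return byte

    def walk():
        m = next_byte()
        if m & 0x40:                      # MMO3_LEFT
            walk()
        if m & 0x2F:                      # MMO3_SYMBITS
            char = next_byte()
            if m & 0x80:                  # MMO3_WCHAR (big-endian merge assumed)
                char = (char << 8) | next_byte()
            name.append(chr(char))
            j = m & 0x0F                  # MMO3_TYPEBITS
            if j:
                if j == 0x0F:
                    kind, value = "register", next_byte()
                else:
                    kind = "absolute" if j <= 8 else "data"
                    count = j if j <= 8 else j - 8
                    value = 0
                    for _ in range(count):
                        value = (value << 8) | next_byte()
                    if j == 2 and value == 0:
                        kind = "unknown"
                    if j > 8:
                        value += 0x20 << 56
                serial = 0
                while True:               # serial number, see the listing
                    byte = next_byte()
                    serial = (serial << 7) + byte
                    if byte & 0x80:
                        break
                symbols.append(("".join(name), kind, value, serial - 128))
            if m & 0x20:                  # MMO3_MIDDLE
                walk()
            name.pop()                    # decrement character position
        if m & 0x10:                      # MMO3_RIGHT
            walk()

    walk()
    return symbols
```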

@node mmo section mapping, , Symbol-table, mmo
@subsection mmo section mapping
The implementation in BFD uses special data type 80 (decimal) to
encapsulate and describe named sections, containing e.g.@: debug
information.  If needed, any datum in the encapsulation will be
quoted using lop_quote.  First comes a 32-bit word holding the
number of 32-bit words containing the zero-terminated zero-padded
section name.  After the name there's a 32-bit word holding flags
describing the section type.  Then comes a 64-bit big-endian word
with the section length (in bytes), then another with the section
start address.  Depending on the type of section, the contents
might follow, zero-padded to a 32-bit boundary.  For a loadable
section (such as data or code), the contents might follow at some
later point, not necessarily immediately, as a lop_loc with the
same start address as in the section description, followed by the
contents.  This in effect forms a descriptor that must be emitted
before the actual contents.  Sections described this way must not
overlap.

For areas that don't have such descriptors, synthetic sections are
formed by BFD.  Consecutive contents in the two memory areas
@samp{0x0000@dots{}00} to @samp{0x01ff@dots{}ff} and
@samp{0x2000@dots{}00} to @samp{0x20ff@dots{}ff} are entered in
sections named @code{.text} and @code{.data} respectively.  If an area
is not otherwise described, but would together with a neighboring
lower area be less than @samp{0x40000000} bytes long, it is joined
with the lower area and the gap is zero-filled.  For other cases,
a new section is formed, named @code{.MMIX.sec.@var{n}}.  Here,
@var{n} is a number, a running count through the mmo file,
starting at 0.

A loadable section specified as:

@example
 .section secname,"ax"
 TETRA 1,2,3,4,-1,-2009
 BYTE 80
@end example

and linked to address @samp{0x4}, is represented by the sequence:

@example
 0x98080050 - lop_spec 80
 0x00000002 - two 32-bit words for the section name
 0x7365636e - "secn"
 0x616d6500 - "ame\0"
 0x00000033 - flags CODE, READONLY, LOAD, ALLOC
 0x00000000 - high 32 bits of section length
 0x0000001c - section length is 28 bytes; 6 * 4 + 1 + alignment to 32 bits
 0x00000000 - high 32 bits of section address
 0x00000004 - section address is 4
 0x98010002 - 64 bits with address of following data
 0x00000000 - high 32 bits of address
 0x00000004 - low 32 bits: data starts at address 4
 0x00000001 - 1
 0x00000002 - 2
 0x00000003 - 3
 0x00000004 - 4
 0xffffffff - -1
 0xfffff827 - -2009
 0x50000000 - 80 as a byte, padded with zeros.
@end example

Note that the lop_spec wrapping does not include the section
contents.  Compare this to a non-loaded section specified as:

@example
 .section thirdsec
 TETRA 200001,100002
 BYTE 38,40
@end example

This, when linked to address @samp{0x200000000000001c}, is
represented by:

@example
 0x98080050 - lop_spec 80
 0x00000002 - two 32-bit words for the section name
 0x74686972 - "thir"
 0x64736563 - "dsec"
 0x00000010 - flag READONLY
 0x00000000 - high 32 bits of section length
 0x0000000c - section length is 12 bytes; 2 * 4 + 2 + alignment to 32 bits
 0x20000000 - high 32 bits of address
 0x0000001c - low 32 bits of address 0x200000000000001c
 0x00030d41 - 200001
 0x000186a2 - 100002
 0x26280000 - 38, 40 as bytes, padded with zeros
@end example

For the latter example, the section contents must not be
loaded in memory, and are therefore specified as part of the
special data.  The address is usually unimportant but might
provide information for e.g.@: the DWARF 2 debugging format.
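To make the descriptor layout concrete, here is a minimal Python sketch that parses the words following a lop_spec 80 lopcode.  It is based only on the description in this subsection and is not the BFD implementation; the function name is hypothetical, the flags are returned as a plain number, and any lop_quote wrapping is assumed to have been removed already.  For the words of the secname example, it returns the name "secname", flags 0x33, length 28 and address 4, with no trailing contents.

```python
def parse_section_descriptor(words):
    """Parse the 32-bit words that follow lop_spec 80.

    Returns (name, flags, length, address, rest), where rest holds any
    words after the descriptor, such as the contents of a non-loaded
    section.
    """
    name_words = words[0]
    raw = b"".join(w.to_bytes(4, "big") for w in words[1:1 + name_words])
    name = raw.rstrip(b"\0").decode("ascii")  # drop the zero padding
    i = 1 + name_words
    flags = words[i]
    length = (words[i + 1] << 32) | words[i + 2]   # 64-bit big-endian
    address = (words[i + 3] << 32) | words[i + 4]  # 64-bit big-endian
    return name, flags, length, address, words[i + 5:]
```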