ISASPEC - XML Based ISA Specification
=====================================

isaspec provides a mechanism to describe an instruction set in xml, and
generate a disassembler and assembler (eventually). The intention is
to describe the instruction set more formally than hand-coded assembler
and disassembler, and better decouple the shader compiler from the
underlying instruction encoding to simplify dealing with instruction
encoding differences between generations of GPU.

Benefits of a formal ISA description, compared to hand-coded assemblers
and disassemblers, include easier detection of new bit combinations that
were not seen in previous generations, thanks to a more rigorous
description of bits that are expected to be '0' or '1' or 'x' (dontcare),
and verification that different encodings don't have conflicting bits
(ie. that the specification cannot result in more than one valid
interpretation of any bit pattern).

The isaspec tool and xml schema are intended to be generic (not specific
to ir3), although there are currently a couple of limitations due to
shortcuts taken to get things up and running (which are mostly not
inherent to the xml schema, and should not be too difficult to remove
from the py and decode/disasm utility):

* Maximum "field" size is 64b
* Fixed instruction size

Often, especially when new functionality is added in later gens
while retaining (or at least mostly retaining) backwards compatibility
with encodings used in earlier generations, the actual encoding can be
rather messy to describe.
To support this, isaspec provides many flexible
mechanisms, such as conditional overrides and derived fields. This not
only allows for describing an irregular instruction encoding, but also
allows matching an existing disasm syntax (which might not have been
designed around the idea of disassembly based on a formal ISA description).

Bitsets
-------

The fundamental concept behind matching a bit pattern to an instruction
decoding/encoding is a hierarchical tree of bitsets.
This is intended to match how the hw decodes instructions, where certain
bits describe the instruction (and sub-encoding, and so on), and other
bits describe various operands to the instruction.

Bitsets can also be used recursively as the type of a field described
in another bitset.

The leaves of the tree of instruction bitsets represent every possible
instruction. Deciding which instruction a bit pattern encodes amounts to:

.. code-block:: c

   m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare;

   if (m == bitsets[n]->match) {
      /* we've found the instruction description */
   }

For example, the starting point to decode an ir3 instruction is a 64b
bitset:

.. code-block:: xml

   <bitset name="#instruction" size="64">
      <doc>
         Encoding of an ir3 instruction.  All instructions are 64b.
      </doc>
   </bitset>

In the first level of instruction encoding hierarchy, the high three bits
group things into instruction "categories":

.. code-block:: xml

   <bitset name="#instruction-cat2" extends="#instruction">
      <field name="DST" low="32" high="39" type="#reg-gpr"/>
      <field name="REPEAT" low="40" high="41" type="#rptN"/>
      <field name="SAT" pos="42" type="bool" display="(sat)"/>
      <field name="SS" pos="44" type="bool" display="(ss)"/>
      <field name="UL" pos="45" type="bool" display="(ul)"/>
      <field name="DST_CONV" pos="46" type="bool">
         <doc>
            Destination register is opposite precision as source, ie.
            if {FULL} is true then destination is half precision, and
            vice versa.
         </doc>
      </field>
      <derived name="DST_HALF" expr="#dest-half" type="bool" display="h"/>
      <field name="EI" pos="47" type="bool" display="(ei)"/>
      <field name="FULL" pos="52" type="bool">
         <doc>Full precision source registers</doc>
      </field>
      <field name="JP" pos="59" type="bool" display="(jp)"/>
      <field name="SY" pos="60" type="bool" display="(sy)"/>
      <pattern low="61" high="63">010</pattern>  <!-- cat2 -->
      <!--
         NOTE, both SRC1_R and SRC2_R are defined at this level because
         SRC2_R is still a valid bit for (nopN) (REPEAT==0) for cat2
         instructions with only a single src
       -->
      <field name="SRC1_R" pos="43" type="bool" display="(r)"/>
      <field name="SRC2_R" pos="51" type="bool" display="(r)"/>
      <derived name="ZERO" expr="#zero" type="bool" display=""/>
   </bitset>
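
As a rough illustration of the category check, the following C sketch tests
whether a 64b instruction word carries the cat2 pattern ``010`` in bits
61..63. The helper names ``field_mask`` and ``is_cat2`` are hypothetical and
not part of isaspec, and the generated decoder is not necessarily structured
this way:

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Build a mask covering bits low..high (inclusive) of a 64b word.
    * Hypothetical helper, assumes 0 <= low <= high <= 63. */
   static uint64_t
   field_mask(unsigned low, unsigned high)
   {
      unsigned width = high - low + 1;
      uint64_t bits = (width == 64) ? ~UINT64_C(0)
                                    : ((UINT64_C(1) << width) - 1);
      return bits << low;
   }

   /* <pattern low="61" high="63">010</pattern> means bit 62 is set and
    * bits 61 and 63 are clear.  (The pattern reads the same in either
    * bit order, so no ordering assumption is needed here.) */
   static bool
   is_cat2(uint64_t instr)
   {
      uint64_t mask  = field_mask(61, 63);
      uint64_t match = UINT64_C(0x2) << 61;   /* 0b010 placed at bit 61 */
      return (instr & mask) == match;
   }

Only the three pattern bits take part in the comparison, so the operand
bits below bit 61 can hold any value without affecting the result.
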

The ``<pattern>`` elements are the part(s) that determine which leaf-node
bitset matches against a given bit pattern. The leaf node's
match/mask/dontcare bitmasks are a combination of those defined at the
leaf node and, recursively, at each parent bitset.

For example, cat2 instructions (ALU instructions with up to two src
registers) can have either one or two source registers:

.. code-block:: xml

   <bitset name="#instruction-cat2-1src" extends="#instruction-cat2">
      <override expr="#cat2-cat3-nop-encoding">
         <display>
            {SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
         </display>
         <derived name="NOP" expr="#cat2-cat3-nop-value" type="uint"/>
         <field name="SRC1" low="0" high="15" type="#multisrc">
            <param name="ZERO" as="SRC_R"/>
            <param name="FULL"/>
         </field>
      </override>
      <display>
         {SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
      </display>
      <pattern low="16" high="31">xxxxxxxxxxxxxxxx</pattern>
      <pattern low="48" high="50">xxx</pattern>  <!-- COND -->
      <field name="SRC1" low="0" high="15" type="#multisrc">
         <param name="SRC1_R" as="SRC_R"/>
         <param name="FULL"/>
      </field>
   </bitset>

   <bitset name="absneg.f" extends="#instruction-cat2-1src">
      <pattern low="53" high="58">000110</pattern>
   </bitset>

In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of
the bitset inheritance tree) which has a single src register.
At the
``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg
and condition code (for cat2 instructions which use a condition code) are
defined as 'x' (dontcare). This matches our understanding of the hardware,
and also lets the disassembler flag cases where '1' bits show up in places
we don't expect, which may signal a new instruction (sub)encoding.

You'll notice that ``SRC1`` refers back to a different bitset hierarchy
that describes the various src register encodings (used for cat2 and
cat4 instructions), ie. GPR vs CONST vs relative GPR/CONST. For fields
which have bitset types, parameters can be "passed" in via ``<param>``
elements, which can be referred to by the display template string and/or
expressions. For example, this helps to deal with cases where other fields
outside of that bitset control the encoding/decoding, such as in the
``#multisrc`` example:

.. code-block:: xml

   <bitset name="#multisrc" size="16">
      <doc>
         Encoding for instruction source which can be GPR/CONST/IMMED
         or relative GPR/CONST.
      </doc>
   </bitset>

   ...

   <bitset name="#multisrc-gpr" extends="#multisrc">
      <display>
         {ABSNEG}{SRC_R}{HALF}{SRC}
      </display>
      <derived name="HALF" expr="#multisrc-half" type="bool" display="h"/>
      <field name="SRC" low="0" high="7" type="#reg-gpr"/>
      <pattern low="8" high="13">000000</pattern>
      <field name="ABSNEG" low="14" high="15" type="#absneg"/>
   </bitset>

At some level in the bitset inheritance hierarchy, there is expected to be a
``<display>`` element specifying a template string used during bitset
decoding. The display template consists of references to fields (which may
be derived fields) specified as ``{FIELDNAME}`` and other characters
which are just echoed through to the resulting decoded bitset.

It is possible to define a line column alignment value per field to influence
the visual output. It needs to be specified as ``{FIELDNAME:align=xx}``.

The ``<override>`` element will be described in the next section, but it
provides for both different decoded instruction syntax/mnemonics (when
simply providing a different display template string) as well as instruction
encoding where different ranges of bits have a different meaning based on
some other bitfield (or combination of bitfields). In this example it is
used to cover the cases where ``SRCn_R`` has a different meaning and a
different disassembly syntax depending on whether ``REPEAT`` equals zero.
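
To make the display template mechanism more concrete, here is a minimal C
sketch of ``{FIELDNAME}`` substitution. It is not the isaspec
implementation: ``struct decoded_field``, ``lookup_field`` and
``expand_display`` are hypothetical names, each field is assumed to have
already been formatted to text, and the ``{FIELDNAME:align=xx}`` form is not
handled:

.. code-block:: c

   #include <stddef.h>
   #include <string.h>

   /* Hypothetical decoded field: a name and its already-formatted text. */
   struct decoded_field {
      const char *name;
      const char *text;
   };

   /* Look up a field by name; the name is a span of 'len' characters. */
   static const char *
   lookup_field(const struct decoded_field *fields, size_t count,
                const char *name, size_t len)
   {
      for (size_t i = 0; i < count; i++) {
         if (strlen(fields[i].name) == len &&
             !strncmp(fields[i].name, name, len))
            return fields[i].text;
      }
      return "";   /* unknown fields expand to nothing in this sketch */
   }

   /* Expand {FIELDNAME} references and echo every other character.
    * Output is truncated to fit and always NUL terminated. */
   static void
   expand_display(const char *tmpl, const struct decoded_field *fields,
                  size_t count, char *out, size_t out_size)
   {
      size_t used = 0;

      if (out_size == 0)
         return;

      while (*tmpl && used + 1 < out_size) {
         const char *close = (*tmpl == '{') ? strchr(tmpl, '}') : NULL;

         if (close) {
            const char *text = lookup_field(fields, count, tmpl + 1,
                                            (size_t)(close - tmpl - 1));
            size_t room = out_size - used - 1;
            size_t n = strlen(text);

            if (n > room)
               n = room;
            memcpy(out + used, text, n);
            used += n;
            tmpl = close + 1;
         } else {
            out[used++] = *tmpl++;
         }
      }
      out[used] = '\0';
   }

With fields ``SS`` = ``(ss)``, ``NAME`` = ``absneg.f``, ``DST_HALF`` = ``h``,
``DST`` = ``r0.x``, ``SRC1`` = ``r1.y`` and an empty ``SY``, the template
``{SY}{SS}{NAME} {DST_HALF}{DST}, {SRC1}`` expands to
``(ss)absneg.f hr0.x, r1.y``.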

Overrides
---------

In many cases, a bitset is not convenient for describing the expected
disasm syntax, and/or interpretation of some range of bits differs based
on some other field or combination of fields. These *could* be modeled
as different derived bitsets, at the expense of a combinatorial explosion
of the size of the bitset inheritance tree. For example, *every* cat2
(and cat3) instruction has both a ``(nopN)`` interpretation in addition to
the ``(rptN)`` interpretation.

An ``<override>`` in a bitset allows redefining the display string and/or
field definitions from the default case. If the override's expr(ession)
evaluates to non-zero, its ``<display>``, ``<field>``, and ``<derived>``
elements take precedence over what is defined in the toplevel of the
bitset (ie. the default case).

Expressions
-----------

Both ``<override>`` and ``<derived>`` fields make use of ``<expr>`` elements,
either defined inline, or defined and named at the top level and referred to
by name in multiple other places. An expression is a simple 'C' expression
which can reference fields (including other derived fields) with the same
``{FIELDNAME}`` syntax as display template strings. For example:

.. code-block:: xml

   <expr name="#cat2-cat3-nop-encoding">
      (({SRC1_R} != 0) || ({SRC2_R} != 0)) && ({REPEAT} == 0)
   </expr>

In the case of ``<override>`` elements, the override applies if the expression
evaluates to non-zero.
In the case of ``<derived>`` fields, the expression
evaluates to the value of the derived field.

Encoding
--------

To facilitate instruction encoding, ``<encode>`` elements can be provided
to teach the generated instruction packing code how to map from data structures
representing the IR to fields. For example:

.. code-block:: xml

   <bitset name="#instruction" size="64">
      <doc>
         Encoding of an ir3 instruction.  All instructions are 64b.
      </doc>
      <gen min="300"/>
      <encode type="struct ir3_instruction *" case-prefix="OPC_">
         <!--
            Define mapping from encode src to individual fields,
            which are common across all instruction categories
            at the root instruction level

            Not all of these apply to all instructions, but we
            can define mappings here for anything that is used
            in more than one instruction category.  For things
            that are specific to a single instruction category,
            mappings should be defined at that level instead.
         -->
         <map name="DST">src->regs[0]</map>
         <map name="SRC1">src->regs[1]</map>
         <map name="SRC2">src->regs[2]</map>
         <map name="SRC3">src->regs[3]</map>
         <map name="REPEAT">src->repeat</map>
         <map name="SS">!!(src->flags &amp; IR3_INSTR_SS)</map>
         <map name="JP">!!(src->flags &amp; IR3_INSTR_JP)</map>
         <map name="SY">!!(src->flags &amp; IR3_INSTR_SY)</map>
         <map name="UL">!!(src->flags &amp; IR3_INSTR_UL)</map>
         <map name="EQ">0</map>  <!-- We don't use this (yet) -->
         <map name="SAT">!!(src->flags &amp; IR3_INSTR_SAT)</map>
      </encode>
   </bitset>

The ``type`` attribute specifies that the input to encoding an instruction
is a ``struct ir3_instruction *``. In the case of bitset hierarchies with
multiple possible leaf nodes, a ``case-prefix`` attribute should be supplied
along with a function that maps the bitset encode source to an enum value
consisting of the specified prefix prepended to the uppercased leaf node
name. Ie. in this case, "add.f" becomes ``OPC_ADD_F``.

Individual ``<map>`` elements teach the encoder how to map from the encode
source to fields in the encoded instruction.
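
The following C sketch shows roughly what packing mapped values into an
instruction word involves. It is an illustration only, not the code
isaspec generates: ``struct example_instr``, the ``EXAMPLE_INSTR_*`` flags,
``pack_field`` and ``encode_cat2_common`` are hypothetical stand-ins, and
only the ``REPEAT``, ``SS`` and ``SY`` fields plus the cat2 pattern from the
earlier ``#instruction-cat2`` example are handled:

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical, simplified stand-in for the encode source. */
   struct example_instr {
      unsigned repeat;
      unsigned flags;
   };

   #define EXAMPLE_INSTR_SS (1u << 0)
   #define EXAMPLE_INSTR_SY (1u << 1)

   /* Place 'value' into bits low..high (inclusive) of a 64b word,
    * discarding any value bits that do not fit in the field. */
   static uint64_t
   pack_field(uint64_t word, unsigned low, unsigned high, uint64_t value)
   {
      unsigned width = high - low + 1;
      uint64_t mask = (width == 64) ? ~UINT64_C(0)
                                    : ((UINT64_C(1) << width) - 1);
      return (word & ~(mask << low)) | ((value & mask) << low);
   }

   /* Evaluate each mapping and store the result at the field's
    * position, then fill in the fixed <pattern> bits. */
   static uint64_t
   encode_cat2_common(const struct example_instr *src)
   {
      uint64_t word = 0;

      word = pack_field(word, 40, 41, src->repeat);                        /* REPEAT */
      word = pack_field(word, 44, 44, !!(src->flags & EXAMPLE_INSTR_SS));  /* SS */
      word = pack_field(word, 60, 60, !!(src->flags & EXAMPLE_INSTR_SY));  /* SY */
      word = pack_field(word, 61, 63, 0x2);   /* <pattern> 010: cat2 */
      return word;
   }

Each ``pack_field`` call corresponds to one ``<map>`` element: the C
expression from the map body supplies the value, and the ``low``/``high``
(or ``pos``) attributes of the matching ``<field>`` supply the position.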