ISASPEC - XML Based ISA Specification
=====================================

isaspec provides a mechanism to describe an instruction set in XML, and
to generate a disassembler and (eventually) an assembler from that
description.  The intention is to describe the instruction set more
formally than a hand-coded assembler and disassembler, and to better
decouple the shader compiler from the underlying instruction encoding,
which simplifies dealing with instruction encoding differences between
generations of GPU.

Benefits of a formal ISA description, compared to hand-coded assemblers
and disassemblers, include easier detection of new bit combinations that
were not seen in previous generations, thanks to a more rigorous
description of bits that are expected to be '0', '1' or 'x' (dontcare),
and verification that different encodings don't have conflicting bits
(i.e. that the specification cannot result in more than one valid
interpretation of any bit pattern).
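
The conflict check mentioned above can be illustrated with a small sketch.
It is not isaspec's actual implementation: ``struct encoding`` and
``encodings_conflict()`` are hypothetical names, and the sketch assumes
each encoding is reduced to a mask of fixed bits plus their required values.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical encoding description: 'mask' has a 1 for every bit that
    * is fixed to '0' or '1', and 'match' holds the required value of those
    * bits.  Bits outside 'mask' are operands or dontcare.
    */
   struct encoding {
      uint64_t mask;
      uint64_t match;
   };

   /* Two encodings are ambiguous if some bit pattern satisfies both, which
    * happens exactly when they agree on every bit that both of them fix.
    */
   static bool
   encodings_conflict(const struct encoding *a, const struct encoding *b)
   {
      uint64_t common = a->mask & b->mask;

      return ((a->match ^ b->match) & common) == 0;
   }

Two encodings that differ in at least one commonly fixed bit can never match
the same instruction word, so a specification passes this check when no pair
of leaf encodings conflicts.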

The isaspec tool and XML schema are intended to be generic (not specific
to ir3), although there are currently a couple of limitations due to
shortcuts taken to get things up and running.  These are mostly not
inherent to the XML schema, and should not be too difficult to remove
from the Python scripts and the decode/disasm utility:

* Maximum "field" size is 64b
* Fixed instruction size

Often, especially when new functionality is added in later generations
while retaining (or at least mostly retaining) backwards compatibility
with encodings used in earlier generations, the actual encoding can be
rather messy to describe.  To support this, isaspec provides several
flexible mechanisms, such as conditional overrides and derived fields.
This not only allows for describing an irregular instruction encoding,
but also allows matching an existing disasm syntax (which might not have
been designed around the idea of disassembly based on a formal ISA
description).

Bitsets
-------

The fundamental concept behind matching a bit pattern to an instruction
decoding/encoding is a hierarchical tree of bitsets.  This is intended to
match how the hw decodes instructions, where certain bits describe the
instruction (and sub-encoding, and so on), and other bits describe
various operands to the instruction.

Bitsets can also be used recursively as the type of a field described
in another bitset.

The leaves of the tree of instruction bitsets represent every possible
instruction.  Deciding which instruction a bit pattern is amounts to:

.. code-block:: c

   /* drop the dontcare ('x') bits before comparing */
   m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare;

   if (m == bitsets[n]->match) {
      /* we've found the instruction description */
   }
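
A self-contained version of that test, wrapped in a lookup loop, might look
like the following.  This is only a sketch: ``struct bitset_desc`` and
``find_bitset()`` are hypothetical, and ``mask`` is assumed to cover every
bit named by a pattern (fixed or dontcare).

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical leaf-bitset description.  The member names follow the
    * snippet above; the struct layout itself is an assumption.
    */
   struct bitset_desc {
      const char *name;
      uint64_t match;     /* required values of the fixed bits */
      uint64_t dontcare;  /* bits declared as 'x' */
      uint64_t mask;      /* assumed: all bits covered by a pattern */
   };

   /* Return the first leaf whose fixed bits match 'val', or NULL. */
   static const struct bitset_desc *
   find_bitset(const struct bitset_desc *bitsets, size_t count, uint64_t val)
   {
      for (size_t n = 0; n < count; n++) {
         uint64_t m = (val & bitsets[n].mask) & ~bitsets[n].dontcare;

         if (m == bitsets[n].match)
            return &bitsets[n];
      }
      return NULL;
   }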

For example, the starting point to decode an ir3 instruction is a 64b
bitset:

.. code-block:: xml

   <bitset name="#instruction" size="64">
   	<doc>
   		Encoding of an ir3 instruction.  All instructions are 64b.
   	</doc>
   </bitset>

At the first level of the instruction encoding hierarchy, the high three
bits group things into instruction "categories":

.. code-block:: xml

   <bitset name="#instruction-cat2" extends="#instruction">
   	<field name="DST" low="32" high="39" type="#reg-gpr"/>
   	<field name="REPEAT" low="40" high="41" type="#rptN"/>
   	<field name="SAT" pos="42" type="bool" display="(sat)"/>
   	<field name="SS" pos="44" type="bool" display="(ss)"/>
   	<field name="UL" pos="45" type="bool" display="(ul)"/>
   	<field name="DST_CONV" pos="46" type="bool">
   		<doc>
   			Destination register is opposite precision as source, ie.
   			if {FULL} is true then destination is half precision, and
   			vice versa.
   		</doc>
   	</field>
   	<derived name="DST_HALF" expr="#dest-half" type="bool" display="h"/>
   	<field name="EI" pos="47" type="bool" display="(ei)"/>
   	<field name="FULL" pos="52" type="bool">
   		<doc>Full precision source registers</doc>
   	</field>
   	<field name="JP" pos="59" type="bool" display="(jp)"/>
   	<field name="SY" pos="60" type="bool" display="(sy)"/>
   	<pattern low="61" high="63">010</pattern>  <!-- cat2 -->
   	<!--
   		NOTE, both SRC1_R and SRC2_R are defined at this level because
   		SRC2_R is still a valid bit for (nopN) (REPEAT==0) for cat2
   		instructions with only a single src
   	 -->
   	<field name="SRC1_R" pos="43" type="bool" display="(r)"/>
   	<field name="SRC2_R" pos="51" type="bool" display="(r)"/>
   	<derived name="ZERO" expr="#zero" type="bool" display=""/>
   </bitset>

The ``<pattern>`` elements are the part(s) that determine which leaf-node
bitset matches against a given bit pattern.  The leaf node's
match/mask/dontcare bitmasks are a combination of those defined at the
leaf node and, recursively, at each parent bitset.
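
As a rough illustration of how such bitmasks could be accumulated, the
sketch below folds ``<pattern>`` elements into one set of masks.  The names
``struct pattern_bits`` and ``add_pattern()`` are hypothetical, and the
pattern text is assumed to be written most-significant bit first.

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical accumulator for the bits contributed by patterns. */
   struct pattern_bits {
      uint64_t match;     /* bits that must be '1' */
      uint64_t dontcare;  /* bits declared as 'x' */
      uint64_t mask;      /* every bit named by some pattern */
   };

   /* Fold one <pattern low=".." high="..">bits</pattern> into 'p'.  The
    * first character of 'bits' lands on bit 'high'.  Characters other than
    * '0', '1' and 'x' are treated like '0' in this sketch.
    */
   static void
   add_pattern(struct pattern_bits *p, unsigned low, unsigned high,
               const char *bits)
   {
      for (unsigned i = 0; low + i <= high && bits[i] != '\0'; i++) {
         uint64_t bit = UINT64_C(1) << (high - i);

         p->mask |= bit;
         if (bits[i] == '1')
            p->match |= bit;
         else if (bits[i] == 'x')
            p->dontcare |= bit;
      }
   }

Calling ``add_pattern()`` on the same accumulator for a leaf node and for
each of its parents gives the combined masks, since each level only ORs in
the bits it defines.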

For example, cat2 instructions (ALU instructions with up to two src
registers) can have either one or two source registers:

.. code-block:: xml

   <bitset name="#instruction-cat2-1src" extends="#instruction-cat2">
   	<override expr="#cat2-cat3-nop-encoding">
   		<display>
   			{SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
   		</display>
   		<derived name="NOP" expr="#cat2-cat3-nop-value" type="uint"/>
   		<field name="SRC1" low="0" high="15" type="#multisrc">
   			<param name="ZERO" as="SRC_R"/>
   			<param name="FULL"/>
   		</field>
   	</override>
   	<display>
   		{SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
   	</display>
   	<pattern low="16" high="31">xxxxxxxxxxxxxxxx</pattern>
   	<pattern low="48" high="50">xxx</pattern>  <!-- COND -->
   	<field name="SRC1" low="0" high="15" type="#multisrc">
   		<param name="SRC1_R" as="SRC_R"/>
   		<param name="FULL"/>
   	</field>
   </bitset>

   <bitset name="absneg.f" extends="#instruction-cat2-1src">
   	<pattern low="53" high="58">000110</pattern>
   </bitset>

In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of
the bitset inheritance tree) which has a single src register.  At the
``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg
and condition code (for cat2 instructions which use a condition code) are
defined as 'x' (dontcare), which matches our understanding of the hardware
(but also lets the disassembler flag cases where '1' bits show up in places
we don't expect, which may signal a new instruction (sub)encoding).

You'll notice that ``SRC1`` refers back to a different bitset hierarchy
that describes the various src register encodings (used for cat2 and
cat4 instructions), i.e. GPR vs CONST vs relative GPR/CONST.  For fields
which have bitset types, parameters can be "passed" in via ``<param>``
elements, which can be referred to by the display template string, and/or
expressions.  For example, this helps to deal with cases where other fields
outside of that bitset control the encoding/decoding, such as in the
``#multisrc`` example:

.. code-block:: xml

   <bitset name="#multisrc" size="16">
   	<doc>
   		Encoding for instruction source which can be GPR/CONST/IMMED
   		or relative GPR/CONST.
   	</doc>
   </bitset>

   ...

   <bitset name="#multisrc-gpr" extends="#multisrc">
   	<display>
   		{ABSNEG}{SRC_R}{HALF}{SRC}
   	</display>
   	<derived name="HALF" expr="#multisrc-half" type="bool" display="h"/>
   	<field name="SRC" low="0" high="7" type="#reg-gpr"/>
   	<pattern low="8" high="13">000000</pattern>
   	<field name="ABSNEG" low="14" high="15" type="#absneg"/>
   </bitset>

At some level in the bitset inheritance hierarchy, there is expected to be a
``<display>`` element specifying a template string used during bitset
decoding.  The display template consists of references to fields (which may
be derived fields) specified as ``{FIELDNAME}`` and other characters
which are just echoed through to the resulting decoded bitset.
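
The template expansion described above can be sketched as follows.  This is
not the generated decoder: ``struct field_text``, ``lookup_field()`` and
``expand_display()`` are hypothetical, fields are assumed to be decoded to
text already, and ``{FIELDNAME:align=xx}`` is not handled.

.. code-block:: c

   #include <string.h>

   /* Hypothetical table entry: a field name and its already-decoded text. */
   struct field_text {
      const char *name;
      const char *text;
   };

   /* Look up the text for the field named by the first 'len' characters of
    * 'name'.  Unknown fields expand to nothing in this sketch.
    */
   static const char *
   lookup_field(const struct field_text *fields, size_t count,
                const char *name, size_t len)
   {
      for (size_t i = 0; i < count; i++) {
         if (strlen(fields[i].name) == len &&
             strncmp(fields[i].name, name, len) == 0)
            return fields[i].text;
      }
      return "";
   }

   /* Expand 'tmpl' into 'out': {FIELDNAME} references are replaced by the
    * field's text, everything else is echoed through unchanged.  Output is
    * truncated to fit 'out_size' and always NUL terminated.
    */
   static void
   expand_display(const char *tmpl, const struct field_text *fields,
                  size_t count, char *out, size_t out_size)
   {
      size_t len = 0;

      if (out_size == 0)
         return;

      while (*tmpl != '\0' && len + 1 < out_size) {
         const char *close = (*tmpl == '{') ? strchr(tmpl, '}') : NULL;

         if (close == NULL) {
            out[len++] = *tmpl++;   /* literal character */
            continue;
         }

         const char *text = lookup_field(fields, count, tmpl + 1,
                                         (size_t)(close - tmpl - 1));
         while (*text != '\0' && len + 1 < out_size)
            out[len++] = *text++;
         tmpl = close + 1;
      }
      out[len] = '\0';
   }

With made-up field values ``SY`` = ``(sy)``, ``NAME`` = ``absneg.f``,
``DST`` = ``r0.x`` and ``SRC1`` = ``r1.y``, and all other fields empty, the
cat2 single-source template expands to ``(sy)absneg.f r0.x, r1.y``.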

It is possible to define a line column alignment value per field to influence
the visual output.  It needs to be specified as ``{FIELDNAME:align=xx}``.

The ``<override>`` element will be described in the next section, but it
provides for both different decoded instruction syntax/mnemonics (when
simply providing a different display template string) as well as instruction
encoding where different ranges of bits have a different meaning based on
some other bitfield (or combination of bitfields).  In this example it is
used to cover the cases where ``SRCn_R`` has a different meaning and a
different disassembly syntax depending on whether ``REPEAT`` equals zero.

Overrides
---------

In many cases, a bitset is not convenient for describing the expected
disasm syntax, and/or the interpretation of some range of bits differs
based on some other field or combination of fields.  These *could* be
modeled as different derived bitsets, at the expense of a combinatorial
explosion of the size of the bitset inheritance tree.  For example, *every*
cat2 (and cat3) instruction has a ``(nopN)`` interpretation in addition to
the ``(rptN)`` interpretation.

An ``<override>`` in a bitset allows redefining the display string and/or
field definitions from the default case.  If the override's expr(ession)
evaluates to non-zero, its ``<display>``, ``<field>``, and ``<derived>``
elements take precedence over what is defined at the top level of the
bitset (i.e. the default case).

Expressions
-----------

Both ``<override>`` elements and ``<derived>`` fields make use of ``<expr>``
elements, either defined inline, or defined and named at the top level and
referred to by name in multiple other places.  An expression is a simple 'C'
expression which can reference fields (including other derived fields) with
the same ``{FIELDNAME}`` syntax as display template strings.  For example:

.. code-block:: xml

   <expr name="#cat2-cat3-nop-encoding">
   	(({SRC1_R} != 0) || ({SRC2_R} != 0)) &amp;&amp; ({REPEAT} == 0)
   </expr>

In the case of ``<override>`` elements, the override applies if the expression
evaluates to non-zero.  In the case of ``<derived>`` fields, the expression
evaluates to the value of the derived field.
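
To make this concrete, here is a hand-written C sketch of what evaluating
``#cat2-cat3-nop-encoding`` and selecting the override's display template
could look like.  The function names are hypothetical and the generated
code may be structured quite differently; the field positions are taken
from the ``#instruction-cat2`` example.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Extract bits low..high (inclusive) of a 64b instruction word. */
   static uint64_t
   extract_field(uint64_t instr, unsigned low, unsigned high)
   {
      unsigned width = high - low + 1;
      uint64_t mask = (width == 64) ? ~UINT64_C(0)
                                    : (UINT64_C(1) << width) - 1;

      return (instr >> low) & mask;
   }

   /* C equivalent of the #cat2-cat3-nop-encoding expression. */
   static bool
   cat2_cat3_nop_encoding(uint64_t instr)
   {
      uint64_t src1_r = extract_field(instr, 43, 43);  /* SRC1_R */
      uint64_t src2_r = extract_field(instr, 51, 51);  /* SRC2_R */
      uint64_t repeat = extract_field(instr, 40, 41);  /* REPEAT */

      return ((src1_r != 0) || (src2_r != 0)) && (repeat == 0);
   }

   /* The override's <display> wins when its expression is non-zero,
    * otherwise the bitset's default <display> is used.
    */
   static const char *
   cat2_1src_display(uint64_t instr)
   {
      if (cat2_cat3_nop_encoding(instr))
         return "{SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}";

      return "{SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}";
   }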

Encoding
--------

To facilitate instruction encoding, ``<encode>`` elements can be provided
to teach the generated instruction packing code how to map from data structures
representing the IR to fields.  For example:

.. code-block:: xml

   <bitset name="#instruction" size="64">
   	<doc>
   		Encoding of an ir3 instruction.  All instructions are 64b.
   	</doc>
   	<gen min="300"/>
   	<encode type="struct ir3_instruction *" case-prefix="OPC_">
   		<!--
   			Define mapping from encode src to individual fields,
   			which are common across all instruction categories
   			at the root instruction level

   			Not all of these apply to all instructions, but we
   			can define mappings here for anything that is used
   			in more than one instruction category.  For things
   			that are specific to a single instruction category,
   			mappings should be defined at that level instead.
   		 -->
   		<map name="DST">src->regs[0]</map>
   		<map name="SRC1">src->regs[1]</map>
   		<map name="SRC2">src->regs[2]</map>
   		<map name="SRC3">src->regs[3]</map>
   		<map name="REPEAT">src->repeat</map>
   		<map name="SS">!!(src->flags &amp; IR3_INSTR_SS)</map>
   		<map name="JP">!!(src->flags &amp; IR3_INSTR_JP)</map>
   		<map name="SY">!!(src->flags &amp; IR3_INSTR_SY)</map>
   		<map name="UL">!!(src->flags &amp; IR3_INSTR_UL)</map>
   		<map name="EQ">0</map>  <!-- We don't use this (yet) -->
   		<map name="SAT">!!(src->flags &amp; IR3_INSTR_SAT)</map>
   	</encode>
   </bitset>

The ``type`` attribute specifies that the input to encoding an instruction
is a ``struct ir3_instruction *``.  In the case of bitset hierarchies with
multiple possible leaf nodes, a ``case-prefix`` attribute should be supplied
along with a function that maps the bitset encode source to an enum value
with the specified prefix prepended to the uppercased leaf node name.  I.e.
in this case, "add.f" becomes ``OPC_ADD_F``.
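
The name mapping can be illustrated with a short sketch.  The helper
``leaf_enum_name()`` is hypothetical, and replacing every non-alphanumeric
character with ``_`` is an assumption inferred from the "add.f" example
rather than a documented rule.

.. code-block:: c

   #include <ctype.h>
   #include <stddef.h>

   /* Build "<prefix><LEAF_NAME>" into 'out'.  Letters are uppercased and
    * (by assumption) any other non-alphanumeric character becomes '_'.
    * Output is truncated to fit 'out_size' and always NUL terminated.
    */
   static void
   leaf_enum_name(const char *prefix, const char *leaf,
                  char *out, size_t out_size)
   {
      size_t len = 0;

      if (out_size == 0)
         return;

      for (; *prefix != '\0' && len + 1 < out_size; prefix++)
         out[len++] = *prefix;

      for (; *leaf != '\0' && len + 1 < out_size; leaf++) {
         unsigned char c = (unsigned char)*leaf;

         out[len++] = isalnum(c) ? (char)toupper(c) : '_';
      }
      out[len] = '\0';
   }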

Individual ``<map>`` elements teach the encoder how to map from the encode
source to fields in the encoded instruction.
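
Conceptually, each ``<map>`` supplies the value that gets packed into the
corresponding field's bit range.  The sketch below shows that idea by hand
for a few cat2 fields.  It is not the generated packing code:
``struct example_instr``, the ``EXAMPLE_INSTR_*`` flags, ``pack_field()``
and ``encode_common()`` are hypothetical stand-ins, not the real ir3 types.

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical stand-in for the encode source; this is not the real
    * struct ir3_instruction.
    */
   struct example_instr {
      unsigned repeat;
      unsigned flags;
   };

   #define EXAMPLE_INSTR_SS  (1u << 0)
   #define EXAMPLE_INSTR_SAT (1u << 1)

   /* OR 'val' into bits low..high (inclusive) of 'word'. */
   static uint64_t
   pack_field(uint64_t word, unsigned low, unsigned high, uint64_t val)
   {
      unsigned width = high - low + 1;
      uint64_t mask = (width == 64) ? ~UINT64_C(0)
                                    : (UINT64_C(1) << width) - 1;

      return word | ((val & mask) << low);
   }

   /* Pack a few fields using the cat2 bit positions from the examples. */
   static uint64_t
   encode_common(const struct example_instr *src)
   {
      uint64_t word = 0;

      word = pack_field(word, 40, 41, src->repeat);                         /* REPEAT */
      word = pack_field(word, 42, 42, !!(src->flags & EXAMPLE_INSTR_SAT));  /* SAT */
      word = pack_field(word, 44, 44, !!(src->flags & EXAMPLE_INSTR_SS));   /* SS */
      word = pack_field(word, 61, 63, 2);                                   /* cat2: 010 */

      return word;
   }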