NIR ALU Instructions
====================

ALU instructions represent simple operations, such as addition, multiplication,
comparison, etc., that take a certain number of arguments and return a result
that only depends on the arguments.  ALU instructions in NIR must be pure in
the sense that they have no side effects and that identical inputs yield
identical outputs.  A good rule of thumb is that only things which can be
constant folded should be ALU operations.  If it can't be constant folded, then
it should probably be an intrinsic instead.
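
As a rough illustration of this rule of thumb, the following standalone sketch
(not NIR code; ``toy_fold_iadd`` is a hypothetical name) shows a pure operation
whose result depends only on its arguments, so a compiler could evaluate it
ahead of time whenever both inputs are known constants:

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical stand-in for a pure ALU op: 32-bit wrapping integer add.
    * It reads no global state and has no side effects, so a call with
    * constant arguments can be replaced by its result. */
   static uint32_t
   toy_fold_iadd(uint32_t a, uint32_t b)
   {
      return a + b; /* unsigned arithmetic wraps modulo 2^32 */
   }

An operation that read memory or some other mutable state could not be
evaluated this way, which is why it would not fit the ALU model.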

Each ALU instruction has an opcode, which is a member of the :cpp:enum:`nir_op`
enum, that describes what it does as well as how many arguments it takes.
Associated with each opcode is a metadata structure,
:cpp:struct:`nir_op_info`, which records how many arguments the opcode takes,
information about data types, and algebraic properties such as associativity
and commutativity.  The info structure for each opcode may be accessed through
a global :cpp:var:`nir_op_infos` array that's indexed by the opcode.
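
The opcode-indexed table can be pictured with a small standalone model.  This
is only a sketch: every ``toy_``/``TOY_`` name below is hypothetical and merely
mirrors the idea of :cpp:enum:`nir_op`, :cpp:struct:`nir_op_info`, and
:cpp:var:`nir_op_infos`; the real structure is documented further down this
page.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Toy opcode enum (hypothetical, not the real nir_op). */
   typedef enum {
      TOY_OP_IADD,
      TOY_OP_ISUB,
      TOY_OP_INEG,
      TOY_NUM_OPS,
   } toy_op;

   /* Toy algebraic property flags (hypothetical). */
   enum {
      TOY_OP_IS_COMMUTATIVE = 1 << 0,
      TOY_OP_IS_ASSOCIATIVE = 1 << 1,
   };

   /* Toy per-opcode metadata (hypothetical, far smaller than nir_op_info). */
   typedef struct {
      const char *name;
      uint8_t num_inputs;
      uint8_t algebraic_properties; /* bitmask of TOY_OP_IS_* */
   } toy_op_info;

   /* Global table indexed by opcode. */
   static const toy_op_info toy_op_infos[TOY_NUM_OPS] = {
      [TOY_OP_IADD] = { "iadd", 2, TOY_OP_IS_COMMUTATIVE | TOY_OP_IS_ASSOCIATIVE },
      [TOY_OP_ISUB] = { "isub", 2, 0 },
      [TOY_OP_INEG] = { "ineg", 1, 0 },
   };

   /* Query a property by looking up the opcode's entry in the table. */
   static bool
   toy_op_is_commutative(toy_op op)
   {
      return (toy_op_infos[op].algebraic_properties & TOY_OP_IS_COMMUTATIVE) != 0;
   }

Keeping the metadata in one table lets generic passes ask questions such as
"how many sources does this opcode take?" without a switch over every opcode.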

ALU operations are typeless, meaning that they're only defined to convert
a certain bit-pattern input to another bit-pattern output.  The only concrete
notion of types for a NIR SSA value or register is that each value has a number
of vector components and a bit-size.  How that data is interpreted is entirely
controlled by the opcode.  NIR doesn't have opcodes for ``intBitsToFloat()``
and friends because they are implicit.
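
To make the "typeless" idea concrete, here is a standalone sketch in which a
value is just 32 bits and the operation decides how to read them.  The helper
names are hypothetical, and the sketch assumes that ``float`` is an IEEE 754
binary32 value.

.. code-block:: c

   #include <stdint.h>
   #include <string.h>

   /* Integer add: treats both 32-bit patterns as integers. */
   static uint32_t
   toy_iadd(uint32_t a, uint32_t b)
   {
      return a + b;
   }

   /* Float add: treats the very same kind of 32-bit patterns as floats.
    * memcpy reinterprets the bits without any numeric conversion. */
   static uint32_t
   toy_fadd(uint32_t a, uint32_t b)
   {
      float fa, fb, fr;
      uint32_t r;

      memcpy(&fa, &a, sizeof(fa));
      memcpy(&fb, &b, sizeof(fb));
      fr = fa + fb;
      memcpy(&r, &fr, sizeof(r));
      return r;
   }

Feeding the bit pattern ``0x3f800000`` (``1.0f``) to both helpers gives
different results, because only the opcode determines the interpretation; no
explicit bit-cast step is needed in between.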

Even though ALU operations are typeless, each opcode also has "ALU type"
metadata for each of the sources and the destination, which can be
floating-point, boolean, integer, or unsigned integer.  The ALU type mainly
helps back-ends which want to handle all conversion instructions, for instance,
in a single switch case.  ALU types are also important when a back-end requests
the absolute value, negate, and saturate modifiers (not used by core NIR).  In
that case, modifiers are interpreted with respect to the ALU type on the source
or destination of the instruction.  In addition, if an operation takes a
boolean argument, then the argument may be assumed to be either ``0`` for false
or ``~0`` (a.k.a. ``-1``) for true even if it is not a 1-bit value.  If an
operation's result has a boolean type, then it may produce only ``0`` or
``~0``.
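
The ``0``/``~0`` convention for wider booleans can be sketched as follows.
This is a standalone illustration with hypothetical helper names, not NIR's
implementation: because a true value has every bit set, a select can be
written as plain bit masking.

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical 32-bit signed "less than": 0 for false, ~0 for true. */
   static uint32_t
   toy_ilt32(int32_t a, int32_t b)
   {
      return a < b ? UINT32_MAX : 0; /* UINT32_MAX has all bits set (~0) */
   }

   /* Hypothetical select.  It relies on cond being exactly 0 or ~0,
    * which is what the boolean ALU type guarantees. */
   static uint32_t
   toy_bcsel32(uint32_t cond, uint32_t if_true, uint32_t if_false)
   {
      return (cond & if_true) | (~cond & if_false);
   }

If a producer were allowed to return ``1`` for true instead, the masking in
``toy_bcsel32`` would pick up a single bit of ``if_true``, which is why the
result of a boolean-typed operation is restricted to the two values.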

Most of the common ALU ops in NIR operate per-component, meaning that the
operation is defined by what it does on a single scalar value and, when
performed on vectors, it performs the same operation on each component.  Things
like add, multiply, etc. fall into this category.  Per-component operations
naturally scale to as many components as necessary.  Non-per-component ALU ops
are things like :nir:alu-op:`vec4` or :nir:alu-op:`pack_64_2x32` where any
given component in the result value may be a combination of any component in
any source.  For these ops, the number of destination components and the number
of components required by each source are fixed by the opcode.
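
The difference between the two kinds of ops can be sketched in standalone C.
The helper names are hypothetical, and the packing helper assumes that the
first source component lands in the low 32 bits of the result.

.. code-block:: c

   #include <stdint.h>

   /* Per-component: the same scalar add is applied to each of the n
    * components, so the op works for any vector width. */
   static void
   toy_iadd_vec(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                unsigned n)
   {
      for (unsigned i = 0; i < n; i++)
         dst[i] = a[i] + b[i];
   }

   /* Not per-component: always consumes exactly two 32-bit components
    * and produces exactly one 64-bit component that combines both. */
   static uint64_t
   toy_pack_64_2x32(const uint32_t src[2])
   {
      return (uint64_t)src[0] | ((uint64_t)src[1] << 32);
   }

Note that ``toy_iadd_vec`` takes the width as a parameter, while the widths of
``toy_pack_64_2x32`` are baked into the operation itself.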

While most instruction types in NIR require vector sizes to perfectly match on
inputs and outputs, ALU instruction sources have an additional
:cpp:member:`nir_alu_src::swizzle` field which allows them to act on vectors
which are not the native vector size of the instruction.  This is ideal for
hardware with a native data type of :c:expr:`vec4` but also means that ALU
instructions are often used (and required) for packing/unpacking vectors for
use in other instruction types like intrinsics or texture ops.
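
A swizzle can be thought of as a small index table that selects, for each
component the instruction consumes, which component of the source vector to
read.  The sketch below is a standalone model with a hypothetical helper name;
it is not how NIR itself evaluates swizzles.

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical helper: component i of the result reads component
    * swizzle[i] of the source vector.  The source may be wider than
    * the result, e.g. a 2-component read out of a 4-component value. */
   static void
   toy_apply_swizzle(uint32_t *dst, const uint32_t *src,
                     const uint8_t *swizzle, unsigned num_dst_components)
   {
      for (unsigned i = 0; i < num_dst_components; i++)
         dst[i] = src[swizzle[i]];
   }

For example, a swizzle of ``{3, 1}`` applied to a four-component source reads
its ``w`` and ``y`` components, in that order, as a two-component operand.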

.. doxygenstruct:: nir_op_info
   :members:

.. doxygenvariable:: nir_op_infos

.. doxygenstruct:: nir_alu_instr
   :members:

.. doxygenstruct:: nir_alu_src
   :members:

.. doxygenstruct:: nir_alu_dest
   :members:

NIR ALU Opcode Reference:
-------------------------

.. nir:alu-opcodes::