This provides some background on the design of the generated headers.  We
started out trying to generate bit fields, but it evolved into the pack
functions because of a few limitations:

  1) Bit fields still generate terrible code today. Even with modern
     optimizing compilers you get multiple load+mask+store operations
     to the same dword in memory as you set individual bits. The
     compiler also has to generate code to mask out overflowing values
     (for example, if you assign 200 to a 2 bit field). Our driver
     never writes overflowing values so that's not needed. On the
     other hand, most compilers recognize that the template struct we
     use is a temporary variable and copy propagate the individual
     fields and do amazing constant folding.  You should take a look
     at the code that gets generated when you compile in release mode
     with optimizations.

  2) For some types we need to have overlapping bit fields. For
     example, some values are 64 byte aligned 32 bit offsets. The
     lower 5 bits of the offset are always zero, so the hw packs in a
     few misc bits in the lower 5 bits there. Other times a field can
     be either a u32 or a float. I tried to do this with overlapping
     anonymous unions and it became a big mess. Also, when using
     initializers, you can only initialize one union member so this
     just doesn't work with our approach.

     The pack functions, on the other hand, allow us a great deal of
     flexibility in how we combine things. In the case of overlapping
     fields (the u32 and float case), if we only set one of them in
     the pack function, the compiler will recognize that the other is
     initialized to 0 and optimize out the code to or it in.

  3) Bit fields (and certainly overlapping anonymous unions of bit
     fields) aren't generally stable across compilers in how they're
     laid out and aligned. Our pack functions let us control exactly
     how things get packed, using only simple and unambiguous bitwise
     shifting and or'ing that works on any compiler.
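
As a rough illustration of points 2 and 3, here is a minimal sketch of a
pack function in C. The struct, its field names and the bit positions are
invented for this example and do not correspond to a real packet; the
generated headers may well look different.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical template struct; names and bit positions are made up. */
struct example_state {
   uint32_t offset;      /* 64 byte aligned, so its low bits are zero */
   uint32_t misc_bits;   /* 5 bits that share dword 0 with the offset */
   uint32_t value_u32;   /* dword 1 holds either this ...             */
   float    value_float; /* ... or this; set only one of them         */
};

/* Reinterpret the bits of a float as a u32 without aliasing problems. */
static inline uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

static inline void
example_state_pack(uint32_t *dw, const struct example_state *values)
{
   /* The low bits of an aligned offset are zero, which is what leaves
    * room for misc_bits in the same dword. */
   assert((values->offset & 0x3f) == 0);

   /* Overlapping fields are simply or'ed into the same dword. */
   dw[0] = values->offset | (values->misc_bits << 0);

   /* Whichever member was left at zero contributes nothing here. */
   dw[1] = values->value_u32 | float_bits(values->value_float);
}
```

With offset = 0x1000, misc_bits = 0x15 and value_float = 1.0f, this gives
dw[0] = 0x00001015 and dw[1] = 0x3f800000. Because value_u32 was left at
zero, or'ing it in changes nothing, which is what lets the compiler drop it.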

Once we have the pack function it allows us to hook in various
transformations and validation as we go from template struct to dwords
in memory:

  1) Validation: As I said above, our driver isn't supposed to write
     overflowing values to the fields, but we've of course had lots of
     cases where we make mistakes and write overflowing values. With
     the pack function, we can actually assert on that and catch it at
     runtime.  Bit fields would just silently truncate.

  2) Type conversions: sometimes it's just a matter of writing a
     float to a u32, but we also convert from bool to bits and from
     floats to fixed point integers.

  3) Relocations: whenever we have a pointer from one buffer to
     another (for example a pointer from the metadata for a texture
     to the raw texture data), we have to tell the kernel about it so
     it can adjust the pointer to point to the final location. That
     means we have to do extra work to record and annotate the dword
     location that holds the pointer. With bit fields, we'd have to
     call a function to do this, but with the pack function we
     generate code in the pack function to do this for us. That's a
     lot less error prone and less work.
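
The three hooks above could look roughly like the following sketch. All
names here (example_uint, example_ufixed, example_reloc_list and the
example_surface layout) are assumptions made for illustration; they are
not the helpers the generated headers actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical list of dwords that hold a pointer to another buffer. */
struct example_reloc_list {
   uint32_t dword_index[8];
   uint32_t count;
};

/* Hypothetical template struct; the bit positions are made up. */
struct example_surface {
   bool     enable;       /* bit 0 of dword 0                          */
   uint32_t mode;         /* bits 2:1 of dword 0                       */
   float    lod_bias;     /* bits 15:8 of dword 0, 4 fractional bits   */
   uint32_t data_offset;  /* dword 1, points into another buffer       */
};

/* Validation: place v in bits [start, end], asserting that it fits. */
static inline uint32_t
example_uint(uint32_t v, uint32_t start, uint32_t end)
{
   const uint32_t width = end - start + 1;
   assert(width == 32 || v < (1u << width));
   return v << start;
}

/* Type conversion: non-negative float to unsigned fixed point. */
static inline uint32_t
example_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
{
   assert(v >= 0.0f);
   const uint32_t fixed = (uint32_t)(v * (float)(1u << fract_bits) + 0.5f);
   return example_uint(fixed, start, end);
}

static inline void
example_surface_pack(struct example_reloc_list *relocs, uint32_t *dw,
                     const struct example_surface *values)
{
   dw[0] = example_uint(values->enable, 0, 0) |
           example_uint(values->mode, 1, 2) |
           example_ufixed(values->lod_bias, 8, 15, 4);

   /* Relocation: record which dword holds the pointer so that it can
    * be reported and patched later. */
   assert(relocs->count < 8);
   relocs->dword_index[relocs->count++] = 1;
   dw[1] = values->data_offset;
}
```

Packing enable = true, mode = 2 and lod_bias = 1.5f gives dw[0] = 0x1805.
Setting mode to 4 would trip the assertion in a build with assertions
enabled, instead of silently spilling into the neighbouring field.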

Keeping genxml files tidy:

   In order to spot differences easily between generations, we keep genxml files sorted.
   You can trigger the sort by running:

      $ cd src/intel/genxml; ./sort_xml.sh

   gen_sort_tags.py is the script that sorts genxml files using
   the following rules:

      1) Tags are grouped in the following order: <enum>, <struct>,
         <instruction>, <register>

      2) <field> tags are sorted by the value of their start attribute

      3) <struct> tags are sorted by dependency so that other scripts
         have everything properly ordered.