                          XAA.HOWTO

  This file describes how to add basic XAA support to a chipset driver.

0)  What is XAA
1)  XAA Initialization and Shutdown
2)  The Primitives
  2.0  Generic Flags
  2.1  Screen to Screen Copies
  2.2  Solid Fills
  2.3  Solid Lines
  2.4  Dashed Lines
  2.5  Color Expansion Fills
    2.5.1 Screen to Screen Color Expansion
    2.5.2 CPU to Screen Color Expansion
      2.5.2.1 The Direct Method
      2.5.2.2 The Indirect Method
  2.6  8x8 Mono Pattern Fills
  2.7  8x8 Color Pattern Fills
  2.8  Image Writes
    2.8.1 The Direct Method
    2.8.2 The Indirect Method
  2.9  Clipping
3)  The Pixmap Cache
4)  Offscreen Pixmaps

/********************************************************************/

0) WHAT IS XAA

   XAA (the XFree86 Acceleration Architecture) is a device dependent
layer that encapsulates the unaccelerated framebuffer rendering layer,
intercepting rendering commands sent to it from higher levels of the
server.  For rendering tasks where hardware acceleration is not
possible, XAA allows the requests to proceed to the software rendering
code.  Otherwise, XAA breaks the sometimes complicated X primitives
into simpler primitives more suitable for hardware acceleration and
will use accelerated functions exported by the chipset driver to
render these.

   XAA provides a simple, easy to use driver interface that allows
the driver to communicate its acceleration capabilities and restrictions
back to XAA.  XAA will use the information provided by the driver
to determine whether or not acceleration will be possible for a
particular X primitive.



1) XAA INITIALIZATION AND SHUTDOWN

   All relevant prototypes and defines are in xaa.h.

   To initialize the XAA layer, the driver should allocate an XAAInfoRec
via XAACreateInfoRec(), fill it out as described in this document
and pass it to XAAInit().  XAAInit() must be called _after_ the
framebuffer initialization (usually cfb?ScreenInit or similar) since
it is "wrapping" that layer.  XAAInit() should be called _before_ the
cursor initialization (usually miDCInitialize) since the cursor
layer needs to "wrap" all the rendering code including XAA.

   When shutting down, the driver should free the XAAInfoRec
structure in its CloseScreen function via XAADestroyInfoRec().
The prototypes for the functions mentioned above are as follows:

   XAAInfoRecPtr XAACreateInfoRec(void);
   Bool XAAInit(ScreenPtr, XAAInfoRecPtr);
   void XAADestroyInfoRec(XAAInfoRecPtr);

   The driver informs XAA of its acceleration capabilities by
filling out an XAAInfoRec structure and passing it to XAAInit().
The XAAInfoRec structure contains many fields, most of which are
function pointers and flags.  Each primitive will typically have
two functions and a set of flags associated with it, but it may
have more.  These two functions are the "SetupFor" and "Subsequent"
functions.  The "SetupFor" function tells the driver that the
hardware should be initialized for a particular type of graphics
operation.  After the "SetupFor" function, one or more calls to the
"Subsequent" function will be made to indicate that an instance
of the particular primitive should be rendered by the hardware.
The details of each instance (width, height, etc.) are given
with each "Subsequent" function.  The set of flags associated
with each primitive lets the driver tell XAA what its hardware
limitations are (e.g. it doesn't support a planemask, it can only
do one of the raster-ops, etc.).

  Of the XAAInfoRec fields, one is required.  This is the
Sync function.  XAA initialization will fail if this function
is not provided.

void Sync(ScrnInfoPtr pScrn)			/* Required */

   Sync will be called when XAA needs to be certain that all
   graphics coprocessor operations are finished, such as when
   the framebuffer must be written to or read from directly
   and it must be certain that the accelerator will not be
   overwriting the area of interest.

   The Sync function must wait not only for the accelerator FIFO
   to empty, but also for the rendering of the last operation to
   complete.

   It is guaranteed that no direct framebuffer access will
   occur after a "SetupFor" or "Subsequent" function without
   the Sync function being called first.



2)  THE PRIMITIVES

2.0  Generic Flags

  Each primitive type has a set of flags associated with it which
allow the driver to tell XAA what the hardware limitations are.
The common ones are as follows:

/* Foreground, Background, rop and planemask restrictions */

   GXCOPY_ONLY

     This indicates that the accelerator only supports GXcopy
     for the particular primitive.

   ROP_NEEDS_SOURCE

     This indicates that the accelerator doesn't support a
     particular primitive with rops that don't involve the source.
     These rops are GXclear, GXnoop, GXinvert and GXset. If neither
     this flag nor GXCOPY_ONLY is defined, it is assumed that the
     accelerator supports all 16 raster operations (rops) for that
     primitive.

   NO_PLANEMASK

     This indicates that the accelerator does not support a hardware
     write planemask for the particular primitive.

   RGB_EQUAL

     This indicates that the particular primitive requires the red,
     green and blue bytes of the foreground color (and background color,
     if applicable) to be equal. This is useful for 24bpp when a graphics
     coprocessor is used in 8bpp mode, which is not uncommon in older
     hardware since some have no support for or only limited support for
     acceleration at 24bpp. This way, many operations will be accelerated
     for the common case of "grayscale" colors.  This flag should only
     be used in 24bpp.

  In addition to the common ones listed above which are possible for
nearly all primitives, each primitive may have its own flags specific
to that primitive.  If such flags exist they are documented in the
descriptions of those primitives below.
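
  As an illustration only, the two conditions that ROP_NEEDS_SOURCE and
RGB_EQUAL describe can be written as small predicates.  The function
names below are hypothetical and are not part of XAA; the rop codes are
the core X protocol values.  This is a sketch of the tests involved,
not the code XAA itself runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raster-op codes as defined by the core X protocol (X.h). */
enum {
    GXclear = 0x0, GXcopy = 0x3, GXnoop = 0x5,
    GXinvert = 0xa, GXset = 0xf
};

/* Hypothetical helper: true for the four rops that ignore the source
 * operand, i.e. the ones excluded by ROP_NEEDS_SOURCE. */
static bool rop_ignores_source(int rop)
{
    return rop == GXclear || rop == GXnoop ||
           rop == GXinvert || rop == GXset;
}

/* Hypothetical helper: true when the red, green and blue bytes of a
 * 24bpp color are identical, the condition RGB_EQUAL requires. */
static bool rgb_bytes_equal(uint32_t color)
{
    uint32_t r = (color >> 16) & 0xff;
    uint32_t g = (color >> 8) & 0xff;
    uint32_t b = color & 0xff;
    return r == g && g == b;
}
```

  For example, 0x808080 passes the RGB_EQUAL test while 0x808081 does
not.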


2.1  Screen to Screen Copies

   The SetupFor and Subsequent ScreenToScreenCopy functions provide
   an interface for copying rectangular areas from video memory to
   video memory.  To accelerate this primitive the driver should
   provide both the SetupFor and Subsequent functions and indicate
   the hardware restrictions via the ScreenToScreenCopyFlags.  The
   NO_PLANEMASK, GXCOPY_ONLY and ROP_NEEDS_SOURCE flags as described
   in Section 2.0 are valid as well as the following:

    NO_TRANSPARENCY

      This indicates that the accelerator does not support skipping
      of color keyed pixels when copying from the source to the destination.

    TRANSPARENCY_GXCOPY_ONLY

      This indicates that the accelerator supports skipping of color keyed
      pixels only when the rop is GXcopy.

    ONLY_LEFT_TO_RIGHT_BITBLT

      This indicates that the hardware only accepts blitting when the
      x direction is positive.

    ONLY_TWO_BITBLT_DIRECTIONS

      This indicates that the hardware can only cope with blitting when
      the direction of x is the same as the direction in y.


void SetupForScreenToScreenCopy( ScrnInfoPtr pScrn,
                        int xdir, int ydir,
                        int rop,
                        unsigned int planemask,
                        int trans_color )

    When this is called, SubsequentScreenToScreenCopy will be called
    one or more times directly after.  If ydir is 1, then the accelerator
    should copy starting from the top (minimum y) of the source and
    proceed downward.  If ydir is -1, then the accelerator should copy
    starting from the bottom of the source (maximum y) and proceed
    upward.  If xdir is 1, then the accelerator should copy each
    y scanline starting from the leftmost pixel of the source.  If
    xdir is -1, it should start from the rightmost pixel.
       If trans_color is not -1 then trans_color indicates that the
    accelerator should not copy pixels with the color trans_color
    from the source to the destination, but should skip them.
    Trans_color is always -1 if the NO_TRANSPARENCY flag is set.


void SubsequentScreenToScreenCopy(ScrnInfoPtr pScrn,
                        int x1, int y1,
                        int x2, int y2,
                        int width, int height)

    Copy a rectangle "width" x "height" from the source (x1,y1) to the
    destination (x2,y2) using the parameters passed by the last
    SetupForScreenToScreenCopy call. (x1,y1) and (x2,y2) always denote
    the upper left hand corners of the source and destination regardless
    of which xdir and ydir values are given by SetupForScreenToScreenCopy.
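
    The role of xdir and ydir is easiest to see in a software model.
    The sketch below is not driver code: model_copy and choose_dirs are
    hypothetical names, and the "framebuffer" is a plain byte array with
    one byte per pixel.  It shows that the corners stay upper-left while
    the direction only selects where the walk starts, which is what
    makes overlapping copies safe.

```c
#include <stdint.h>

/* Hypothetical software model of a blitter: one byte per pixel,
 * "pitch" pixels per scanline. */
static void model_copy(uint8_t *fb, int pitch, int xdir, int ydir,
                       int x1, int y1, int x2, int y2, int w, int h)
{
    /* (x1,y1) and (x2,y2) are always the upper-left corners; the
     * direction only selects which corner the walk starts from. */
    int ystart = (ydir == 1) ? 0 : h - 1;
    int xstart = (xdir == 1) ? 0 : w - 1;

    for (int j = 0, dy = ystart; j < h; j++, dy += ydir) {
        for (int i = 0, dx = xstart; i < w; i++, dx += xdir) {
            fb[(y2 + dy) * pitch + (x2 + dx)] =
                fb[(y1 + dy) * pitch + (x1 + dx)];
        }
    }
}

/* One safe way (an assumption of this sketch) to pick directions so
 * that an overlapping copy never reads a pixel it already overwrote. */
static void choose_dirs(int x1, int y1, int x2, int y2,
                        int *xdir, int *ydir)
{
    *ydir = (y2 > y1) ? -1 : 1;
    *xdir = (x2 > x1) ? -1 : 1;
}
```

    Copying four pixels two places to the right within one scanline,
    for instance, needs xdir = -1; walking left to right would smear
    the first two pixels across the destination.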



2.2  Solid Fills

   The SetupFor and Subsequent SolidFill(Rect/Trap) functions provide
   an interface for filling rectangular areas of the screen with a
   foreground color.  To accelerate this primitive the driver should
   provide both the SetupForSolidFill and SubsequentSolidFillRect
   functions and indicate the hardware restrictions via the SolidFillFlags.
   The driver may optionally provide a SubsequentSolidFillTrap if
   it is capable of rendering the primitive correctly.
   The GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags
   as described in Section 2.0 are valid.


void SetupForSolidFill(ScrnInfoPtr pScrn,
                       int color, int rop, unsigned int planemask)

    SetupForSolidFill indicates that any combination of the following
    may follow it.

        SubsequentSolidFillRect
        SubsequentSolidFillTrap



void SubsequentSolidFillRect(ScrnInfoPtr pScrn, int x, int y, int w, int h)

     Fill a rectangle of dimensions "w" by "h" with origin at (x,y)
     using the color, rop and planemask given by the last
     SetupForSolidFill call.
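
     The split between the two calls can be sketched with a software
     stand-in for the accelerator.  Everything named FillModel or
     model_* below is hypothetical; a real driver would write the
     latched values to hardware registers rather than to a struct,
     and only GXcopy and GXxor are modelled here.

```c
#include <stdint.h>

enum { GXcopy = 0x3, GXxor = 0x6 };   /* rop codes from X.h */

/* Hypothetical stand-in for accelerator state. */
typedef struct {
    uint32_t color, planemask;
    int rop;
    uint32_t *fb;      /* software "framebuffer" */
    int pitch;         /* pixels per scanline */
} FillModel;

/* Called once: latch the state shared by the fills that follow. */
static void model_setup_solid_fill(FillModel *m, uint32_t color,
                                   int rop, uint32_t planemask)
{
    m->color = color;
    m->rop = rop;
    m->planemask = planemask;
}

/* Called once per rectangle: only the geometry is passed. */
static void model_subsequent_fill_rect(FillModel *m,
                                       int x, int y, int w, int h)
{
    for (int j = y; j < y + h; j++) {
        for (int i = x; i < x + w; i++) {
            uint32_t *p = &m->fb[j * m->pitch + i];
            /* Sketch limitation: anything but GXxor is treated as GXcopy. */
            uint32_t v = (m->rop == GXxor) ? (*p ^ m->color) : m->color;
            /* Only bits selected by the planemask are modified. */
            *p = (*p & ~m->planemask) | (v & m->planemask);
        }
    }
}
```

     One setup call followed by many rectangle calls keeps the
     per-rectangle cost down to the geometry alone, which is the point
     of the SetupFor/Subsequent design.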

void SubsequentSolidFillTrap(ScrnInfoPtr pScrn, int y, int h,
        int left, int dxL, int dyL, int eL,
        int right, int dxR, int dyR, int eR)

     These parameters describe a trapezoid via a version of
     Bresenham's parameters. "y" is the top line. "h" is the
     number of spans to be filled in the positive Y direction.
     "left" and "right" indicate the starting X values of the
     left and right edges.  dy/dx describes the edge slope.
     These are not the deltas between the beginning and ending
     points on an edge.  They merely describe the slope. "e" is
     the initial error term.  It's the relationships between dx,
     dy and e that define the edge.
        If your engine does not do Bresenham trapezoids or does
     not allow the programmer to specify the error term then
     you are not expected to be able to accelerate them.


2.3  Solid Lines

    XAA provides an interface for drawing thin lines.  In order to
    draw X lines correctly a high degree of accuracy is required.
    This usually limits line acceleration to hardware which has a
    Bresenham line engine, though depending on the algorithm used,
    other line engines may come close if they accept 16 bit line
    deltas.  XAA has both a Bresenham line interface and a two-point
    line interface for drawing lines of arbitrary orientation.
    Additionally there is a SubsequentSolidHorVertLine which will
    be used for all horizontal and vertical lines.  Horizontal and
    vertical lines are handled separately since hardware that doesn't
    have a line engine (or has one that is unusable due to precision
    problems) can usually draw these lines by some other method such
    as drawing them as thin rectangles.  Even for hardware that can
    draw arbitrary lines via the Bresenham or two-point interfaces,
    the SubsequentSolidHorVertLine is used for horizontal and vertical
    lines since most hardware is able to render the horizontal lines
    and sometimes the vertical lines faster by other methods (Hint:
    try rendering horizontal lines as flattened rectangles).  If you have
    not provided a SubsequentSolidHorVertLine but you have provided
    Bresenham or two-point lines, a SubsequentSolidHorVertLine function
    will be supplied for you.

    The flags field associated with Solid Lines is SolidLineFlags and
    the GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags as
    described in Section 2.0 are valid restrictions.

    Some line engines have line biases hardcoded to comply with
    Microsoft line biasing rules.  A tell-tale sign of this is the
    hardware lines not matching the software lines in the zeroth and
    fourth octants.  The driver can set the flag:

        MICROSOFT_ZERO_LINE_BIAS

    in the Flags field of the XAAInfoRec to adjust the software lines
    to match the hardware lines.  This is in the generic flags field
    rather than the SolidLineFlags since this flag applies to all
    software zero-width lines on the screen and not just the solid ones.


void SetupForSolidLine(ScrnInfoPtr pScrn,
                       int color, int rop, unsigned int planemask)

    SetupForSolidLine indicates that any combination of the following
    may follow it.

        SubsequentSolidBresenhamLine
        SubsequentSolidTwoPointLine
        SubsequentSolidHorVertLine


void SubsequentSolidHorVertLine( ScrnInfoPtr pScrn,
                                int x, int y, int len, int dir )

    All vertical and horizontal solid thin lines are rendered with
    this function.  The line starts at coordinate (x,y) and extends
    "len" pixels inclusive in the direction indicated by "dir".
    The direction is either DEGREES_0 or DEGREES_270.  That is, it
    always extends to the right or down.



void SubsequentSolidTwoPointLine(ScrnInfoPtr pScrn,
                int x1, int y1, int x2, int y2, int flags)

    Draw a line from (x1,y1) to (x2,y2).  If the flags field contains
    the flag OMIT_LAST, the last pixel should not be drawn.  Otherwise,
    the pixel at (x2,y2) should be drawn.

    If you use the TwoPoint line interface there is a good possibility
    that your line engine has hard-coded line biases that do not match
    the default X zero-width lines.  If so, you may need to set the
    MICROSOFT_ZERO_LINE_BIAS flag described above.  Note that since
    any vertex in the 16-bit signed coordinate system is valid, your
    line engine is expected to handle 16-bit values if you have hardware
    line clipping enabled.  If your engine cannot handle 16-bit values,
    you should not use hardware line clipping.


void SubsequentSolidBresenhamLine(ScrnInfoPtr pScrn,
        int x, int y, int major, int minor, int err, int len, int octant)

    "X" and "y" are the starting point of the line.  "Major" and "minor"
    are the major and minor step constants.  "Err" is the initial error
    term.  "Len" is the number of pixels to be drawn (inclusive). "Octant"
    can be any combination of the following flags OR'd together:

      Y_MAJOR		Y is the major axis (X otherwise)
      X_DECREASING	The line is drawn from right to left
      Y_DECREASING	The line is drawn from bottom to top

    The major, minor and err terms are the "raw" Bresenham parameters
    consistent with a line engine that does:

        e = err;
        while(len--) {
           DRAW_POINT(x,y);
           e += minor;
           if(e >= 0) {
                e -= major;
                TAKE_ONE_STEP_ALONG_MINOR_AXIS;
           }
           TAKE_ONE_STEP_ALONG_MAJOR_AXIS;
        }

    IBM 8514 style Bresenham line interfaces require their parameters
    modified in the following way:

        Axial = minor;
        Diagonal = minor - major;
        Error = minor + err;
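
    For experimenting with these parameters off the hardware, the
    reference loop can be turned into a small tracing routine.  This
    is a sketch under stated assumptions: the SK_* octant values and
    the names bres_trace and to_8514 are made up for the example (the
    real flag values come from the XAA headers and may differ), and
    the loop body simply follows the pseudocode above.

```c
/* Hypothetical octant bits for this sketch only. */
enum { SK_Y_MAJOR = 1, SK_X_DECREASING = 2, SK_Y_DECREASING = 4 };

typedef struct { int x, y; } Point;

/* Walk a line the way the reference loop does, storing each pixel
 * in out[], which must hold at least len entries.  Returns the
 * number of pixels stored. */
static int bres_trace(int x, int y, int major, int minor, int err,
                      int len, int octant, Point *out)
{
    int sx = (octant & SK_X_DECREASING) ? -1 : 1;
    int sy = (octant & SK_Y_DECREASING) ? -1 : 1;
    int e = err, n = 0;

    while (len--) {
        out[n].x = x;
        out[n].y = y;
        n++;
        e += minor;
        if (e >= 0) {
            e -= major;
            /* one step along the minor axis */
            if (octant & SK_Y_MAJOR) x += sx; else y += sy;
        }
        /* one step along the major axis */
        if (octant & SK_Y_MAJOR) y += sy; else x += sx;
    }
    return n;
}

/* Conversion to IBM 8514 style parameters, as listed above. */
static void to_8514(int major, int minor, int err,
                    int *axial, int *diagonal, int *error)
{
    *axial = minor;
    *diagonal = minor - major;
    *error = minor + err;
}
```

    As a usage example, assume a line from (0,0) to (5,2) described by
    major = 10, minor = 4, err = -5 and len = 6 (that is, 2*dx, 2*dy
    and -dx, one conventional choice and not necessarily the values
    XAA would pass).  The trace visits (0,0) (1,0) (2,1) (3,1) (4,2)
    (5,2), and the 8514 form of the same parameters is Axial = 4,
    Diagonal = -6, Error = -1.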

SolidBresenhamLineErrorTermBits

    This field allows the driver to tell XAA how many bits large its
    Bresenham parameter registers are.  Many engines have registers that
    only accept 12 or 13 bit Bresenham parameters, and the parameters
    for clipped lines may overflow these if they are not scaled down.
    If this field is not set, XAA will assume the engine can accommodate
    16 bit parameters; otherwise, it will scale the parameters to the
    size specified.


2.4  Dashed Lines

    The same degree of accuracy required by the solid lines is required
    for drawing dashed lines as well.  The dash pattern itself is a
    buffer of binary data where ones are expanded into the foreground
    color and zeros either correspond to the background color or
    indicate transparency depending on whether or not DoubleDash or
    OnOffDashes are being drawn.

    The flags field associated with dashed Lines is DashedLineFlags and
    the GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags as
    described in Section 2.0 are valid restrictions.  Additionally, the
    following flags are valid:

      NO_TRANSPARENCY

        This indicates that the driver cannot support dashed lines
        with transparent backgrounds (OnOffDashes).

      TRANSPARENCY_ONLY

        This indicates that the driver cannot support dashes with
        both a foreground and background color (DoubleDashes).

      LINE_PATTERN_POWER_OF_2_ONLY

        This indicates that only patterns with a power of 2 length
        can be accelerated.

      LINE_PATTERN_LSBFIRST_MSBJUSTIFIED
      LINE_PATTERN_LSBFIRST_LSBJUSTIFIED
      LINE_PATTERN_MSBFIRST_MSBJUSTIFIED
      LINE_PATTERN_MSBFIRST_LSBJUSTIFIED

        These describe how the line pattern should be packed.
        The pattern buffer is DWORD padded.  LSBFIRST indicates
        that the pattern runs from the LSB end to the MSB end.
        MSBFIRST indicates that the pattern runs from the MSB end
        to the LSB end.  When the pattern does not completely fill
        the DWORD padded buffer, the pattern will be justified
        towards the MSB or LSB end based on the flags above.


    The following field indicates the maximum length dash pattern that
    should be accelerated.

        int DashPatternMaxLength


void SetupForDashedLine(ScrnInfoPtr pScrn,
                int fg, int bg, int rop, unsigned int planemask,
                int length, unsigned char *pattern)


    SetupForDashedLine indicates that any combination of the following
    may follow it.

        SubsequentDashedBresenhamLine
        SubsequentDashedTwoPointLine

    If "bg" is -1, then the background (pixels corresponding to clear
    bits in the pattern) should remain unmodified. "Bg" indicates the
    background color otherwise.  "Length" indicates the length of
    the pattern in bits and "pattern" points to the DWORD padded buffer
    holding the pattern which has been packed according to the flags
    set above.


void SubsequentDashedTwoPointLine( ScrnInfoPtr pScrn,
        int x1, int y1, int x2, int y2, int flags, int phase)

void SubsequentDashedBresenhamLine(ScrnInfoPtr pScrn,
        int x1, int y1, int major, int minor, int err, int len, int octant,
        int phase)

    These are the same as the SubsequentSolidTwoPointLine and
    SubsequentSolidBresenhamLine functions except for the addition
    of the "phase" field which indicates the offset into the dash
    pattern that the pixel at (x1,y1) corresponds to.

    As with the SubsequentSolidBresenhamLine, there is an

        int DashedBresenhamLineErrorTermBits

    field which indicates the size of the error term registers
    used with dashed lines.  This is usually the same value as
    the field for the solid lines (because it's usually the same
    register).
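
    How "phase", "length" and "bg" interact can be sketched in a few
    lines.  The helpers below are hypothetical and assume the simplest
    packing for illustration (pattern bit i stored LSB first within
    successive bytes); hardware with a different packing flag would
    index the buffer differently.

```c
#include <stdint.h>

/* Bit i of a pattern packed LSB first (an assumption of this sketch). */
static int pattern_bit(const uint8_t *pattern, int i)
{
    return (pattern[i >> 3] >> (i & 7)) & 1;
}

/* Color of the n-th pixel of a dashed line, where n = 0 is the pixel
 * at (x1,y1).  Returns -1 when the pixel must be left unmodified. */
static int dash_color(const uint8_t *pattern, int length, int phase,
                      int n, int fg, int bg)
{
    int bit = pattern_bit(pattern, (phase + n) % length);
    return bit ? fg : bg;   /* bg == -1 means transparent */
}
```

    With an 8 bit pattern of four set bits followed by four clear bits
    and a phase of 2, the first two pixels of the line take the
    foreground color and the next four take the background (or are
    skipped when bg is -1).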



2.5   Color Expansion Fills

    When filling a color expansion rectangle, the accelerator
    paints each pixel depending on whether or not a bit in a
    corresponding bitmap is set or clear. Opaque expansions are
    when a set bit corresponds to the foreground color and a clear
    bit corresponds to the background color.  A transparent expansion
    is when a set bit corresponds to the foreground color and a
    clear bit indicates that the pixel should remain unmodified.

    The graphics accelerator usually has access to the source
    bitmap in one of two ways: 1) the bitmap data is sent serially
    to the accelerator by the CPU through some memory mapped aperture
    or 2) the accelerator reads the source bitmap out of offscreen
    video memory.  Some types of primitives are better suited towards
    one method or the other.  Type 2 is useful for reusable patterns
    such as stipples which can be cached in offscreen memory.  The
    aperture method can be used for stippling but the CPU must pass
    the data across the bus each time a stippled fill is to be performed.
    For expanding 1bpp client pixmaps or text strings to the screen,
    the aperture method is usually superior because the intermediate
    copy in offscreen memory needed by the second method would only be
    used once.  Unfortunately, many accelerators can only do one of these
    methods and not both.

    XAA provides both ScreenToScreen and CPUToScreen color expansion
    interfaces for doing color expansion fills.  The ScreenToScreen
    functions can only be used with hardware that supports reading
    of source bitmaps from offscreen video memory, and these are only
    used for cacheable patterns such as stipples.  There are two
    variants of the CPUToScreen routines - a direct method intended
    for hardware that has a transfer aperture, and an indirect method
    intended for hardware without transfer apertures or hardware
    with unusual transfer requirements.  Hardware that can only expand
    bitmaps from video memory should supply ScreenToScreen routines
    but also ScanlineCPUToScreen (indirect) routines to optimize transfers
    of non-cacheable data.  Hardware that can only accept source bitmaps
    through an aperture should supply CPUToScreen (or ScanlineCPUToScreen)
    routines. Hardware that can do both should provide both ScreenToScreen
    and CPUToScreen routines.

    For both ScreenToScreen and CPUToScreen interfaces, the GXCOPY_ONLY,
    ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags described in
    Section 2.0 are valid as well as the following:

    /* bit order requirements (one of these must be set) */

    BIT_ORDER_IN_BYTE_LSBFIRST

      This indicates that the least significant bit in each byte of the
      source data corresponds to the leftmost of that block of 8 pixels.
      This is the preferred format.

    BIT_ORDER_IN_BYTE_MSBFIRST

      This indicates that the most significant bit in each byte of the
      source data corresponds to the leftmost of that block of 8 pixels.

    /* transparency restrictions */

    NO_TRANSPARENCY

      This indicates that the accelerator cannot do a transparent expansion.

    TRANSPARENCY_ONLY

      This indicates that the accelerator cannot do an opaque expansion.
      In cases where the background needs to be filled, XAA will
      render the primitive in two passes when using the CPUToScreen
      interface, but will not do so with the ScreenToScreen interface
      since that would require caching of two patterns.  Some
      ScreenToScreen hardware may be able to render two passes at the
      driver level and remove the TRANSPARENCY_ONLY restriction if
      it can render pixels corresponding to the zero bits.
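
    What the accelerator is asked to do for one scanline can be
    modelled in software.  The routine below is a sketch, not XAA
    code: model_expand_row is a hypothetical name, the destination is
    an array of 32 bit pixels, and the msbfirst argument stands in for
    the two bit order flags above.

```c
#include <stdint.h>

/* Expand w monochrome source bits into dst pixels.  A nonzero
 * msbfirst models BIT_ORDER_IN_BYTE_MSBFIRST; zero models LSBFIRST.
 * bg == -1 requests a transparent expansion. */
static void model_expand_row(const uint8_t *src, int w,
                             int fg, int bg, int msbfirst,
                             uint32_t *dst)
{
    for (int i = 0; i < w; i++) {
        int shift = msbfirst ? 7 - (i & 7) : (i & 7);
        int bit = (src[i >> 3] >> shift) & 1;

        if (bit)
            dst[i] = (uint32_t)fg;
        else if (bg != -1)
            dst[i] = (uint32_t)bg;
        /* else: clear bit in transparent mode, leave dst[i] alone */
    }
}
```

    The same four pixel pattern "on, off, on, off" is the byte 0x05
    for LSBFIRST hardware and 0xA0 for MSBFIRST hardware, which is why
    exactly one of the bit order flags has to be set.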



2.5.1  Screen To Screen Color Expansion

    The ScreenToScreenColorExpandFill routines provide an interface
    for doing expansion blits from source patterns stored in offscreen
    video memory.

    void SetupForScreenToScreenColorExpandFill (ScrnInfoPtr pScrn,
                                int fg, int bg,
                                int rop, unsigned int planemask)


    Ones in the source bitmap will correspond to the fg color.
    Zeros in the source bitmap will correspond to the bg color
    unless bg = -1.  In that case the pixels corresponding to the
    zeros in the bitmap shall be left unmodified by the accelerator.

    For hardware that doesn't allow an easy implementation of skipleft, the
    driver can replace the CacheMonoStipple function with one that stores
    multiple rotated copies of the stipple and selects between them. In this
    case the driver should set CacheColorExpandDensity to tell XAA how many
    copies of the pattern are stored in the width of a cache slot. For
    instance if the hardware can specify the starting address in bytes, then
    8 rotated copies of the stipple are needed and CacheColorExpandDensity
    should be set to 8.

    void SubsequentScreenToScreenColorExpandFill( ScrnInfoPtr pScrn,
                                int x, int y, int w, int h,
                                int srcx, int srcy, int offset )


    Fill a rectangle "w" x "h" at location (x,y).  The source pitch
    between scanlines is the framebuffer pitch (pScrn->displayWidth
    pixels) and srcx and srcy indicate the start of the source pattern
    in units of framebuffer pixels. "Offset" indicates the bit offset
    into the pattern that corresponds to the pixel being painted at
    "x" on the screen.  Some hardware accepts source coordinates in
    units of bits which makes implementation of the offset trivial.
    In that case, the bit address of the source bit corresponding to
    the pixel painted at (x,y) would be:

     (srcy * pScrn->displayWidth + srcx) * pScrn->bitsPerPixel + offset

    It should be noted that the offset assumes LSBFIRST hardware.
    For MSBFIRST hardware, the driver may need to implement the
    offset by blitting only from byte boundaries and hardware clipping.
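
    The bit address formula above translates directly into C.  The
    helper name src_bit_address is hypothetical, and displayWidth and
    bitsPerPixel are passed as plain arguments here instead of being
    read from pScrn so that the sketch stands alone.

```c
/* Bit address of the source bit for the pixel painted at (x,y), for
 * hardware that takes source coordinates in units of bits. */
static unsigned long src_bit_address(int srcx, int srcy, int offset,
                                     int displayWidth, int bitsPerPixel)
{
    return ((unsigned long)srcy * displayWidth + srcx) * bitsPerPixel
           + offset;
}
```

    For example, with displayWidth = 1024 at 8 bits per pixel, a
    pattern cached at (16,2) with an offset of 3 starts at bit
    (2 * 1024 + 16) * 8 + 3 = 16515.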



2.5.2  CPU To Screen Color Expansion


    The CPUToScreenColorExpandFill routines provide an interface for
    doing expansion blits from source patterns stored in system memory.
    There are two varieties of this primitive, a CPUToScreenColorExpandFill
    and a ScanlineCPUToScreenColorExpandFill.  With the
    CPUToScreenColorExpandFill method, the source data is sent serially
    through a memory mapped aperture.  With the Scanline version, the
    data is rendered a scanline at a time into intermediate buffers with
    a call to SubsequentColorExpandScanline following each scanline.

    These two methods have separate flags fields, the
    CPUToScreenColorExpandFillFlags and ScanlineCPUToScreenColorExpandFillFlags
    respectively.  Flags specific to one method or the other are described
    in sections 2.5.2.1 and 2.5.2.2 but for both cases the bit order and
    transparency restrictions listed at the beginning of section 2.5 are
    valid as well as the following:

    /* clipping  (optional) */

    LEFT_EDGE_CLIPPING

      This indicates that the accelerator supports omission of up to
      31 pixels on the left edge of the rectangle to be filled.  This
      is beneficial since it allows transfer of the source bitmap to
      always occur from DWORD boundaries.

    LEFT_EDGE_CLIPPING_NEGATIVE_X

      This flag indicates that the accelerator can render color expansion
      rectangles even if the value of x origin is negative (off of
      the screen on the left edge).

    /* misc */

    TRIPLE_BITS_24BPP

      When enabled (must be in 24bpp mode), color expansion functions
      are expected to require three times the amount of bits to be
      transferred so that 24bpp grayscale colors can be used with color
      expansion in 8bpp coprocessor mode. Each bit is expanded to 3
      bits when writing the monochrome data.


2.5.2.1  The Direct Method


    Using the direct method of color expansion XAA will send all
    bitmap data to the accelerator serially through a memory mapped
    transfer window defined by the following two fields:

      unsigned char *ColorExpandBase

        This indicates the memory address of the beginning of the aperture.

      int ColorExpandRange

        This indicates the size in bytes of the aperture.

    The driver should specify how the transferred data should be padded.
    There are options for both the padding of each Y scanline and for the
    total transfer to the aperture.
    One of the following two flags must be set:

      CPU_TRANSFER_PAD_DWORD

        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be DWORD padded.  This is the default behavior.

      CPU_TRANSFER_PAD_QWORD

        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be QWORD padded.  With this set, XAA will send
        an extra DWORD to the aperture when needed to ensure that only
        an even number of DWORDs are sent.

    And then there are the flags for padding of each scanline:

      SCANLINE_PAD_DWORD

        This indicates that each Y scanline should be DWORD padded.
        This is the only option available and is the default.

    Finally, there is the CPU_TRANSFER_BASE_FIXED flag which indicates
    that the aperture is a single register rather than a range of
    registers, and XAA should write all of the data to the first DWORD.
    If the ColorExpandRange is not large enough to accommodate scanlines
    the width of the screen, this option will be forced. That is, the
    ColorExpandRange must be:

        ((virtualX + 31)/32) * 4   bytes or more.

        ((virtualX + 62)/32) * 4   if LEFT_EDGE_CLIPPING_NEGATIVE_X is set.

    If the TRIPLE_BITS_24BPP flag is set, the required area should be
    multiplied by three.
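
    The size requirement can be checked with a few lines of integer
    arithmetic.  The function name min_color_expand_range is
    hypothetical; its two boolean arguments stand in for the
    LEFT_EDGE_CLIPPING_NEGATIVE_X and TRIPLE_BITS_24BPP flags, and the
    arithmetic is taken from the formulas above.

```c
/* Minimum ColorExpandRange, in bytes, for a screen that is virtualX
 * pixels wide.  Integer division is intended: it rounds the scanline
 * up to whole DWORDs. */
static int min_color_expand_range(int virtualX, int negative_x,
                                  int triple_bits)
{
    int bytes = ((virtualX + (negative_x ? 62 : 31)) / 32) * 4;
    return triple_bits ? bytes * 3 : bytes;
}
```

    For a 1024 pixel wide virtual screen this gives 128 bytes, 132
    bytes with LEFT_EDGE_CLIPPING_NEGATIVE_X, and 384 bytes with
    TRIPLE_BITS_24BPP alone.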


void SetupForCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
                        int fg, int bg,
                        int rop,
                        unsigned int planemask)



     Ones in the source bitmap will correspond to the fg color.
     Zeros in the source bitmap will correspond to the bg color
     unless bg = -1.  In that case the pixels corresponding to the
     zeros in the bitmap shall be left unmodified by the accelerator.


void SubsequentCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
                        int x, int y, int w, int h,
                        int skipleft )

     When this function is called, the accelerator should be set up
     to fill a rectangle of dimension "w" by "h" with origin at (x,y)
     in the fill style prescribed by the last call to
     SetupForCPUToScreenColorExpandFill.  XAA will pass the data to
     the aperture immediately after this function is called.  If the
     skipleft is non-zero (and LEFT_EDGE_CLIPPING has been enabled), then
     the accelerator _should_not_ render skipleft pixels on the leftmost
     edge of the rectangle.  Some engines have an alignment feature
     like this built in, some others can do this using a clipping
     window.

     It can be arranged for XAA to call Sync() after it is through
     calling the Subsequent function by setting SYNC_AFTER_COLOR_EXPAND
     in the CPUToScreenColorExpandFillFlags.  This can provide the driver
     with an opportunity to reset a clipping window if needed.
740
741    
2.5.2.2  The Indirect Method
743
     Using the indirect method, XAA will render the bitmap data one
     scanline at a time to one or more buffers.  These buffers may be
     memory mapped apertures or just intermediate storage.
747
748     int NumScanlineColorExpandBuffers
749
750       This indicates the number of buffers available.
751
752     unsigned char **ScanlineColorExpandBuffers
753
754       This is an array of pointers to the memory locations of each buffer.
755       Each buffer is expected to be large enough to accommodate scanlines
756       the width of the screen.  That is:
757
758        ((virtualX + 31)/32) * 4   bytes or more.
759
760        ((virtualX + 62)/32 * 4) if LEFT_EDGE_CLIPPING_NEGATIVE_X is set.
761  
762     Scanlines are always DWORD padded.
763     If the TRIPLE_BITS_24BPP flag is set, the required area should be 
764     multiplied by three.
765
766
767void SetupForScanlineCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
768        		int fg, int bg,
769			int rop,
770			unsigned int planemask)
771 
772     Ones in the source bitmap will correspond to the fg color.
773     Zeros in the source bitmap will correspond to the bg color
774     unless bg = -1.  In that case the pixels corresponding to the
775     zeros in the bitmap shall be left unmodified by the accelerator.
776
777     
778void SubsequentScanlineCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
779			int x, int y, int w, int h,
780			int skipleft )
781
782void SubsequentColorExpandScanline(ScrnInfoPtr pScrn, int bufno)
783
784
    When SubsequentScanlineCPUToScreenColorExpandFill is called, XAA
    will begin transferring the source data one scanline at a time,
    calling SubsequentColorExpandScanline after each scanline.  If more than
788    one buffer is available, XAA will cycle through the buffers.
789    Subsequent scanlines will use the next buffer and go back to the
790    buffer 0 again when the last buffer is reached.  The index into
791    the ScanlineColorExpandBuffers array is presented as "bufno"
792    with each SubsequentColorExpandScanline call.
793
794    The skipleft field is the same as for the direct method.
795
    The indirect method can be used to send the source data directly
    to a memory mapped aperture represented by a single color expand
    buffer, one scanline at a time, but more commonly it is used to place
799    the data into offscreen video memory so that the accelerator can 
800    blit it to the visible screen from there.  In the case where the
801    accelerator permits rendering into offscreen video memory while
802    the accelerator is active, several buffers can be used so that
803    XAA can be placing source data into the next buffer while the
804    accelerator is blitting the current buffer.  For cases where
805    the accelerator requires some special manipulation of the source
806    data first, the buffers can be in system memory.  The CPU can
807    manipulate these buffers and then send the data to the accelerator.
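
    The buffer cycling described above can be pictured with the
    following model.  It is not XAA code: "feed_scanlines",
    "ScanlineHook", "struct Trace" and "trace_hook" are hypothetical
    names, with the hook standing in for the driver's
    SubsequentColorExpandScanline.

```c
/* Stand-in for the driver's per-scanline function. */
typedef void (*ScanlineHook)(void *ctx, int bufno);

/*
 * Hand h scanlines to the hook, cycling through num_buffers buffers.
 * Returns the buffer index the next scanline would have used.
 */
int
feed_scanlines(int h, int num_buffers, ScanlineHook hook, void *ctx)
{
    int bufno = 0;
    int y;

    for (y = 0; y < h; y++) {
        /* XAA would render scanline y into buffer "bufno" here. */
        hook(ctx, bufno);

        /* Advance, going back to buffer 0 after the last buffer. */
        if (++bufno == num_buffers)
            bufno = 0;
    }
    return bufno;
}

/* Example hook: record the order in which buffers were presented. */
struct Trace {
    int seen[16];
    int count;
};

void
trace_hook(void *ctx, int bufno)
{
    struct Trace *t = ctx;

    if (t->count < 16)
        t->seen[t->count++] = bufno;
}
```

    With three buffers and five scanlines the hook sees buffers
    0, 1, 2, 0, 1 in that order.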
808
809
810
8112.6   8x8 Mono Pattern Fills
812
813    XAA provides support for two types of 8x8 hardware patterns -
814    "Mono" patterns and "Color" patterns.  Mono pattern data is
815    64 bits of color expansion data with ones indicating the
816    foreground color and zeros indicating the background color.
817    The source bitmaps for the 8x8 mono patterns can be presented
818    to the graphics accelerator in one of two ways.  They can be
819    passed as two DWORDS to the 8x8 mono pattern functions or
820    they can be cached in offscreen memory and their locations
821    passed to the 8x8 mono pattern functions.  In addition to the
822    GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags
823    defined in Section 2.0, the following are defined for the
824    Mono8x8PatternFillFlags:
825
826    HARDWARE_PATTERN_PROGRAMMED_BITS
827
828      This indicates that the 8x8 patterns should be packed into two
829      DWORDS and passed to the 8x8 mono pattern functions.  The default
830      behavior is to cache the patterns in offscreen video memory and
831      pass the locations of these patterns to the functions instead.
832      The pixmap cache must be enabled for the default behavior (8x8 
833      pattern caching) to work.  See Section 3 for how to enable the
834      pixmap cache. The pixmap cache is not necessary for 
835      HARDWARE_PATTERN_PROGRAMMED_BITS.
836
837    HARDWARE_PATTERN_PROGRAMMED_ORIGIN
838
      If the hardware supports programmable pattern offsets then
      this option should be set.  See the table below for further
      information.
842
843    HARDWARE_PATTERN_SCREEN_ORIGIN
844
845      Some hardware wants the pattern offset specified with respect to the
846      upper left-hand corner of the primitive being drawn.  Other hardware 
847      needs the option HARDWARE_PATTERN_SCREEN_ORIGIN set to indicate that 
848      all pattern offsets should be referenced to the upper left-hand 
      corner of the screen.  HARDWARE_PATTERN_SCREEN_ORIGIN is preferable
      since this is more natural for the X Window System and offsets will
      have to be recalculated for each Subsequent function otherwise.
852
853    BIT_ORDER_IN_BYTE_MSBFIRST
854    BIT_ORDER_IN_BYTE_LSBFIRST
855
856      As with other color expansion routines this indicates whether the
857      most or the least significant bit in each byte from the pattern is 
858      the leftmost on the screen.
859
860    TRANSPARENCY_ONLY
861    NO_TRANSPARENCY
862
863      This means the same thing as for the color expansion rect routines
864      except that for TRANSPARENCY_ONLY XAA will not render the primitive
865      in two passes since this is more easily handled by the driver.
866      It is recommended that TRANSPARENCY_ONLY hardware handle rendering
867      of opaque patterns in two passes (the background can be filled as
868      a rectangle in GXcopy) in the Subsequent function so that the
869      TRANSPARENCY_ONLY restriction can be removed. 
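
    To make the two bit orders concrete, the following sketch converts
    pattern data between them.  A driver normally just sets the flag
    that matches its hardware and never needs this; the function names
    are hypothetical.

```c
#include <stdint.h>

/* Reverse the bits of one byte (MSB-first <-> LSB-first). */
uint8_t
reverse_bits_in_byte(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0) >> 4) | ((b & 0x0F) << 4));
    b = (uint8_t)(((b & 0xCC) >> 2) | ((b & 0x33) << 2));
    b = (uint8_t)(((b & 0xAA) >> 1) | ((b & 0x55) << 1));
    return b;
}

/* Same swap on each byte of a DWORD; the byte order is unchanged. */
uint32_t
swap_bit_order_in_dword(uint32_t v)
{
    v = ((v & 0xF0F0F0F0u) >> 4) | ((v & 0x0F0F0F0Fu) << 4);
    v = ((v & 0xCCCCCCCCu) >> 2) | ((v & 0x33333333u) << 2);
    v = ((v & 0xAAAAAAAAu) >> 1) | ((v & 0x55555555u) << 1);
    return v;
}
```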
870
871
872
873    Additional information about cached patterns...
874    For the case where HARDWARE_PATTERN_PROGRAMMED_BITS is not set and 
875    the pattern must be cached in offscreen memory, the first pattern
876    starts at the cache slot boundary which is set by the 
877    CachePixelGranularity field used to configure the pixmap cache.
878    One should ensure that the CachePixelGranularity reflects any 
879    alignment restrictions that the accelerator may put on 8x8 pattern 
880    storage locations.  When HARDWARE_PATTERN_PROGRAMMED_ORIGIN is set 
881    there is only one pattern stored.  When this flag is not set,
882    all 64 pre-rotated copies of the pattern are cached in offscreen memory.
883    The MonoPatternPitch field can be used to specify the X position pixel
884    granularity that each of these patterns must align on.  If the
885    MonoPatternPitch is not supplied, the patterns will be densely packed
    within the cache slot.  The behavior of the default XAA 8x8 pattern
    caching mechanism is to store all 8x8 patterns linearly in video memory.
888    If the accelerator needs the patterns stored in a more unusual fashion,
889    the driver will need to provide its own 8x8 mono pattern caching 
890    routines for XAA to use. 
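
    What "64 pre-rotated copies" means can be sketched as below.  The
    representation is an assumption made for illustration: the pattern
    is held as eight row bytes with bit 7 as the leftmost pixel, and
    the copies are indexed as yoff * 8 + xoff.  The order and packing
    XAA actually uses in the cache are not described here, and all
    names are hypothetical.

```c
#include <stdint.h>

/* Rotate an 8 bit row left by k bits, k in 0..7. */
static uint8_t
rotl8(uint8_t v, int k)
{
    return (uint8_t)((v << k) | (v >> ((8 - k) & 7)));
}

/*
 * Rotate an 8x8 mono pattern so that pixel (x, y) of dst equals pixel
 * ((x + xoff) & 7, (y + yoff) & 7) of src.
 */
void
rotate_mono_8x8(uint8_t dst[8], const uint8_t src[8], int xoff, int yoff)
{
    int y;

    for (y = 0; y < 8; y++)
        dst[y] = rotl8(src[(y + yoff) & 7], xoff & 7);
}

/* Build all 64 rotations, stored at copies[yoff * 8 + xoff]. */
void
prerotate_mono_8x8(uint8_t copies[64][8], const uint8_t src[8])
{
    int xoff, yoff;

    for (yoff = 0; yoff < 8; yoff++)
        for (xoff = 0; xoff < 8; xoff++)
            rotate_mono_8x8(copies[yoff * 8 + xoff], src, xoff, yoff);
}
```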
891
892    The following table describes the meanings of the "patx" and "paty"
893    fields in both the SetupFor and Subsequent functions.
894
895    With HARDWARE_PATTERN_SCREEN_ORIGIN
896    -----------------------------------
897
898    HARDWARE_PATTERN_PROGRAMMED_BITS and HARDWARE_PATTERN_PROGRAMMED_ORIGIN
899
900	SetupFor: patx and paty are the first and second DWORDS of the
901		  8x8 mono pattern.
902
903	Subsequent: patx and paty are the x,y offset into that pattern.
904		    All Subsequent calls will have the same offset in 
905		    the case of HARDWARE_PATTERN_SCREEN_ORIGIN so only
906		    the offset specified by the first Subsequent call 
907		    after a SetupFor call will need to be observed.
908
909    HARDWARE_PATTERN_PROGRAMMED_BITS only
910
911	SetupFor: patx and paty hold the first and second DWORDS of
912		  the 8x8 mono pattern pre-rotated to match the desired
913		  offset.
914
915	Subsequent: These just hold the same patterns and can be ignored.
916
917    HARDWARE_PATTERN_PROGRAMMED_ORIGIN only
918
919	SetupFor: patx and paty hold the x,y coordinates of the offscreen
920		  memory location where the 8x8 pattern is stored.  The
921		  bits are stored linearly in memory at that location.
922
923	Subsequent: patx and paty hold the offset into the pattern.
924		    All Subsequent calls will have the same offset in 
925		    the case of HARDWARE_PATTERN_SCREEN_ORIGIN so only
926		    the offset specified by the first Subsequent call 
927		    after a SetupFor call will need to be observed.
928
    Neither programmed bits nor origin
930
931	SetupFor: patx and paty hold the x,y coordinates of the offscreen 	
932		  memory location where the pre-rotated 8x8 pattern is
933		  stored.
934
935	Subsequent: patx and paty are the same as in the SetupFor function
936		    and can be ignored.
937		  
938
939    Without HARDWARE_PATTERN_SCREEN_ORIGIN
940    -------------------------------------- 
941
942    HARDWARE_PATTERN_PROGRAMMED_BITS and HARDWARE_PATTERN_PROGRAMMED_ORIGIN
943
944	SetupFor: patx and paty are the first and second DWORDS of the
945		  8x8 mono pattern.
946
947	Subsequent: patx and paty are the x,y offset into that pattern.
948
949    HARDWARE_PATTERN_PROGRAMMED_BITS only
950
        SetupFor: patx and paty hold the first and second DWORDS of
                  the unrotated 8x8 mono pattern.  This can be ignored.
953
954	Subsequent: patx and paty hold the rotated 8x8 pattern to be 
955		    rendered.
956
957    HARDWARE_PATTERN_PROGRAMMED_ORIGIN only
958
959	SetupFor: patx and paty hold the x,y coordinates of the offscreen
960		  memory location where the 8x8 pattern is stored.  The
961		  bits are stored linearly in memory at that location.
962
963	Subsequent: patx and paty hold the offset into the pattern.
964
    Neither programmed bits nor origin
966
967	SetupFor: patx and paty hold the x,y coordinates of the offscreen 	
968		  memory location where the unrotated 8x8 pattern is
969		  stored.  This can be ignored.
970
971	Subsequent: patx and paty hold the x,y coordinates of the
972		    rotated 8x8 pattern to be rendered.
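
    As an illustration of the first table entry (programmed bits and
    programmed origin, with HARDWARE_PATTERN_SCREEN_ORIGIN), here is a
    sketch of a driver that programs the pattern origin only once per
    SetupFor call.  "struct MockEngine" and both functions are
    hypothetical stand-ins; a real driver would write to its own
    registers and take a ScrnInfoPtr.

```c
#include <stdint.h>

/* Hypothetical register file standing in for the accelerator. */
struct MockEngine {
    uint32_t pat0, pat1;     /* the two pattern DWORDS */
    int      org_x, org_y;   /* programmed pattern origin */
    int      origin_valid;   /* cleared by every SetupFor */
    int      origin_writes;  /* times the origin was programmed */
};

/* SetupFor: in this configuration patx/paty carry the pattern bits. */
void
mock_setup_mono8x8(struct MockEngine *e, int patx, int paty)
{
    e->pat0 = (uint32_t)patx;
    e->pat1 = (uint32_t)paty;
    e->origin_valid = 0;
}

/* Subsequent: patx/paty carry the offset, latched on the first call. */
void
mock_subsequent_mono8x8_rect(struct MockEngine *e, int patx, int paty,
                             int x, int y, int w, int h)
{
    if (!e->origin_valid) {
        e->org_x = patx;
        e->org_y = paty;
        e->origin_valid = 1;
        e->origin_writes++;
    }

    /* The fill itself is omitted from this sketch. */
    (void)x; (void)y; (void)w; (void)h;
}
```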
973
974
975
976void SetupForMono8x8PatternFill(ScrnInfoPtr pScrn, int patx, int paty,
977        int fg, int bg, int rop, unsigned int planemask)
978
979    SetupForMono8x8PatternFill indicates that any combination of the 
980    following  may follow it.
981
982	SubsequentMono8x8PatternFillRect
983	SubsequentMono8x8PatternFillTrap
984
985    The fg, bg, rop and planemask fields have the same meaning as the
986    ones used for the other color expansion routines.  Patx's and paty's
987    meaning can be determined from the table above.
988
989 
990void SubsequentMono8x8PatternFillRect( ScrnInfoPtr pScrn,
991        	int patx, int paty, int x, int y, int w, int h)
992
     Fill a rectangle of dimensions "w" by "h" with origin at (x,y)
     using the parameters given by the last SetupForMono8x8PatternFill
     call.  The meanings of patx and paty can be determined by the
     table above.
997
998void SubsequentMono8x8PatternFillTrap( ScrnInfoPtr pScrn,
999     			   int patx, int paty, int y, int h, 
1000     			   int left, int dxL, int dyL, int eL,
1001     			   int right, int dxR, int dyR, int eR )
1002
1003     The meanings of patx and paty can be determined by the table above.
1004     The rest of the fields have the same meanings as those in the 
1005     SubsequentSolidFillTrap function. 
1006
1007
1008
10092.7   8x8 Color Pattern Fills
1010  
    8x8 color pattern data is 64 pixels of full color data that
    is stored linearly in offscreen video memory.  8x8 color patterns
    are useful as a substitute for 8x8 mono patterns when tiling,
    doing opaque stipples, or in the case where transparency is
    supported, regular stipples.  8x8 color pattern fills also have
    the additional benefit of being able to tile full color 8x8
    patterns instead of just 2 color ones like the mono patterns.
    However, full color 8x8 patterns aren't used very often in the
    X Window System so you might consider passing this primitive
    by if you already can do mono patterns, especially if they
    require a lot of cache area.  Color8x8PatternFillFlags is
    the flags field for this primitive and the GXCOPY_ONLY,
    ROP_NEEDS_SOURCE and NO_PLANEMASK flags as described in
    Section 2.0 are valid as well as the following:
1025
1026
1027    HARDWARE_PATTERN_PROGRAMMED_ORIGIN
1028
1029      If the hardware supports programmable pattern offsets then
1030      this option should be set.  
1031
1032    HARDWARE_PATTERN_SCREEN_ORIGIN
1033
1034      Some hardware wants the pattern offset specified with respect to the
1035      upper left-hand corner of the primitive being drawn.  Other hardware 
1036      needs the option HARDWARE_PATTERN_SCREEN_ORIGIN set to indicate that 
1037      all pattern offsets should be referenced to the upper left-hand 
      corner of the screen.  HARDWARE_PATTERN_SCREEN_ORIGIN is preferable
      since this is more natural for the X Window System and offsets will
      have to be recalculated for each Subsequent function otherwise.
1041
1042    NO_TRANSPARENCY
1043    TRANSPARENCY_GXCOPY_ONLY
1044
1045      These mean the same as for the ScreenToScreenCopy functions.
1046
1047
    The following table describes the meanings of patx and paty passed
    to the SetupFor and Subsequent functions:
1050
1051    HARDWARE_PATTERN_PROGRAMMED_ORIGIN && HARDWARE_PATTERN_SCREEN_ORIGIN
1052	
1053	SetupFor: patx and paty hold the x,y location of the unrotated 
1054		  pattern.
1055
1056	Subsequent: patx and paty hold the pattern offset.  For the case
1057		    of HARDWARE_PATTERN_SCREEN_ORIGIN all Subsequent calls
1058		    have the same offset so only the first call will need
1059		    to be observed.
1060
1061    
1062    HARDWARE_PATTERN_PROGRAMMED_ORIGIN only
1063
1064	SetupFor: patx and paty hold the x,y location of the unrotated
1065		  pattern.
1066
1067	Subsequent: patx and paty hold the pattern offset. 
1068
    HARDWARE_PATTERN_SCREEN_ORIGIN only
1070
1071	SetupFor: patx and paty hold the x,y location of the rotated pattern.
1072
1073	Subsequent: patx and paty hold the same location as the SetupFor
1074		    function so these can be ignored.
1075
    Neither flag
1077
1078	SetupFor: patx and paty hold the x,y location of the unrotated
1079		  pattern.  This can be ignored.
1080
1081	Subsequent: patx and paty hold the x,y location of the rotated
1082		    pattern.
1083
1084    Additional information about cached patterns...
1085    All 8x8 color patterns are cached in offscreen video memory so
1086    the pixmap cache must be enabled to use them. The first pattern
1087    starts at the cache slot boundary which is set by the 
1088    CachePixelGranularity field used to configure the pixmap cache.
1089    One should ensure that the CachePixelGranularity reflects any 
1090    alignment restrictions that the accelerator may put on 8x8 pattern 
1091    storage locations.  When HARDWARE_PATTERN_PROGRAMMED_ORIGIN is set 
1092    there is only one pattern stored.  When this flag is not set,
    all 64 rotations of the pattern are accessible but it is assumed
    that the accelerator is capable of accessing data stored on 8
    pixel boundaries.  If the accelerator has stricter alignment
    requirements than this the driver will need to provide its own
    8x8 color pattern caching routines.
1098
1099
1100void SetupForColor8x8PatternFill(ScrnInfoPtr pScrn, int patx, int paty,
1101        	int rop, unsigned int planemask, int trans_color)
1102
1103    SetupForColor8x8PatternFill indicates that any combination of the 
1104    following  may follow it.
1105
1106	SubsequentColor8x8PatternFillRect
1107	SubsequentColor8x8PatternFillTrap	(not implemented yet)
1108
1109    For the meanings of patx and paty, see the table above.  Trans_color
1110    means the same as for the ScreenToScreenCopy functions.
1111
1112
1113 
1114void SubsequentColor8x8PatternFillRect( ScrnInfoPtr pScrn,
1115        	int patx, int paty, int x, int y, int w, int h)
1116
     Fill a rectangle of dimensions "w" by "h" with origin at (x,y)
     using the parameters given by the last SetupForColor8x8PatternFill
     call.  The meanings of patx and paty can be determined by the
     table above.
1121
1122void SubsequentColor8x8PatternFillTrap( ScrnInfoPtr pScrn,
1123     			   int patx, int paty, int y, int h, 
1124     			   int left, int dxL, int dyL, int eL,
1125     			   int right, int dxR, int dyR, int eR )
1126
1127    For the meanings of patx and paty, see the table above. 
1128    The rest of the fields have the same meanings as those in the 
1129    SubsequentSolidFillTrap function. 
1130
1131
1132
11332.8  Image Writes
1134
    XAA provides a mechanism for transferring full color pixel data from
    system memory to video memory through the accelerator.  This is
    useful for dealing with alignment issues and performing raster ops
    on the data when writing it to the framebuffer.  As with color
    expansion rectangles, there is a direct and indirect method.  The
    direct method sends all data through a memory mapped aperture.
    The indirect method sends the data to an intermediate buffer one
    scanline at a time.
1143
1144    The direct and indirect methods have separate flags fields, the
1145    ImageWriteFlags and ScanlineImageWriteFlags respectively.
1146    Flags specific to one method or the other are described in sections 
1147    2.8.1 and 2.8.2 but for both cases the GXCOPY_ONLY, ROP_NEEDS_SOURCE
1148    and NO_PLANEMASK flags described in Section 2.0 are valid as well as
1149    the following:
1150
1151    NO_GXCOPY
1152
1153      In order to have accelerated image transfers faster than the 
1154      software versions for GXcopy, the engine needs to support clipping,
1155      be using the direct method and have a large enough image transfer
1156      range so that CPU_TRANSFER_BASE_FIXED doesn't need to be set.
      If these are not supported, then it is unlikely that transferring
1158      the data through the accelerator will be of any advantage for the
1159      simple case of GXcopy.  In fact, it may be much slower.  For such
1160      cases it's probably best to set the NO_GXCOPY flag so that 
1161      Image writes will only be used for the more complicated rops.
1162
1163    /* transparency restrictions */
1164
1165    NO_TRANSPARENCY
1166     
1167      This indicates that the accelerator does not support skipping
1168      of color keyed pixels when copying from the source to the destination.
1169
1170    TRANSPARENCY_GXCOPY_ONLY
1171
1172      This indicates that the accelerator supports skipping of color keyed
1173      pixels only when the rop is GXcopy.
1174
1175    /* clipping  (optional) */
1176    
1177    LEFT_EDGE_CLIPPING
1178 
1179      This indicates that the accelerator supports omission of up to
1180      3 pixels on the left edge of the rectangle to be filled.  This
1181      is beneficial since it allows transfer from the source pixmap to
1182      always occur from DWORD boundaries. 
1183
1184    LEFT_EDGE_CLIPPING_NEGATIVE_X
1185
1186      This flag indicates that the accelerator can fill areas with
1187      image write data even if the value of x origin is negative (off of
1188      the screen on the left edge).
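
    The idea behind LEFT_EDGE_CLIPPING can be sketched as follows: if
    the first source pixel does not sit on a DWORD boundary, the
    transfer starts at the preceding boundary and the extra leading
    pixels are skipped by the accelerator.  How XAA itself derives
    skipleft is not spelled out in this document, so the helper below
    is an assumption; it covers 1, 2 and 4 bytes per pixel only and
    its name is hypothetical.

```c
#include <stdint.h>

/*
 * Number of pixels between the preceding DWORD boundary and the first
 * source pixel.  bytes_per_pixel must be 1, 2 or 4 in this sketch.
 */
int
image_write_skipleft(uintptr_t src_addr, int bytes_per_pixel)
{
    return (int)(src_addr & 3u) / bytes_per_pixel;
}

/* Address the transfer would start from: the DWORD boundary itself. */
uintptr_t
image_write_aligned_start(uintptr_t src_addr)
{
    return src_addr & ~(uintptr_t)3;
}
```

    At 8bpp this yields at most 3 pixels, which matches the limit
    stated for LEFT_EDGE_CLIPPING.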
1189
1190
11912.8.1 The Direct Method
1192
    Using the direct method of ImageWrite, XAA will send all
    pixel data to the accelerator serially through a memory mapped
    transfer window defined by the following two fields:
1196
1197      unsigned char *ImageWriteBase
1198
1199        This indicates the memory address of the beginning of the aperture.
1200
1201      int ImageWriteRange
1202
1203        This indicates the size in bytes of the aperture.
1204
    The driver should specify how the transferred data should be padded.
    There are options for both the padding of each Y scanline and for the
    total transfer to the aperture.
    One of the following two flags must be set:
1209
1210      CPU_TRANSFER_PAD_DWORD
1211
        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be DWORD padded.  This is the default behavior.
1214
1215      CPU_TRANSFER_PAD_QWORD 
1216
        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be QWORD padded.  With this set, XAA will send
        an extra DWORD to the aperture when needed to ensure that only
        an even number of DWORDs are sent.
1221
1222    And then there are the flags for padding of each scanline:
1223
1224      SCANLINE_PAD_DWORD
1225
1226	This indicates that each Y scanline should be DWORD padded.
1227        This is the only option available and is the default.
1228
1229    Finally, there is the CPU_TRANSFER_BASE_FIXED flag which indicates
1230    that the aperture is a single register rather than a range of
1231    registers, and XAA should write all of the data to the first DWORD.
    XAA will automatically select CPU_TRANSFER_BASE_FIXED if the
    ImageWriteRange is not large enough to accommodate an entire scanline.
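
    The padding rules above determine how much data arrives at the
    aperture for one rectangle.  The helpers below are a sketch of
    that arithmetic with hypothetical names; "pad_qword" stands in for
    CPU_TRANSFER_PAD_QWORD, and "w" is taken to be the width of the
    transfer in pixels.

```c
#include <stddef.h>

/* Bytes in one scanline after SCANLINE_PAD_DWORD padding. */
size_t
padded_scanline_bytes(int w, int bpp)
{
    size_t bytes = ((size_t)w * (size_t)bpp + 7) / 8;

    return (bytes + 3) & ~(size_t)3;
}

/* DWORDs sent for the whole rectangle. */
size_t
image_write_dwords(int w, int h, int bpp, int pad_qword)
{
    size_t dwords = padded_scanline_bytes(w, bpp) / 4 * (size_t)h;

    /* QWORD padding: one extra DWORD if the total would be odd. */
    if (pad_qword && (dwords & 1))
        dwords++;

    return dwords;
}

/* Nonzero if the range cannot hold one entire scanline. */
int
base_fixed_forced(int w, int bpp, size_t image_write_range)
{
    return image_write_range < padded_scanline_bytes(w, bpp);
}
```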
1234
1235
1236void SetupForImageWrite(ScrnInfoPtr pScrn, int rop, unsigned int planemask,
1237        			int trans_color, int bpp, int depth)
1238
     If trans_color is not -1 then trans_color indicates the transparency
     color key and pixels with color trans_color passed through the
     aperture should not be transferred to the screen but should be
     skipped.  Bpp and depth indicate the bits per pixel and depth of
     the source pixmap.  Trans_color is always -1 if the NO_TRANSPARENCY
     flag is set.
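
     The effect of trans_color can be modelled in software like this.
     It is a sketch of the expected result only, assuming a GXcopy
     rop, a full planemask and 32 bit pixels; the function name is
     hypothetical.

```c
#include <stdint.h>

/*
 * Software model of one image write scanline.  Source pixels equal to
 * trans_color are skipped unless trans_color is -1.
 */
void
image_write_scanline_model(uint32_t *dst, const uint32_t *src, int w,
                           int trans_color)
{
    int i;

    for (i = 0; i < w; i++) {
        if (trans_color != -1 && src[i] == (uint32_t)trans_color)
            continue;           /* color keyed pixel: leave dst alone */
        dst[i] = src[i];
    }
}
```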
1245
1246
1247void SubsequentImageWriteRect(ScrnInfoPtr pScrn, 
1248				int x, int y, int w, int h, int skipleft)
1249
1250     
     Data passed through the aperture should be copied to a rectangle
1252     of width "w" and height "h" with origin (x,y).  If LEFT_EDGE_CLIPPING
1253     has been enabled, skipleft will correspond to the number of pixels
1254     on the left edge that should not be drawn.  Skipleft is zero 
1255     otherwise.
1256
     It can be arranged for XAA to call Sync() after it is through
     calling the Subsequent functions by setting SYNC_AFTER_IMAGE_WRITE
     in the ImageWriteFlags.  This can provide the driver with an
     opportunity to reset a clipping window if needed.
1261
12622.8.2  The Indirect Method
1263
     Using the indirect method, XAA will render the pixel data one
     scanline at a time to one or more buffers.  These buffers may be
     memory mapped apertures or just intermediate storage.
1267
1268     int NumScanlineImageWriteBuffers
1269
1270       This indicates the number of buffers available.
1271
1272     unsigned char **ScanlineImageWriteBuffers
1273
1274       This is an array of pointers to the memory locations of each buffer.
1275       Each buffer is expected to be large enough to accommodate scanlines
1276       the width of the screen.  That is:
1277
         pScrn->virtualX * pScrn->bitsPerPixel/8   bytes or more.
1279
1280       If LEFT_EDGE_CLIPPING_NEGATIVE_X is set, add an additional 4
1281       bytes to that requirement in 8 and 16bpp, 12 bytes in 24bpp.
1282  
1283     Scanlines are always DWORD padded.
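
     The buffer size rule can be written out as below.  The function
     name and parameters are hypothetical.  The text gives the extra
     bytes for 8, 16 and 24bpp only, so the sketch adds nothing at
     other depths; that choice is an assumption.

```c
#include <stddef.h>

/* Minimum size in bytes of each ScanlineImageWriteBuffers entry. */
size_t
scanline_image_write_buffer_bytes(int virtualX, int bitsPerPixel,
                                  int negative_x_clipping)
{
    size_t bytes = (size_t)virtualX * (size_t)bitsPerPixel / 8;

    if (negative_x_clipping) {
        if (bitsPerPixel == 24)
            bytes += 12;
        else if (bitsPerPixel == 8 || bitsPerPixel == 16)
            bytes += 4;
        /* Other depths are not covered by the text (assumption: +0). */
    }
    return bytes;
}
```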
1284
1285void SetupForScanlineImageWrite(ScrnInfoPtr pScrn, int rop, 
1286				unsigned int planemask, int trans_color, 
1287				int bpp, int depth)
1288
     If trans_color is not -1 then trans_color indicates the transparency
     color key and pixels with color trans_color in the buffer should not
     be transferred to the screen but should be skipped.  Bpp and depth
     indicate the bits per pixel and depth of the source pixmap.
     Trans_color is always -1 if the NO_TRANSPARENCY flag is set.
1294
1295
void SubsequentScanlineImageWriteRect(ScrnInfoPtr pScrn,
				int x, int y, int w, int h, int skipleft)


void SubsequentImageWriteScanline(ScrnInfoPtr pScrn, int bufno)


    When SubsequentScanlineImageWriteRect is called, XAA will begin
    transferring the source data one scanline at a time, calling
    SubsequentImageWriteScanline after each scanline.  If more than
1306    one buffer is available, XAA will cycle through the buffers.
1307    Subsequent scanlines will use the next buffer and go back to the
1308    buffer 0 again when the last buffer is reached.  The index into
1309    the ScanlineImageWriteBuffers array is presented as "bufno"
1310    with each SubsequentImageWriteScanline call.
1311
1312    The skipleft field is the same as for the direct method.
1313
    The indirect method can be used to send the source data directly
    to a memory mapped aperture represented by a single image write
    buffer, one scanline at a time, but more commonly it is used to place
1317    the data into offscreen video memory so that the accelerator can 
1318    blit it to the visible screen from there.  In the case where the
1319    accelerator permits rendering into offscreen video memory while
1320    the accelerator is active, several buffers can be used so that
1321    XAA can be placing source data into the next buffer while the
1322    accelerator is blitting the current buffer.  For cases where
1323    the accelerator requires some special manipulation of the source
1324    data first, the buffers can be in system memory.  The CPU can
1325    manipulate these buffers and then send the data to the accelerator.
1326
1327
13282.9 Clipping
1329
    XAA supports hardware clipping rectangles.  To use clipping
    in this way it is expected that the graphics accelerator can
    clip primitives with vertices anywhere in the 16 bit signed
    coordinate system.
1334
1335void SetClippingRectangle ( ScrnInfoPtr pScrn,
1336        		int left, int top, int right, int bottom)
1337
1338void DisableClipping (ScrnInfoPtr pScrn)
1339
1340    When SetClippingRectangle is called, all hardware rendering
1341    following it should be clipped to the rectangle specified
1342    until DisableClipping is called.
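
    The Set/Disable pairing can be modelled in software as follows.
    This sketch assumes that right and bottom are inclusive
    coordinates, which this document does not state; "struct
    ClipState" and the model_ functions are hypothetical names.

```c
/* Clipping state that a driver would normally keep in hardware. */
struct ClipState {
    int enabled;
    int left, top, right, bottom;   /* assumed inclusive */
};

void
model_set_clipping_rectangle(struct ClipState *c,
                             int left, int top, int right, int bottom)
{
    c->enabled = 1;
    c->left = left;
    c->top = top;
    c->right = right;
    c->bottom = bottom;
}

void
model_disable_clipping(struct ClipState *c)
{
    c->enabled = 0;
}

/*
 * Clip the rectangle (x, y, w, h) in place.  Returns 0 if nothing is
 * left to draw, in which case the outputs are unspecified.
 */
int
model_clip_rect(const struct ClipState *c, int *x, int *y, int *w, int *h)
{
    int x2 = *x + *w - 1;   /* inclusive far corner */
    int y2 = *y + *h - 1;

    if (!c->enabled)
        return *w > 0 && *h > 0;

    if (*x < c->left)   *x = c->left;
    if (*y < c->top)    *y = c->top;
    if (x2 > c->right)  x2 = c->right;
    if (y2 > c->bottom) y2 = c->bottom;

    if (x2 < *x || y2 < *y)
        return 0;

    *w = x2 - *x + 1;
    *h = y2 - *y + 1;
    return 1;
}
```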
1343
1344    The ClippingFlags field indicates which operations this sort
1345    of Set/Disable pairing can be used with.  Any of the following
1346    flags may be OR'd together.
1347
1348	HARDWARE_CLIP_SCREEN_TO_SCREEN_COLOR_EXPAND
1349	HARDWARE_CLIP_SCREEN_TO_SCREEN_COPY
1350	HARDWARE_CLIP_MONO_8x8_FILL
1351	HARDWARE_CLIP_COLOR_8x8_FILL
1352	HARDWARE_CLIP_SOLID_FILL
1353	HARDWARE_CLIP_DASHED_LINE
1354	HARDWARE_CLIP_SOLID_LINE
1355
1356
1357
13583)  XAA PIXMAP CACHE
1359
1360   /* NOTE:  XAA has no knowledge of framebuffer particulars so until
1361	the framebuffer is able to render into offscreen memory, usage
1362	of the pixmap cache requires that the driver provide ImageWrite
1363	routines or a WritePixmap or WritePixmapToCache replacement so
1364	that patterns can even be placed in the cache.
1365
1366      ADDENDUM: XAA can now load the pixmap cache without requiring
1367	that the driver supply an ImageWrite function, but this can
1368	only be done on linear framebuffers.  If you have a linear
1369	framebuffer, set LINEAR_FRAMEBUFFER in the XAAInfoRec.Flags
1370	field and XAA will then be able to upload pixmaps into the
1371	cache without the driver providing functions to do so.
1372   */
1373
1374
1375   The XAA pixmap cache provides a mechanism for caching of patterns
1376   in offscreen video memory so that tiled fills and in some cases
1377   stippling can be done by blitting the source patterns from offscreen
1378   video memory. The pixmap cache also provides the mechanism for caching 
1379   of 8x8 color and mono hardware patterns.  Any unused offscreen video
1380   memory gets used for the pixmap cache and that information is 
1381   provided by the XFree86 Offscreen Memory Manager. XAA registers a 
1382   callback with the manager so that it can be informed of any changes 
1383   in the offscreen memory configuration.  The driver writer does not 
1384   need to deal with any of this since it is all automatic.  The driver 
1385   merely needs to initialize the Offscreen Memory Manager as described 
1386   in the DESIGN document and set the PIXMAP_CACHE flag in the 
1387   XAAInfoRec.Flags field.  The Offscreen Memory Manager initialization 
1388   must occur before XAA is initialized or else pixmap cache 
1389   initialization will fail.  
1390
1391   PixmapCacheFlags is an XAAInfoRec field which allows the driver to
1392   control pixmap cache behavior to some extent.  Currently only one
1393   flag is defined:
1394
1395   DO_NOT_BLIT_STIPPLES
1396
1397     This indicates that the stippling should not be done by blitting
1398     from the pixmap cache.  This does not apply to 8x8 pattern fills. 
1399
1400
   CachePixelGranularity is an optional field.  If the hardware requires
   that 8x8 patterns have some particular pixel alignment it should
   be reflected in this field.  Ignoring this field or setting it to
   zero or one means there are no alignment issues.
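
   The effect of the granularity on a pattern's X position can be
   sketched as a round-up.  The function name is hypothetical and the
   code only illustrates the alignment rule, not XAA's cache layout.

```c
/*
 * Round x up to the next multiple of the cache pixel granularity.
 * A granularity of zero or one means no alignment restriction.
 */
int
align_to_granularity(int x, int granularity)
{
    if (granularity <= 1)
        return x;

    return ((x + granularity - 1) / granularity) * granularity;
}
```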
1405
1406
14074)  OFFSCREEN PIXMAPS
1408
1409   XAA has the ability to store pixmap drawables in offscreen video 
1410   memory and render into them with full hardware acceleration.  Placement
1411   of pixmaps in the cache is done automatically on a first-come basis and 
1412   only if there is room.  To enable this feature, set the OFFSCREEN_PIXMAPS
1413   flag in the XAAInfoRec.Flags field.  This is only available when a
1414   ScreenToScreenCopy function is provided, when the Offscreen memory 
1415   manager has been initialized and when the LINEAR_FRAMEBUFFER flag is
1416   also set.
1417
1418   int maxOffPixWidth
1419   int maxOffPixHeight
1420
1421       These two fields allow the driver to limit the maximum dimensions
1422     of an offscreen pixmap.  If one of these is not set, it is assumed
1423     that there is no limit on that dimension.  Note that if an offscreen
1424     pixmap with a particular dimension is allowed, then your driver will be
1425     expected to render primitives as large as that pixmap.  
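
   The size limits can be read as the following check.  This is a
   sketch with a hypothetical function name, and it assumes that a
   field which has not been set holds zero.

```c
/*
 * Nonzero if a w by h pixmap is within the driver's offscreen pixmap
 * limits.  A limit of zero is treated as "no limit" (assumption).
 */
int
offscreen_pixmap_allowed(int w, int h,
                         int maxOffPixWidth, int maxOffPixHeight)
{
    if (maxOffPixWidth && w > maxOffPixWidth)
        return 0;
    if (maxOffPixHeight && h > maxOffPixHeight)
        return 0;
    return 1;
}
```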
1426
1427$XFree86: xc/programs/Xserver/hw/xfree86/xaa/XAA.HOWTO,v 1.12 2000/04/12 14:44:42 tsi Exp $
1428