This file is NOT up to date for the New Design!


============== old (pre-ND) contents below ==============

"I just thought it would be useful if we had some kind of TODO and BUGS
files in the distribution as it would make it easier to see what needs
to be done and what could be done better, instead of browsing through the
source code. And we would be able to see the progress literally by the ever
decreasing TODO file :-)"


## BUGS:

All Tseng cards:

* We definitely NEED to fix that color-expansion problem. See Appendix A
below for a detailed explanation.

* There are still some problems with the HW cursor. The error message about
"wrong color selected" is disabled, and the limitation documented. Better
would be to have a way to dynamically switch to software-cursor mode if the
color cannot be made. The HW cursor doesn't work in DoubleScan modes yet
(only half of the cursor is displayed).

* The text font is sometimes corrupted when going back to text mode. This
may be related to the order in which registers are restored: the ARK driver
restores the extended registers before the standard registers for exactly
this reason.

* The code needs to be heavily reworked to fix all sorts of data type
problems. The current code will certainly not run on an Alpha. The first
step is to replace all hardware-related variables by CARD8/CARD16/CARD32
types.
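
As a rough illustration of the data type item above: on an Alpha a "long"
is 64 bits wide, so a variable that is meant to mirror a 32-bit register
must not be declared as "long". The sketch below uses <stdint.h> typedefs
as local stand-ins for CARD8/CARD16/CARD32 so that it compiles without the
X headers. The structure, its field names and pack_xy_count() are
hypothetical and do not describe the real Tseng register layout.

```c
#include <stdint.h>

/* Local stand-ins for the X server's CARD8/CARD16/CARD32 types
 * (assumption: a real driver would take them from the X headers). */
typedef uint8_t  CARD8;
typedef uint16_t CARD16;
typedef uint32_t CARD32;

/* Hypothetical register image; the field names are illustrative only. */
struct acl_reg_image {
    CARD32 dest_addr;   /* always 32 bits, also on a 64-bit CPU */
    CARD16 x_count;
    CARD16 y_count;
    CARD8  rop;
};

/* Pack two 16-bit counts into one 32-bit register value.
 * The casts make the intended width explicit on every platform. */
static CARD32 pack_xy_count(CARD16 x_count, CARD16 y_count)
{
    return ((CARD32)y_count << 16) | (CARD32)x_count;
}
```

With fixed-width types the size of each field no longer depends on the
compiler's idea of "int" and "long".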


ET6000:

* The trapezoid code is disabled because it doesn't comply with the way the
non-accelerated ("cfb") code does things. This needs to be fixed.


ET-4000(W32):

* Hardware cursor support for the W32 is still lacking color support. We
need to reserve color cells #0 and #255 to make this work. From discussions
on the development list, it seems the best solution is to allocate these cells
read-write, and then use them for the HW cursor. We MUST however document
that this will break some clients which depend on a fixed color in cell #0,
and some others that rely on the presence of 256 color cells. It will also
cause cursor color problems when someone uses a local color map.


## TODO:

All cards:

* The accelerator on the Tseng devices is capable of much more. Especially
the pattern support is not used most of the time: It can render a pattern in
just about every accelerated operation. This means patterned lines, bitblts,
screencopies, etc. are possible. However, operations like these are very
uncommon in normal server use, so the speed benefit would go largely unnoticed.


ET4000:

* Support needs to be added for several clockchips and RAMDACs:
   - 8-bit RAMDAC support for >8bpp modes: Sierra DACs and possibly others
   - AT&T 20C49x RAMDAC support is not correct.

* SuperProbe could use an update. It doesn't detect some of the RAMDACs that
are detected by the driver.

* Several of the color expansion-related accelerations are still only 8bpp.
It should be easy to use the same trick on those as on the standard color
expand code (use intermediate buffer, expand data before blitting).

* Many of the operations that the W32 family can't support natively (e.g.
FillRectSolid for 24bpp) can be performed using CPU-to-screen operations,
feeding the correct (color) information through the ACL aperture.


ET6000:

* Someone might want to look at how the bitBLT engine of the ET6000 is
constructed, and come up with some fancy ways of abusing it. We're still
only using a small part of it (I'm thinking about the compare map and the
extensions to the MIX hardware compared to the ET4000).

* MClk support is still lacking (that would also allow MClk-dependent
maximum bandwidth).

* Apart from the things mentioned above, I think the ET6000 server is
pretty complete. Some optimisations could possibly be added, for example
some assembler code for calculating a framebuffer address from X/Y
coordinates. That would help to speed up small blits.


=======================================================================
APPENDIX A: the color expansion problem
----------------------------------------

As suggested in the data book, we're doing font rendering using the
color-expansion (MIX map) capabilities of the Tseng accelerator.
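
For readers who have not met color expansion before, here is a minimal
software model of it: every bit of 1bpp font data selects either the
foreground or the background color for one destination pixel. This is only
a sketch for an 8bpp destination. The function name is made up, and the bit
order (bit 0 of each byte is the leftmost pixel) is an assumption chosen
for the example, not something taken from the data book.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of color expansion into an 8bpp destination.
 * Assumption: bit 0 of each MIX byte describes the leftmost pixel. */
static void color_expand_8bpp(const uint8_t *mix, size_t npixels,
                              uint8_t fg, uint8_t bg, uint8_t *dst)
{
    for (size_t i = 0; i < npixels; i++) {
        /* Pick the MIX bit that belongs to pixel i. */
        int bit = (mix[i / 8] >> (i % 8)) & 1;

        /* A set bit becomes foreground, a clear bit background. */
        dst[i] = bit ? fg : bg;
    }
}
```

The accelerator does the same job in hardware, which is why only one bit
per pixel has to be written by the CPU.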

We're using a ping-pong buffer scheme (triple buffering actually) in
off-screen memory to store one scanline worth of font data at a time. Each
of these scanlines is "blitted" to on-screen memory using the accelerator.
The scanline is the MIX map, and there's also a 4x1 solid foreground color
(SRC map), and a 4x1 solid background color (PAT map).

Basically, the flow is as follows:

	- setup accelerator for font-expansion

	- store scanline 1 in off-screen memory buffer 1

	- start operation

	- store scanline 2 in off-screen memory buffer 2

	- start operation

	- store scanline 3 in off-screen memory buffer 3

	- start operation

	- store scanline 4 in off-screen memory buffer 1

	- start operation

	... etc, until the whole line of text is drawn.

There is no explicit "waiting" for the accelerator to finish an operation
before starting a new one, because it has been set up to add "wait-states"
when the queue is full. We're aiming to use concurrency between the
accelerator and the storing of scanlines in the buffers. Anyway, waiting
after each operation doesn't help.

Now, in 99% of all cases, text is rendered OK. But in some cases, we're
seeing severe font corruption.
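
The buffer rotation in the flow above can be sketched as follows. Scanlines
are counted from 0 here, so scanline 0 uses the first buffer and scanline 3
wraps around to the first buffer again (the list above counts from 1). The
function names and the "base" and "bytes_per_buffer" parameters are
hypothetical and only serve the illustration.

```c
#include <stdint.h>

#define NUM_SCANLINE_BUFFERS 3u  /* triple buffering, as described above */

/* Index of the buffer used for a 0-based scanline: 0, 1, 2, 0, 1, ... */
static unsigned buffer_for_scanline(unsigned scanline)
{
    return scanline % NUM_SCANLINE_BUFFERS;
}

/* Off-screen byte offset of the buffer that holds a given scanline.
 * base: hypothetical offset of the first buffer in video memory.
 * bytes_per_buffer: hypothetical size of one scanline buffer. */
static uint32_t buffer_offset(uint32_t base, uint32_t bytes_per_buffer,
                              unsigned scanline)
{
    return base + buffer_for_scanline(scanline) * bytes_per_buffer;
}
```

A buffer is therefore reused three operations after it was last written.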

What we're seeing is this: sometimes, exactly 32 pixels of a scanline are
rendered with the scanline data that was there BEFORE, instead of the one
that was just written into the scanline buffer. In other words, 32 pixels of
line 2 (for example) are rendered at line 5. The rest of the scanline can be
OK (i.e. data from scanline 5 is actually written there).

Here's an attempt at showing you what _should_ have been rendered:

1
2  #####################################################################
3
4
5
6  #####################################################################
7
8
9
10 #####################################################################
11
12
13
14 #####################################################################
15



and what _is_ rendered sometimes (only an example):

1
2  #####################################################################
3
4
5
6  ########################                                #############
7
8
9
10 #####################################################################
11
12
13                         ########################
14 #####################################################################
15

At line 6, 32 pixels of the "black" scanline data from line 3 is rendered
instead of the actual full-white that would normally have to be there.
At line 13, the opposite happened (data from line 10 rendered at line 13).
This 32-pixel width of the "bug" is independent of the color depth: we're
seeing this at 8bpp as well as at 16bpp, 24bpp and 32bpp. 32 pixels each
time.

Remember, we're talking triple-buffering here, so the "wrongly" rendered
data is in fact the data that was in the scanline-buffer from the PREVIOUS
operation that used that buffer.

In fact, my best explanation is that sometimes, a whole DWORD (32 bits) of
data isn't in the video memory yet by the time the accelerator starts
rendering with it.

But the data _is_ being written to there by the driver software, because if
you restart the scanline-operation again, without writing any more data to
the scanline buffers (only the MIX address and the destination address are
reprogrammed to restart the scanline color expansion operation -- see code
in tseng_acl.c), data _is_ rendered correctly.



I have investigated this as far as I possibly can. I checked if the data was
actually written in video memory. It was. I checked all kinds of PCI-related
things, like write-gathering or write-reordering of the PCI chipset, etc. I
disabled all possible enhanced features, both on the PCI chipset, inside the
CPU, and on the ET6000.

What strikes me, is that the exact same problems are seen on ET4000W32p as
on the ET6000. This immediately rules out any special features that were
only added with the ET6000, like problems with the MDRAM cache buffers, etc.
It seems to be a generic problem to all Tseng accelerators.
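
To make the DWORD explanation from above a bit more concrete: the MIX map
holds one bit per pixel, so a single stale 32-bit word can corrupt at most
32 pixels, whatever the depth of the destination. The sketch below models
only that hypothesis, not the hardware. It compares the MIX data that was
intended with the data the accelerator might have seen, and counts the
pixels that come out wrong. The function name is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the pixels whose MIX bit differs between the intended scanline
 * data and the data actually read. One bit is one pixel, so the result
 * does not depend on the destination color depth. */
static unsigned wrong_pixels(const uint32_t *intended, const uint32_t *seen,
                             size_t ndwords)
{
    unsigned count = 0;

    for (size_t i = 0; i < ndwords; i++) {
        uint32_t diff = intended[i] ^ seen[i];  /* set bits = wrong pixels */

        while (diff != 0) {
            count += diff & 1u;
            diff >>= 1;
        }
    }
    return count;
}
```

If a full-white scanline is expanded while one DWORD still holds the old
all-black data, this returns 32, which matches the width of the corruption
described earlier.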

The exact same higher-level code is being used for other chipsets as well
(i.e. the system of writing scanlines of data to off-screen memory and
making the accelerator expand it into on-screen memory), and there are no
problems on these other chipsets. The acceleration architecture we're using
is completely device-independent up to the point where each chip needs to
provide a

	SetupForScanlineScreenToScreenColorExpand()

and a

	SubsequentScanlineScreenToScreenColorExpand()

function.

Since the higher-level code is being used by other chip drivers as well, it
seems to be OK.

So the problem is either in those device-dependent functions, or in the
hardware itself.


I have found one kludge to work around this problem, and it should (?) tell
you a lot about the problem: if I start each scanline-colorexpand operation
TWICE, rendering is suddenly perfect (at least there are so few rendering
errors that I haven't seen any yet).


I am including the two device-dependent functions so that you may be able to
follow what I'm saying here:



One entire line of text is drawn by calling the Setup() function ONCE. All
scanlines of text (16 of them in case of an 8x16 font) are drawn by filling
the off-screen scanline buffers and calling the Subsequent() function.
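
The calling pattern just described, together with the "start it twice"
kludge, can be sketched like this. These are NOT the driver's functions:
the hook structure, the simplified signatures and the counting stubs are
hypothetical and exist only to show how often each hook is called. The real
functions in tseng_acl.c take different arguments.

```c
#include <stddef.h>

/* Hypothetical, simplified hooks; the real signatures differ. */
struct expand_hooks {
    void (*setup)(void *ctx, int fg, int bg);
    void (*subsequent)(void *ctx, int y, unsigned buffer);
};

/* Draw one line of text: setup() once, then subsequent() per scanline.
 * If start_twice is nonzero, every operation is started a second time
 * without rewriting the buffer (the workaround described above). */
static void draw_text_line(const struct expand_hooks *hooks, void *ctx,
                           int fg, int bg, int y0, int font_height,
                           int start_twice)
{
    hooks->setup(ctx, fg, bg);
    for (int i = 0; i < font_height; i++) {
        unsigned buffer = (unsigned)i % 3u;  /* triple buffering */

        /* A driver would store scanline i in that buffer here. */
        hooks->subsequent(ctx, y0 + i, buffer);
        if (start_twice)
            hooks->subsequent(ctx, y0 + i, buffer);
    }
}

/* Counting stubs that stand in for the hardware-specific code. */
struct call_counts {
    int setups;
    int starts;
};

static void count_setup(void *ctx, int fg, int bg)
{
    (void)fg;
    (void)bg;
    ((struct call_counts *)ctx)->setups++;
}

static void count_subsequent(void *ctx, int y, unsigned buffer)
{
    (void)y;
    (void)buffer;
    ((struct call_counts *)ctx)->starts++;
}
```

For an 8x16 font this gives one setup and 16 starts per line of text, or 32
starts with the workaround enabled, so the kludge doubles the number of
accelerator operations.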



$XFree86: xc/programs/Xserver/hw/xfree86/drivers/tseng/README,v 1.12 2000/08/08 08:58:06 eich Exp $