- Acceleration
  - Blits and solid fill

  - XAA and the shadow buffer will not work together, because the
    shadow buffer updates in the block handler, so if we got any XAA
    calls in between, things would get messed up.

    Current plan:
      - Add our own damage tracker that produces raw rectangles
      - Whenever it fires, submit the copy immediately

      - Wrap the necessary ops in such a way that the original
        implementation gets called first. The original implementation
        will use fb, which will produce damage, which will get
        submitted.

        If we decide to accelerate a particular operation, first set
        a flag that the immediately following damage event should not
        result in spice protocol being sent. I.e.,

        on_op:
            qxl->enable_copying = FALSE

            call original;

            send acceleration command

            qxl->enable_copying = TRUE

        Note damage is added before the drawing hits the framebuffer, so
        it will have to be stored, then cleared
        - in a block handler
        - before accelerating

        I.e.,

        on_op:
            clear damage
            disable damage reporting
            call original (this will generate unreported damage and
                           paint to the shadow)
            submit command
            enable damage

        It may be possible to use the shadow code if we added a
        shadowReportNow() that would report any existing
        damage. I.e., basically export shadowRedisplay()

    1. Get damage added, out of CreateScreenResources
    2. Make sure it works
    3. Submit copies and disable shadow
    4. Delete shadow
    5. Wrap some of the ops, or use XAA?

    The input we get is:

    - First a damage notification: "I am going to draw here"
    - Then maybe an exa notification

    So the algorithm is:

    Maintain a "to_copy" region to be copied into the device

    - in damage, if there is anything in to_copy, copy it

    - in block handler, if there is anything in to_copy, copy it

    - in exa, if we manage to accelerate, delete to_copy.

    Unfortunately, for core text, what happens is
    - damage is produced for the glyph box
    - solid fill is generated
    - the glyph is drawn
    And the algorithm above means the damage is thrown away.

- Coding style fixes

- Better malloc() implementation
  - Take malloc() from the Windows driver?
  - Put blocks in a tree?

- Find out why it picks 8x6 rather than a reasonable mode
  - Possibly has to do with the timings it reports. RandR only
    allows 8x6 and 6x4.

- Only compile mmtest if glib is installed
  Or maybe just get rid of mmtest.c

- Notes on offscreen pixmaps

  Yaniv says that PCI resources are a concern and that it would be better
  if we can use guest memory instead of video memory. I guess we can
  do that, given a kernel driver that can allocate pinned memory.

  - If/when we add hardware acceleration to pixman, pixman will need to
    generate QXL protocol. This could be tricky because DRM assumes that
    everything is a pixmap, but qxl explicitly has a framebuffer. Same
    goes for cairo-drm.

- Hashing

  QXL has a feature where it can send hash codes for pixmaps. Unfortunately
  most of the pixmaps we use are very short-lived. But there may be a benefit
  for the root pixmap (and in general for the (few) windows that have
  a pixmap background).

  - When copying from pixmap to framebuffer, right now we just copy
    the bits from the fb allocated pixmap.

  - With hashing, we need to copy it to video memory, hash it, then set the
    "unique" field to that hash value (plus the QXL_CACHE
    flag). Presumably we'll get a normal remove on it when it is no
    longer in use.

  - If we know an image is available in video memory already, we should just
    submit it. There is no race condition here because the image is
    ultimately removed from vmem by the driver.

    (Note hash value could probably just be XID plus a serial number).

  - So for the proof of concept we'll be hashing complete pixmaps every time
    we submit them.

- Tiles

  It may be beneficial to send pixmaps in smaller tiles, though Yaniv
  says we will need atomic drawing to prevent tearing.

- Video

  We should certainly support Xv. The scaled blits should be sent
  as commands, rather than rendered in software.
  Does spice support YUV images? If not, then it probably should.

- Multi-monitor:

  - Windows may not support more than dual-head, but we do support more than
    dual-head in spice. This is why they use multiple PCI devices.

    I.e., the claim is that Yaniv did not find any API that would
    support more than two outputs per PCI device. (This seems dubious
    given that four-head cards do exist).

  - Linux multi-monitor configuration supports hotplug of monitors,
    and you can't make up PCI devices from inside the driver.

  - On Windows the guest agent is responsible for setting the monitors
    and resolutions.

  - On Linux we should support EDID information, and enabling and
    disabling PCI devices on the fly is pretty difficult to deal with
    in X. I.e., we would need working support for both GPU hotplug and
    for shatter. This is just not happening in RHEL 5 or 6.

  - Reading back EDID over the spice protocol would be necessary
    because when you hit detect displays, that's what needs to happen.

Better acceleration:

- Given offscreen pixmaps, we should get rid of the shadow framebuffer.
  If we have to fall back to software, we can use the drawing area to
  get the area in question, then copy it to qxl_malloced memory,
  then draw there, then finally send the bits.
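
Sketch of the "to_copy" bookkeeping from the Acceleration notes, to make
the core text problem concrete. None of this is driver code. Assumptions:
the region is reduced to a single box, and box_t, tracker_t and the on_*
functions are hypothetical names. That on_damage remembers the new area
after flushing the old one is also an inference; the notes leave that
step implicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a region: a single box. Real code would
 * use a proper region type. */
typedef struct {
    int x1, y1, x2, y2;
    bool empty;
} box_t;

typedef struct {
    box_t to_copy;    /* area still to be copied into the device */
    int copies_sent;  /* number of simulated copy submissions */
} tracker_t;

static void tracker_init(tracker_t *t)
{
    assert(t != NULL);
    t->to_copy = (box_t){ 0, 0, 0, 0, true };
    t->copies_sent = 0;
}

/* Copy whatever is pending (simulated by counting), then clear it. */
static void flush_to_copy(tracker_t *t)
{
    if (!t->to_copy.empty) {
        t->copies_sent++;
        t->to_copy.empty = true;
    }
}

/* Damage notification: "I am going to draw here". Flush the older
 * pending area, then remember the new one (assumed step). */
static void on_damage(tracker_t *t, box_t area)
{
    flush_to_copy(t);
    t->to_copy = area;
    t->to_copy.empty = false;
}

/* Block handler: flush anything still pending. */
static void on_block_handler(tracker_t *t)
{
    flush_to_copy(t);
}

/* EXA notification: if the operation was accelerated, the pending
 * software copy is dropped. */
static void on_exa(tracker_t *t, bool accelerated)
{
    if (accelerated)
        t->to_copy.empty = true;
}
```

For core text the sequence is on_damage(glyph box), on_exa(accelerated
solid fill), software glyph draw, on_block_handler. In this model
copies_sent stays at 0, so the glyph never reaches the device, which is
the lost damage described in the notes.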

-=-=-=-=-

Done:

Question:

- Submit cursor images

- Note: when we set a mode, all allocated memory should be considered
  released.

- What is the "vram" PCI range used for?

  As I read the Windows driver, it can be mapped with the ioctl
  VIDEO_MAP_VIDEO_MEMORY. In driver.c it is mapped as pdev->fb, but
  it is then never used for anything as far as I can tell.

  Does Windows itself use that ioctl, and if so, for what? The area
  is only 32K in size so it can't really be used for any realistic
  bitmaps.

    It's a required ioctl. I believe it's needed for DGA-like things.
    I have no idea how the Windows driver manages syncing for that,
    but I think we can safely ignore it. [ajax]

- Hook up randr if it isn't already

- Garbage collection
  - Before every allocation?
  - When we run out of memory?
  - Whenever we overflow some fixed pool?

- Get rid of qxl_mem.h header; just use qxl.h

- Split out ring code into qxl_ring.c

- Don't keep the maps around that are just used in preinit
  (Is there any real reason to not just do the CheckDevice in
  ScreenInit?)