- Acceleration
	- Blits and solid fill

  - XAA and the shadow buffer will not work together, because the
    shadow buffer updates in the block handler, so if we got any XAA
    calls in between, things would get messed up.

    Current plan:
	- Add our own damage tracker that produces raw rectangles
	- Whenever it fires, submit the copy immediately

	- Wrap the necessary ops in such a way that the original
	  implementation gets called first. The original implementation
	  will use fb, which will produce damage, which will get
	  submitted.

	  If we decide to accelerate a particular operation, first set
	  a flag so that the immediately following damage event does not
	  result in spice protocol being sent. I.e.,

	  on_op:
		qxl->enable_copying = FALSE

		call original;

		send acceleration command

		qxl->enable_copying = TRUE

	  Note that damage is added before the drawing hits the framebuffer,
	  so it will have to be stored, then cleared
		- in a block handler
		- before accelerating

	  I.e.,

	  on_op:
		clear damage
		disable damage reporting
		call original (this will generate unreported damage and
			paint to the shadow)
		submit command
		enable damage

	  It may be possible to use the shadow code if we added a
	  shadowReportNow() that would report any existing
	  damage. I.e., basically export shadowRedisplay()
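
A minimal C sketch of the second on_op sequence above, for illustration
only. All types and function names here (qxl_state_t, on_damage,
flush_damage, fb_solid_fill, wrapped_solid_fill) are hypothetical
stand-ins, not X server or driver API. It also assumes that "clear damage"
means submitting the stored rectangles as raw copies and then emptying the
list.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 16

/* Hypothetical stand-ins; none of these are real X server or qxl types. */
typedef struct { int x1, y1, x2, y2; } rect_t;

typedef struct {
    bool   damage_enabled;        /* when false, damage reports are dropped */
    int    n_pending;             /* stored rectangles not yet copied */
    rect_t pending[MAX_PENDING];
    int    n_copies_sent;         /* raw copies submitted to the device */
    int    n_commands_sent;       /* accelerated commands submitted */
} qxl_state_t;

/* Damage callback: fires before the drawing reaches the framebuffer,
 * so the rectangle is only stored here, not copied. */
void on_damage(qxl_state_t *qxl, rect_t r)
{
    if (!qxl->damage_enabled)
        return;
    assert(qxl->n_pending < MAX_PENDING);
    qxl->pending[qxl->n_pending++] = r;
}

/* Submit every stored rectangle as a raw copy, then clear the list.
 * A block handler would call this as well. */
void flush_damage(qxl_state_t *qxl)
{
    qxl->n_copies_sent += qxl->n_pending;
    qxl->n_pending = 0;
}

/* Stand-in for the original fb implementation: it reports damage,
 * then paints to the shadow in software. */
void fb_solid_fill(qxl_state_t *qxl, rect_t r)
{
    on_damage(qxl, r);
    /* software rendering would happen here */
}

/* The wrapped op. */
void wrapped_solid_fill(qxl_state_t *qxl, rect_t r, bool can_accelerate)
{
    if (!can_accelerate) {
        fb_solid_fill(qxl, r);    /* damage is stored for a later flush */
        return;
    }
    flush_damage(qxl);            /* clear damage */
    qxl->damage_enabled = false;  /* disable damage reporting */
    fb_solid_fill(qxl, r);        /* call original: unreported damage */
    qxl->n_commands_sent++;       /* submit command */
    qxl->damage_enabled = true;   /* enable damage */
}
```

With this shape, an unaccelerated fill leaves one stored rectangle behind,
and a following accelerated fill first flushes it as a raw copy and then
sends exactly one command without storing new damage.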

    1. Get damage added, out of CreateScreenResources
    2. Make sure it works
    3. Submit copies and disable shadow
    4. Delete shadow
    5. Wrap some of the ops, or use XAA?

    The input we get is:

	- First a damage notification: "I am going to draw here"
	- Then maybe an exa notification

	So the algorithm is:

	Maintain a "to_copy" region to be copied into the device

	- in damage, if there is anything in to_copy, copy it

	- in block handler, if there is anything in to_copy, copy it

	- in exa, if we manage to accelerate, delete to_copy.

	Unfortunately, for core text, what happens is
		- damage is produced for the glyph box
		- solid fill is generated
		- the glyph is drawn
	And the algorithm above means the damage is thrown away.
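
The to_copy algorithm and the core text problem can be reproduced with a
small C model. This is only a sketch: tracker_t, box_t and the on_*
functions are hypothetical names, the region is simplified to a single box,
and the sketch assumes that a new damage notification replaces to_copy after
the old contents have been copied.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int x1, y1, x2, y2; } box_t;

/* Hypothetical model of the state described above. */
typedef struct {
    bool  has_to_copy;    /* is to_copy non-empty? */
    box_t to_copy;        /* region still to be copied into the device */
    int   copies_sent;    /* raw copies submitted */
    int   commands_sent;  /* accelerated commands submitted */
} tracker_t;

/* Copy to_copy into the device if there is anything in it. */
void flush_to_copy(tracker_t *t)
{
    if (t->has_to_copy) {
        t->copies_sent++;
        t->has_to_copy = false;
    }
}

/* Damage notification: "I am going to draw here". */
void on_damage(tracker_t *t, box_t box)
{
    assert(box.x1 <= box.x2 && box.y1 <= box.y2);
    flush_to_copy(t);         /* copy what the previous op left behind */
    t->to_copy = box;
    t->has_to_copy = true;
}

/* Block handler: copy anything that is still pending. */
void on_block_handler(tracker_t *t)
{
    flush_to_copy(t);
}

/* exa notification: if we manage to accelerate, delete to_copy. */
void on_exa_op(tracker_t *t, bool accelerated)
{
    if (accelerated) {
        t->commands_sent++;
        t->has_to_copy = false;
    }
}
```

Replaying the core text sequence (on_damage for the glyph box, an
accelerated on_exa_op for the solid fill, the glyph drawn in software with
no further notification, then on_block_handler) ends with one command sent
and no copies, so the glyph pixels never reach the device.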

- Coding style fixes

- Better malloc() implementation
	- Take malloc() from the Windows driver?
	- Put blocks in a tree?

- Find out why it picks 800x600 rather than a reasonable mode
	- Possibly has to do with the timings it reports. RandR only
	  allows 800x600 and 640x480.

- Only compile mmtest if glib is installed
	Or maybe just get rid of mmtest.c

- Notes on offscreen pixmaps

  Yaniv says that PCI resources are a concern and that it would be better
  if we can use guest memory instead of video memory. I guess we can
  do that, given a kernel driver that can allocate pinned memory.

	- If/when we add hardware acceleration to pixman, pixman will need to
	  generate QXL protocol. This could be tricky because DRM assumes that
	  everything is a pixmap, but qxl explicitly has a framebuffer. Same
	  goes for cairo-drm.

- Hashing

  QXL has a feature where it can send hash codes for pixmaps. Unfortunately
  most of the pixmaps we use are very short-lived. But there may be a benefit
  for the root pixmap (and in general for the (few) windows that have
  a pixmap background).

  - When copying from pixmap to framebuffer, right now we just copy
    the bits from the fb-allocated pixmap.

  - With hashing, we need to copy it to video memory, hash it, then set the
    "unique" field to that hash value (plus the QXL_CACHE
    flag). Presumably we'll get a normal remove on it when it is no
    longer in use.

  - If we know an image is available in video memory already, we should just
    submit it. There is no race condition here because the image is
    ultimately removed from vmem by the driver.

    (Note that the hash value could probably just be the XID plus a
    serial number.)

  - So for the proof of concept we'll be hashing complete pixmaps every time
    we submit them.
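
For the proof of concept, hashing a complete pixmap and tagging the image
could look roughly like the C sketch below. Everything in it is an
assumption made for illustration: example_image_t and EXAMPLE_CACHE_FLAG
are hypothetical stand-ins for the real image descriptor and the QXL_CACHE
flag mentioned above, and 64-bit FNV-1a is just one convenient hash, not
something the notes or the protocol prescribe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value; the notes only name a "QXL_CACHE" flag. */
#define EXAMPLE_CACHE_FLAG 0x1u

/* Hypothetical stand-in for the image descriptor. */
typedef struct {
    uint64_t unique;   /* the "unique" field from the notes */
    uint32_t flags;
} example_image_t;

/* 64-bit FNV-1a over the pixel bytes (an arbitrary choice of hash). */
uint64_t hash_pixels(const uint8_t *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Hash the complete pixmap and mark the image as cacheable. */
void tag_image(example_image_t *img, const uint8_t *pixels, size_t len)
{
    assert(img != NULL && pixels != NULL);
    img->unique = hash_pixels(pixels, len);
    img->flags |= EXAMPLE_CACHE_FLAG;
}
```

Two submissions of identical pixel data get the same "unique" value, which
is what would let the other side recognize an image it already has. The
XID-plus-serial idea from the note above would replace hash_pixels() with
something much cheaper.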

- Tiles

  It may be beneficial to send pixmaps in smaller tiles, though Yaniv
  says we will need atomic drawing to prevent tearing.

- Video

  We should certainly support Xv. The scaled blits should be sent
  as commands, rather than done in software. Does spice support YUV images?
  If not, then it probably should.

- Multi-monitor:

  - Windows may not support more than dual-head, but we do support more than
    dual-head in spice. This is why they use the multi-PCI-device approach.

    I.e., the claim is that Yaniv did not find any API that would
    support more than two outputs per PCI device. (This seems dubious
    given that four-head cards do exist.)

  - Linux multi-monitor configuration supports hotplug of monitors,
    and you can't make up PCI devices from inside the driver.

  - On Windows the guest agent is responsible for setting the monitors
    and resolutions.

  - On Linux we should support EDID information, and enabling and
    disabling PCI devices on the fly is pretty difficult to deal with
    in X. I.e., we would need working support for both GPU hotplug and
    for shatter. This is just not happening in RHEL 5 or 6.

  - Reading back EDID over the spice protocol would be necessary
    because when you hit detect displays, that's what needs to happen.

Better acceleration:

- Given offscreen pixmaps, we should get rid of the shadow framebuffer.
  If we have to fall back to software, we can use the drawing area to
  get the area in question, then copy it to qxl_malloced memory,
  then draw there, then finally send the bits.

-=-=-=-=-

Done:

Question:

- Submit cursor images

- Note: when we set a mode, all allocated memory should be considered
  released.

- What is the "vram" PCI range used for?

  As I read the Windows driver, it can be mapped with the ioctl
  VIDEO_MAP_VIDEO_MEMORY. In driver.c it is mapped as pdev->fb, but
  it is then never used for anything as far as I can tell.

  Does Windows itself use that ioctl, and if so, for what? The area
  is only 32K in size so it can't really be used for any realistic
  bitmaps.

    It's a required ioctl.  I believe it's needed for DGA-like things.
    I have no idea how the Windows driver manages syncing for that,
    but I think we can safely ignore it. [ajax]

- Hook up randr if it isn't already

- Garbage collection
	- Before every allocation?
	- When we run out of memory?
	- Whenever we overflow some fixed pool?

- Get rid of qxl_mem.h header; just use qxl.h

- Split out ring code into qxl_ring.c

- Don't keep the maps around that are just used in preinit
	(Is there any real reason to not just do the CheckDevice in
	 ScreenInit?)