SandyBridge's New Acceleration
------------------------------

The guiding principle behind the design is to avoid GPU context switches.
On SandyBridge (and beyond), these are especially pernicious because the
RENDER and BLT engines are now on different rings, and switching contexts
requires synchronising the various execution units. Context switches were
not cheap on earlier generations either, but with the increasing
complexity of the GPU, avoiding such serialisation is important.

Furthermore, we try very hard to avoid migrating between the CPU and GPU.
Every pixmap (apart from temporary "scratch" surfaces which we intend to
use on the GPU) is created in system memory. All operations are then done
upon this shadow copy until we are forced to move it onto the GPU. Such
migration can first be triggered in only three ways: setting the pixmap
as the scanout (we obviously need a GPU buffer here), using the pixmap as
a DRI buffer (the client expects to perform hardware acceleration and we
do not want to disappoint), and lastly using the pixmap as a RENDER
target. This last trigger is chosen because, when we know we are going to
perform hardware acceleration and will continue to do so without
fallbacks, using the GPU is much, much faster than the CPU. The heuristic
I chose, therefore, was that if the application uses RENDER (e.g. via
cairo), then it will only be using those paths and not intermixing core
drawing operations, and so is unlikely to trigger a fallback.
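
The migration rule above can be sketched roughly as follows. This is only
an illustration of the heuristic as described here: every name in it
(pixmap_use, pixmap_init, pixmap_note_use and so on) is hypothetical and
not taken from the driver, and it models only the first move onto the
GPU, not any later fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; they are not the driver's real API. */
enum pixmap_use {
    USE_CORE_DRAW,     /* core X drawing request */
    USE_SCANOUT,       /* pixmap is set as the scanout */
    USE_DRI_BUFFER,    /* pixmap is handed to a DRI client */
    USE_RENDER_TARGET  /* pixmap is the destination of a RENDER operation */
};

struct pixmap {
    bool scratch;  /* temporary surface intended for use on the GPU */
    bool on_gpu;   /* false while only the system-memory shadow exists */
};

/* Every pixmap starts in system memory, apart from scratch surfaces. */
static void pixmap_init(struct pixmap *p, bool scratch)
{
    assert(p != NULL);
    p->scratch = scratch;
    p->on_gpu = scratch;
}

/* True for the three uses that may trigger the first migration. */
static bool use_triggers_migration(enum pixmap_use use)
{
    switch (use) {
    case USE_SCANOUT:
    case USE_DRI_BUFFER:
    case USE_RENDER_TARGET:
        return true;
    default:
        return false;
    }
}

/* Record a use; returns whether the pixmap lives on the GPU afterwards. */
static bool pixmap_note_use(struct pixmap *p, enum pixmap_use use)
{
    assert(p != NULL);
    if (!p->on_gpu && use_triggers_migration(use))
        p->on_gpu = true;  /* a real driver would upload the shadow here */
    return p->on_gpu;
}
```

In this sketch a pixmap that only ever sees core drawing never leaves
system memory, which is the point of the heuristic.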

The complicating case is front-buffer rendering. To accommodate an
application using RENDER alongside xterm, without a composite manager
redirecting all the pixmaps to backing surfaces, we have to perform
damage tracking to avoid excess migration of portions of the buffer.
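
As a rough illustration of why damage tracking limits migration, the
sketch below records which area has been written by the GPU and copies
back only the overlap when the CPU needs part of the buffer. It is a
deliberate simplification: the damage is a single bounding box rather
than a proper region, and the names (box, shadow, shadow_gpu_write,
shadow_cpu_needs) are hypothetical, not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open rectangle: [x1, x2) by [y1, y2). */
struct box { int x1, y1, x2, y2; };

static bool box_empty(const struct box *b)
{
    return b->x1 >= b->x2 || b->y1 >= b->y2;
}

static struct box box_intersect(struct box a, struct box b)
{
    struct box r;
    r.x1 = a.x1 > b.x1 ? a.x1 : b.x1;
    r.y1 = a.y1 > b.y1 ? a.y1 : b.y1;
    r.x2 = a.x2 < b.x2 ? a.x2 : b.x2;
    r.y2 = a.y2 < b.y2 ? a.y2 : b.y2;
    return r;  /* may be empty; check with box_empty() */
}

/* Bounding box of both inputs; an empty box contributes nothing. */
static struct box box_union(struct box a, struct box b)
{
    struct box r;
    if (box_empty(&a))
        return b;
    if (box_empty(&b))
        return a;
    r.x1 = a.x1 < b.x1 ? a.x1 : b.x1;
    r.y1 = a.y1 < b.y1 ? a.y1 : b.y1;
    r.x2 = a.x2 > b.x2 ? a.x2 : b.x2;
    r.y2 = a.y2 > b.y2 ? a.y2 : b.y2;
    return r;
}

/* Hypothetical per-pixmap state: where only the GPU copy is current. */
struct shadow {
    struct box gpu_damage;
};

/* The GPU rendered into `area`: grow the tracked damage. */
static void shadow_gpu_write(struct shadow *s, struct box area)
{
    assert(s != NULL);
    s->gpu_damage = box_union(s->gpu_damage, area);
}

/* The CPU is about to touch `area`: return the part to copy back. */
static struct box shadow_cpu_needs(const struct shadow *s, struct box area)
{
    assert(s != NULL);
    return box_intersect(s->gpu_damage, area);
}
```

If the returned box is empty, the CPU operation can proceed on the shadow
copy without any transfer at all; otherwise only that box needs to move.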