SandyBridge's New Acceleration
------------------------------

The guiding principle behind the design is to avoid GPU context switches.
On SandyBridge (and beyond), these are especially pernicious because the
RENDER and BLT engines are now on different rings and require
synchronisation of the various execution units when switching contexts.
They were not cheap on earlier generations, but with the increasing
complexity of the GPU, avoiding such serialisations is important.

Furthermore, we try very hard to avoid migrating between the CPU and GPU.
Every pixmap (apart from temporary "scratch" surfaces which we intend to
use on the GPU) is created in system memory. All operations are then done
upon this shadow copy until we are forced to move it onto the GPU. Such
migration can first be triggered only by: setting the pixmap as the
scanout (we obviously need a GPU buffer here), using the pixmap as a DRI
buffer (the client expects to perform hardware acceleration and we do not
want to disappoint) and lastly using the pixmap as a RENDER target. This
last is chosen because when we know we are going to perform hardware
acceleration and will continue to do so without fallbacks, using the GPU
is much, much faster than the CPU. The heuristic I chose therefore was
that if the application uses RENDER, i.e. cairo, then it will only be
using those paths and not intermixing core drawing operations, and so is
unlikely to trigger a fallback.
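The migration rule above can be sketched roughly as follows. This is only
an illustration of the heuristic, not the driver's code: every name here
(struct pixmap, pixmap_note_use, the USE_* values) is hypothetical, and
scratch surfaces are simply assumed to start out on the GPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not the driver's real API. */
enum pixmap_use {
    USE_CORE_DRAW,      /* core drawing: stays on the CPU shadow copy */
    USE_SCANOUT,        /* scanout needs a GPU buffer */
    USE_DRI_BUFFER,     /* client expects hardware acceleration */
    USE_RENDER_TARGET   /* RENDER destination, e.g. cairo */
};

struct pixmap {
    bool scratch;       /* temporary surface intended for the GPU */
    bool on_gpu;        /* false: only the system-memory shadow exists */
};

/* Assumption: scratch surfaces start on the GPU, all others in system
 * memory. */
static void pixmap_init(struct pixmap *p, bool scratch)
{
    assert(p != NULL);
    p->scratch = scratch;
    p->on_gpu = scratch;
}

/* Record a use of the pixmap. Returns true only when this use triggers
 * the first migration onto the GPU. */
static bool pixmap_note_use(struct pixmap *p, enum pixmap_use use)
{
    assert(p != NULL);
    if (p->on_gpu)
        return false;           /* already migrated, nothing to do */

    switch (use) {
    case USE_SCANOUT:
    case USE_DRI_BUFFER:
    case USE_RENDER_TARGET:
        p->on_gpu = true;       /* one of the three triggers */
        return true;
    default:
        return false;           /* keep working on the shadow copy */
    }
}
```

In this sketch a pixmap that only ever sees core drawing never leaves
system memory, while the first RENDER, DRI or scanout use moves it across
once and later uses find it already on the GPU.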

The complicating case is front-buffer rendering. So in order to accommodate
using RENDER on an application whilst running xterm without a composite
manager redirecting all the pixmaps to backing surfaces, we have to
perform damage tracking to avoid excess migration of portions of the
buffer.
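To illustrate why damage tracking limits migration, here is a minimal
sketch that records which part of a buffer was last written by the GPU,
so that a later CPU access only needs to copy back the overlap. It is a
deliberate simplification: a single bounding box stands in for a full
region, and all names (struct box, struct damage, damage_add_gpu,
damage_readback) are hypothetical rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open rectangle: covers [x1, x2) x [y1, y2). */
struct box { int x1, y1, x2, y2; };

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

static bool box_empty(struct box b)
{
    return b.x1 >= b.x2 || b.y1 >= b.y2;
}

/* Smallest box containing both inputs. */
static struct box box_union(struct box a, struct box b)
{
    struct box r;
    if (box_empty(a)) return b;
    if (box_empty(b)) return a;
    r.x1 = imin(a.x1, b.x1); r.y1 = imin(a.y1, b.y1);
    r.x2 = imax(a.x2, b.x2); r.y2 = imax(a.y2, b.y2);
    return r;
}

/* Overlap of the two inputs; empty if they do not intersect. */
static struct box box_intersect(struct box a, struct box b)
{
    struct box r;
    r.x1 = imax(a.x1, b.x1); r.y1 = imax(a.y1, b.y1);
    r.x2 = imin(a.x2, b.x2); r.y2 = imin(a.y2, b.y2);
    return r;
}

/* Hypothetical per-buffer state: the area whose newest contents live
 * only on the GPU. A bounding box is a simplification of a region. */
struct damage { struct box gpu; };

/* The GPU rendered into `drawn`: grow the GPU-damaged area. */
static void damage_add_gpu(struct damage *d, struct box drawn)
{
    assert(d != NULL);
    d->gpu = box_union(d->gpu, drawn);
}

/* Before the CPU touches `want`, only this part must be copied back. */
static struct box damage_readback(const struct damage *d, struct box want)
{
    assert(d != NULL);
    return box_intersect(d->gpu, want);
}
```

With this bookkeeping, a CPU fallback for one window on the front buffer
copies back only where GPU rendering overlaps the area it touches, and
nothing at all when the two do not overlap.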