SandyBridge's New Acceleration
------------------------------

The guiding principle behind the design is to avoid GPU context switches.
On SandyBridge (and beyond), these are especially pernicious because the
RENDER and BLT engines are now on different rings and require
synchronisation of the various execution units when switching contexts.
They were not cheap on earlier generations, but with the increasing
complexity of the GPU, avoiding such serialisations is important.

Furthermore, we try very hard to avoid migrating between the CPU and GPU.
Every pixmap (apart from temporary "scratch" surfaces which we intend to
use on the GPU) is created in system memory. All operations are then done
upon this shadow copy until we are forced to move it onto the GPU. Such
migration can only be first triggered by: setting the pixmap as the
scanout (we obviously need a GPU buffer here), using the pixmap as a DRI
buffer (the client expects to perform hardware acceleration and we do not
want to disappoint) and lastly using the pixmap as a RENDER target. This
last is chosen because when we know we are going to perform hardware
acceleration and will continue to do so without fallbacks, using the GPU
is much, much faster than the CPU. The heuristic I chose therefore was
that if the application uses RENDER, i.e. cairo, then it will only be
using those paths and not intermixing core drawing operations and so
is unlikely to trigger a fallback.

The complicating case is front-buffer rendering.
So in order to accommodate using RENDER on an application whilst running
xterm without a composite manager redirecting all the pixmaps to backing
surfaces, we have to perform damage tracking to avoid excess migration of
portions of the buffer.
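
The migration rule and the damage tracking described above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the driver's code: every name here (struct pixmap,
pixmap_prepare, damage_add and so on) is hypothetical, damage is reduced
to a single bounding box where a real implementation would more likely
keep a region, only CPU-to-GPU uploads are modelled, and the first
migration is assumed to upload the whole shadow copy.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; none are taken from the driver. */
enum pixmap_use {
    USE_CORE_DRAWING,   /* CPU rendering into the shadow copy */
    USE_SCANOUT,        /* pixmap is set as the scanout */
    USE_DRI_BUFFER,     /* pixmap is used as a DRI buffer */
    USE_RENDER_TARGET   /* pixmap is the target of a RENDER operation */
};

struct box { int x1, y1, x2, y2; };  /* half-open: [x1,x2) x [y1,y2) */

struct pixmap {
    int width, height;
    bool on_gpu;            /* false until the first migration trigger */
    bool cpu_dirty;         /* true while cpu_damage is non-empty */
    struct box cpu_damage;  /* bounds of pixels newer in the shadow copy */
};

static void pixmap_init(struct pixmap *p, int width, int height)
{
    p->width = width;
    p->height = height;
    p->on_gpu = false;
    p->cpu_dirty = false;
    p->cpu_damage = (struct box){ 0, 0, 0, 0 };
}

/* Only these three uses may move a pixmap onto the GPU the first time. */
static bool triggers_migration(enum pixmap_use use)
{
    return use == USE_SCANOUT ||
           use == USE_DRI_BUFFER ||
           use == USE_RENDER_TARGET;
}

/* Grow the CPU damage bounding box so that it also covers b. */
static void damage_add(struct pixmap *p, struct box b)
{
    assert(b.x1 < b.x2 && b.y1 < b.y2);
    if (!p->cpu_dirty) {
        p->cpu_damage = b;
        p->cpu_dirty = true;
        return;
    }
    if (b.x1 < p->cpu_damage.x1) p->cpu_damage.x1 = b.x1;
    if (b.y1 < p->cpu_damage.y1) p->cpu_damage.y1 = b.y1;
    if (b.x2 > p->cpu_damage.x2) p->cpu_damage.x2 = b.x2;
    if (b.y2 > p->cpu_damage.y2) p->cpu_damage.y2 = b.y2;
}

/*
 * Record one use of the pixmap over 'area' and return how many pixels
 * would have to be uploaded to the GPU before that use can proceed.
 */
static long pixmap_prepare(struct pixmap *p, enum pixmap_use use,
                           struct box area)
{
    long upload = 0;

    if (!p->on_gpu) {
        /* The shadow copy is the only copy, so there is nothing to track. */
        if (!triggers_migration(use))
            return 0;
        /* First migration: assume the whole shadow copy is uploaded. */
        p->on_gpu = true;
        p->cpu_dirty = false;
        return (long)p->width * p->height;
    }

    if (use == USE_CORE_DRAWING) {
        /* CPU write to a migrated pixmap: remember which part went stale. */
        damage_add(p, area);
        return 0;
    }

    /* GPU use of a migrated pixmap: upload only the damaged box. */
    if (p->cpu_dirty) {
        upload = (long)(p->cpu_damage.x2 - p->cpu_damage.x1) *
                 (p->cpu_damage.y2 - p->cpu_damage.y1);
        p->cpu_dirty = false;
    }
    return upload;
}
```

In this sketch a pixmap that only ever sees core drawing never leaves
system memory, and once it has migrated, a small CPU write followed by a
RENDER operation costs an upload of the damaged box rather than of the
whole buffer.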