Profiling
=========

OpenSWR contains built-in profiling which can be enabled
at build time to provide insight into performance tuning.

To enable this, uncomment the following line in ``rasterizer/core/knobs.h`` and rebuild: ::

  //#define KNOB_ENABLE_RDTSC

Running an application will result in a ``rdtsc.txt`` file being
created in the current working directory. This file contains profile
information captured between the ``KNOB_BUCKETS_START_FRAME`` and
``KNOB_BUCKETS_END_FRAME`` (see knobs section).

The resulting file will contain sections for each thread with a
hierarchical breakdown of the time spent in the various operations.
For example: ::

  Thread 0 (API)
    %Tot   %Par Cycles      CPE      NumEvent  CPE2  NumEvent2  Bucket
    0.00   0.00 28370       2837     10        0     0          APIClearRenderTarget
    0.00  41.23 11698       1169     10        0     0          |-> APIDrawWakeAllThreads
    0.00  18.34 5202        520      10        0     0          |-> APIGetDrawContext
   98.72  98.72 12413773688 29957    414380    0     0          APIDraw
    0.36   0.36 44689364    107      414380    0     0          |-> APIDrawWakeAllThreads
   96.36  97.62 12117951562 9747     1243140   0     0          |-> APIGetDrawContext
    0.00   0.00 19904       995      20        0     0          APIStoreTiles
    0.00   7.88 1568        78       20        0     0          |-> APIDrawWakeAllThreads
    0.00  25.28 5032        251      20        0     0          |-> APIGetDrawContext
    1.28   1.28 161344902   64       2486370   0     0          APIGetDrawContext
    0.00   0.00 50368       2518     20        0     0          APISync
    0.00   2.70 1360        68       20        0     0          |-> APIDrawWakeAllThreads
    0.00  65.27 32876       1643     20        0     0          |-> APIGetDrawContext


  Thread 1 (WORKER)
    %Tot   %Par Cycles      CPE      NumEvent  CPE2  NumEvent2  Bucket
   83.92  83.92 13198987522 96411    136902    0     0          FEProcessDraw
   24.91  29.69 3918184840  167      23410158  0     0          |-> FEFetchShader
   11.17  13.31 1756972646  75       23410158  0     0          |-> FEVertexShader
    8.89  10.59 1397902996  59       23410161  0     0          |-> FEPAAssemble
   19.06  22.71 2997794710  384      7803387   0     0          |-> FEClipTriangles
   11.67  61.21 1834958176  235      7803387   0     0          |-> FEBinTriangles
    0.00   0.00 0           0        187258    0     0          |-> FECullZeroAreaAndBackface
    0.00   0.00 0           0        60051033  0     0          |-> FECullBetweenCenters
    0.11   0.11 17217556    2869592  6         0     0          FEProcessStoreTiles
   15.97  15.97 2511392576  73665    34092     0     0          WorkerWorkOnFifoBE
   14.04  87.95 2208687340  9187     240408    0     0          |-> WorkerFoundWork
    0.06   0.43 9390536     13263    708       0     0          |-> BELoadTiles
    0.00   0.01 293020      182      1609      0     0          |-> BEClear
   12.63  89.94 1986508990  949      2093014   0     0          |-> BERasterizeTriangle
    2.37  18.75 372374596   177      2093014   0     0          |-> BETriangleSetup
    0.42   3.35 66539016    31       2093014   0     0          |-> BEStepSetup
    0.00   0.00 0           0        21766     0     0          |-> BETrivialReject
    1.05   8.33 165410662   79       2071248   0     0          |-> BERasterizePartial
    6.06  48.02 953847796   1260     756783    0     0          |-> BEPixelBackend
    0.20   3.30 31521202    41       756783    0     0          |-> BESetup
    0.16   2.69 25624304    33       756783    0     0          |-> BEBarycentric
    0.18   2.92 27884986    36       756783    0     0          |-> BEEarlyDepthTest
    0.19   3.20 30564174    41       744058    0     0          |-> BEPixelShader
    0.26   4.30 41058646    55       744058    0     0          |-> BEOutputMerger
    1.27  20.94 199750822   32       6054264   0     0          |-> BEEndTile
    0.33   2.34 51758160    23687    2185      0     0          |-> BEStoreTiles
    0.20  60.22 31169500    28807    1082      0     0          |-> B8G8R8A8_UNORM
    0.00   0.00 302752      302752   1         0     0          WorkerWaitForThreadEvent
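
For larger captures it can help to post-process ``rdtsc.txt`` with a small script. The following is a minimal sketch, not part of OpenSWR: the names ``Bucket`` and ``parse_rdtsc`` are hypothetical, and the parser assumes the whitespace-separated column layout shown in the example output. It also assumes that ``CPE`` means cycles per event, which is consistent with the sample figures but is not stated by the profiler itself.

```python
from typing import List, NamedTuple


class Bucket(NamedTuple):
    """One row of rdtsc.txt (hypothetical helper type, not an OpenSWR API)."""
    thread: str        # e.g. "0 (API)"
    pct_total: float   # %Tot column
    pct_parent: float  # %Par column
    cycles: int
    cpe: int
    num_events: int
    name: str
    is_child: bool     # True when the bucket name is prefixed by "|->"


def parse_rdtsc(text: str) -> List[Bucket]:
    """Parse profile rows, assuming the column layout of the sample output."""
    buckets = []
    thread = ""
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Thread":
            # Section header such as "Thread 0 (API)".
            thread = " ".join(fields[1:])
            continue
        if fields[0] == "%Tot" or len(fields) < 8:
            # Column header or a line that is not a data row.
            continue
        try:
            pct_total, pct_parent = float(fields[0]), float(fields[1])
            cycles, cpe, num_events = (int(f) for f in fields[2:5])
        except ValueError:
            continue  # Skip anything that does not look like a data row.
        buckets.append(Bucket(thread, pct_total, pct_parent, cycles, cpe,
                              num_events, fields[-1], "|->" in fields[7:]))
    return buckets


# A few rows copied from the example output.
SAMPLE = """\
Thread 0 (API)
  %Tot   %Par Cycles      CPE      NumEvent  CPE2  NumEvent2  Bucket
  0.00   0.00 28370       2837     10        0     0          APIClearRenderTarget
  0.00  41.23 11698       1169     10        0     0          |-> APIDrawWakeAllThreads
 98.72  98.72 12413773688 29957    414380    0     0          APIDraw
"""

buckets = parse_rdtsc(SAMPLE)
# Most expensive top-level bucket by raw cycle count.
top = max((b for b in buckets if not b.is_child), key=lambda b: b.cycles)
print(top.thread, top.name, top.cycles // top.num_events)
```

Sorting top-level buckets by ``cycles`` (or by ``pct_total``) is a quick way to find where a thread spends most of its time before drilling into the child rows.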