Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring. Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources. Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

It also provides various domain-specific producers.

Mesa's Perfetto support adds further producers, which make it possible to
visualize GPU performance (frequency, utilization, performance counters, etc.)
on the same timeline, to better understand and tune/debug system-level
performance:

- pps-producer: A system-wide daemon that can collect global performance
  counters.
- mesa: Per-process producer within Mesa to capture render-stage traces
  on the GPU timeline, track events, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     -
   * - Intel
     - ``gpu.counters.i915``
     -
   * - Panfrost
     - ``gpu.counters.panfrost``
     -

Run
---

To capture a trace with Perfetto, take the following steps:

1. Build Perfetto from the sources available at ``subprojects/perfetto``,
   following
   `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

2. Create a `trace config <https://perfetto.dev/#/trace-config.md>`__, which is
   a text file in protobuf text format with extension ``.cfg``, or use one of
   the config files under the ``src/tool/pps/cfg`` directory. More examples of
   config files can be found in ``subprojects/perfetto/test/configs``.

3. Change directory to ``subprojects/perfetto`` and run a
   `convenience script <https://perfetto.dev/#/running.md>`__ to start the
   tracing service:

   .. code-block:: console

      cd subprojects/perfetto
      CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n

4. Start any other producers you need, e.g. ``pps-producer``.

5. Start ``perfetto`` in the tmux session created in step 3.

6. Once tracing has finished, you can detach from tmux with :kbd:`Ctrl+b`,
   :kbd:`d`, and the convenience script should automatically copy the trace
   files into ``$HOME/Downloads``.

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``$HOME/Downloads/trace.protobuf`` by clicking on **Open trace file**.

8. Alternatively, you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which, despite the name, can be used to view non-Android traces).

Driver Specifics
~~~~~~~~~~~~~~~~

Below are driver-specific instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so you can simply run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/gpu-metrics-reference.html>`__
performance counters, so you can simply run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Another option, which enables access to system-wide data without root
permissions, is to run the following:

.. code-block:: console

   sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the ``CAP_PERFMON`` capability to the binary should
work too.
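
On kernels that provide ``CAP_PERFMON`` (Linux 5.8 and later), the capability
can be attached to the binary with ``setcap`` from libcap. The following is a
minimal sketch rather than a tested recipe: ``grant_perfmon`` is a hypothetical
helper name, and the binary path in the usage comment is an assumption based on
the build directory used in the commands above.

.. code-block:: sh

   # Hypothetical helper: grant CAP_PERFMON to a binary and print the result.
   # Assumes setcap/getcap from libcap are installed and know CAP_PERFMON.
   grant_perfmon() {
       sudo setcap cap_perfmon+ep "$1" && getcap "$1"
   }

   # Usage (path is an assumption, adjust to your build directory):
   #   grant_perfmon ./build/src/tool/pps/pps-producer

File capabilities are stored on the file itself, so they typically need to be
granted again after each rebuild of the producer.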

Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel versions `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two steps:

1. Enable Panfrost unstable ioctls via a kernel module parameter:

   .. code-block:: console

      modprobe panfrost unstable_ioctls=1

   Alternatively, you could add ``panfrost.unstable_ioctls=1`` to your kernel
   command line, or run
   ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls``.

2. Run the producer:

   .. code-block:: console

      ./build/pps-producer

Troubleshooting
---------------

Tmux
~~~~

If the convenience script ``tools/tmux`` keeps copying artifacts to your
``SSH_TARGET`` without starting the tmux session, make sure you have ``tmux``
installed on your system.

.. code-block:: console

   apt install tmux

Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, data may have been lost
because the trace buffer filled up and wrapped.

To prevent this loss of data, you can tweak the trace config file in one of
the following ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

     buffers {
         size_kb: 2048,
         fill_policy: RING_BUFFER,
     }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

     write_into_file: true
     file_write_period_ms: 250

- Discard new traces when the buffer fills:

  .. code-block:: javascript

     buffers {
         size_kb: 2048,
         fill_policy: DISCARD,
     }
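
A larger ring buffer and periodic flushing can be combined in one trace
config, as in the sketch below. The data-source name is one of those listed
for the PPS counters. The buffer size, ``duration_ms`` and
``counter_period_ns`` values are illustrative assumptions, not recommended
settings.

.. code-block:: text

   # Larger ring buffer to reduce the chance of wrapping.
   buffers {
     size_kb: 16384
     fill_policy: RING_BUFFER
   }

   # GPU counters from pps-producer (Freedreno/Turnip data-source).
   data_sources {
     config {
       name: "gpu.counters.msm"
       gpu_counter_config {
         counter_period_ns: 100000
       }
     }
   }

   # Stop after 10 seconds and flush to the output file every 250 ms.
   duration_ms: 10000
   write_into_file: true
   file_write_period_ms: 250

For another driver, replace the data-source name with the matching entry,
e.g. ``gpu.counters.i915`` or ``gpu.counters.panfrost``.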