Continuous Integration
======================

GitLab CI
---------

GitLab provides a convenient framework for running commands in response to Git pushes.
We use it to test merge requests (MRs) before merging them (pre-merge testing),
as well as for post-merge testing of everything that hits ``main``
(this is necessary because we still allow commits to be pushed outside of MRs,
and even then the MR CI runs in the forked repository, which might have been
modified and thus is unreliable).

The CI runs a number of tests, from trivial build-testing to complex GPU rendering:

- Build testing for a number of build systems, configurations and platforms
- Sanity checks (``meson test``)
- Some drivers (softpipe, llvmpipe, freedreno and panfrost) are also tested
  using `VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__
- Replay of application traces

A typical run takes between 20 and 30 minutes, although it can take much longer
if the GitLab runners are overwhelmed, which happens sometimes. When it does,
not much can be done besides waiting it out or cancelling the run.

Due to limited resources, we currently do not run the CI automatically
on every push; instead, we only run it automatically once the MR has
been assigned to ``Marge``, our merge bot.

If you're interested in the details, the main configuration file is ``.gitlab-ci.yml``,
and it references a number of other files in ``.gitlab-ci/``.

If the GitLab CI doesn't seem to be running on your fork (or MRs, as they run
in the context of your fork), you should check the "Settings" of your fork.
Under "CI / CD" → "General pipelines", make sure "Custom CI config path" is
empty (or set to the default ``.gitlab-ci.yml``), and that the
"Public pipelines" box is checked.

If you're having issues with the GitLab CI, your best bet is to ask
about it on ``#freedesktop`` on OFTC and tag `Daniel Stone
<https://gitlab.freedesktop.org/daniels>`__ (``daniels`` on IRC) or
`Eric Anholt <https://gitlab.freedesktop.org/anholt>`__ (``anholt`` on
IRC).

The three GitLab CI systems currently integrated are:


.. toctree::
   :maxdepth: 1

   bare-metal
   LAVA
   docker

Application traces replay
-------------------------

The CI replays application traces with various drivers in two different jobs. The first
job replays traces listed in ``src/<driver>/ci/traces-<driver>.yml`` files, and if any
of those traces fail, the pipeline fails as well. The second job replays traces listed in
``src/<driver>/ci/restricted-traces-<driver>.yml`` and it is allowed to fail. This second
job is only created when the pipeline is triggered by ``marge-bot`` or any other user that
has been granted access to these traces.

A traces YAML file also includes a ``download-url`` pointing to the MinIO
instance from which the traces are downloaded. While the first job should always work with
publicly accessible traces, the second job could point to a URL with restricted access.

Restricted traces are those that have been made available to Mesa developers without a
license to redistribute at will, and thus should not be exposed to the public. Failing to
access that URL does not prevent the pipeline from passing, so forks made by
contributors without permission to download non-redistributable traces can be merged
without friction.

As an aside, only maintainers of such non-redistributable traces are responsible for
ensuring that replays are successful, since other contributors would not be able to
download and test them by themselves.

Mesa contributors who believe they could have permission to access such
non-redistributable traces can request permission from Daniel Stone <daniels@collabora.com>.

gitlab.freedesktop.org accounts that are to be granted access to these traces will be
added to the OPA policy for the MinIO repository as per
https://gitlab.freedesktop.org/freedesktop/helm-gitlab-config/-/commit/a3cd632743019f68ac8a829267deb262d9670958 .

So that the jobs are created in personal repositories, the name of the user's account needs
to be added to the rules attribute of the GitLab CI job that accesses the restricted
accounts.

Intel CI
--------

The Intel CI is not yet integrated into the GitLab CI.
For now, special access must be manually given (file an issue in
`the Intel CI configuration repo <https://gitlab.freedesktop.org/Mesa_CI/mesa_jenkins>`__
if you think you or Mesa would benefit from you having access to the Intel CI).
Results can be seen on `mesa-ci.01.org <https://mesa-ci.01.org>`__
if you are *not* an Intel employee; if you are, you
can access a better interface on
`mesa-ci-results.jf.intel.com <http://mesa-ci-results.jf.intel.com>`__.

The Intel CI runs a much larger array of tests, on a number of generations
of Intel hardware and on multiple platforms (X11, Wayland, DRM & Android),
with the purpose of detecting regressions.
Tests include
`Crucible <https://gitlab.freedesktop.org/mesa/crucible>`__,
`VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__,
`dEQP <https://android.googlesource.com/platform/external/deqp>`__,
`Piglit <https://gitlab.freedesktop.org/mesa/piglit>`__,
`Skia <https://skia.googlesource.com/skia>`__,
`VkRunner <https://github.com/Igalia/vkrunner>`__,
`WebGL <https://github.com/KhronosGroup/WebGL>`__,
and a few other tools.
A typical run takes between 30 minutes and an hour.

If you're having issues with the Intel CI, your best bet is to ask about
it on ``#dri-devel`` on OFTC and tag `Nico Cortes
<https://gitlab.freedesktop.org/ngcortes>`__ (``ngcortes`` on IRC).

.. _CI-farm-expectations:

CI farm expectations
--------------------

To make sure that testing of one vendor's drivers doesn't block
unrelated work by other vendors, we require that a given driver's test
farm produces a spurious failure no more than once a week.  If every
driver had CI and failed once a week, we would be seeing someone's
code getting blocked on a spurious failure daily, which is an
unacceptable cost to the project.

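The arithmetic behind this limit can be sketched as follows.  The farm
count of 7 is purely an illustrative assumption (it is not a statement
about how many farms Mesa has), the function name is hypothetical, and
farms are assumed to fail independently at the same weekly rate.

.. code-block:: python

    def spurious_failures_per_day(num_farms, failures_per_farm_per_week=1.0):
        """Expected spurious failures per day, summed over all farms."""
        return num_farms * failures_per_farm_per_week / 7.0

    # Hypothetical: 7 farms, each at the allowed limit of one failure a week.
    rate = spurious_failures_per_day(7)
    print(f"{rate:.1f} spurious failure(s) per day")

Under these assumptions the project as a whole would already see about
one blocked merge per day, which is why the per-farm limit is strict.
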
Additionally, the test farm needs to be able to provide a short enough
turnaround time that we can get our MRs through marge-bot without the
pipeline backing up.  As a result, we require that the test farm be
able to handle a whole pipeline's worth of jobs in less than 15 minutes
(for comparison, the build stage is about 10 minutes).

If a test farm is short of the HW to provide these guarantees, consider dropping
tests to reduce runtime.  dEQP job logs print the slowest tests at the end of
the run, and piglit logs the runtime of tests in the ``results.json.bz2`` in the
artifacts.  Or, you can add the following to your job:

.. code-block:: yaml

    variables:
      DEQP_FRACTION: 10

to run only a fraction (in this case, 1/10th) of the dEQP test list.

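As a rough illustration of what running 1/10th of a test list means,
the sketch below keeps every Nth entry of a list.  This is an
assumption about the selection strategy made for the sake of the
example; how the CI scripts actually pick the tests is not specified
here, and ``take_fraction`` and the test names are hypothetical.

.. code-block:: python

    def take_fraction(tests, fraction, start=1):
        """Keep every `fraction`-th test, beginning with the `start`-th (1-based)."""
        if fraction < 1 or not 1 <= start <= fraction:
            raise ValueError("need fraction >= 1 and 1 <= start <= fraction")
        return tests[start - 1::fraction]

    # Hypothetical test names, for illustration only.
    all_tests = [f"dEQP-GLES2.example.test_{i}" for i in range(100)]
    subset = take_fraction(all_tests, 10)
    print(len(subset), "of", len(all_tests), "tests selected")

Sampling evenly across the list, rather than taking the first tenth,
keeps some coverage of every test group while cutting the runtime.
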
If a HW CI farm goes offline (network dies and all CI pipelines end up
stalled) or its runners are consistently spuriously failing (disk
full?), and the maintainer is not immediately available to fix the
issue, please push through an MR disabling that farm's jobs by adding
``.`` to the front of the job names until the maintainer can bring
things back up.  If this happens, the farm maintainer should provide a
report to mesa-dev@lists.freedesktop.org after the fact explaining
what happened and what the mitigation plan is for that failure next
time.

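Renaming the jobs works because GitLab treats jobs whose names start
with ``.`` as hidden and does not run them.  A minimal sketch of the
rename is shown below; it works on plain text rather than parsing
YAML, and the ``examplefarm-`` prefix, the job names and the
``disable_jobs`` helper are all hypothetical.

.. code-block:: python

    import re

    def disable_jobs(ci_yaml_text, job_prefix):
        """Prefix '.' to every top-level key that starts with `job_prefix`.

        Only unindented ``name:`` lines are touched, so nested keys and
        already-hidden jobs are left alone.
        """
        pattern = re.compile(
            r"^(" + re.escape(job_prefix) + r"[^\s:]*:)", re.MULTILINE
        )
        return pattern.sub(r".\1", ci_yaml_text)

    example = (
        "examplefarm-gles2:\n"
        "  script: ./run-tests.sh\n"
        "\n"
        "build-debian:\n"
        "  script: ./build.sh\n"
    )
    print(disable_jobs(example, "examplefarm-"))

Any job that refers to a renamed job by name would need the same
update, so review the result by hand before pushing the MR.
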
Personal runners
----------------

Mesa's CI is currently run primarily on packet.net's m1xlarge nodes
(2.2 GHz Sandy Bridge), with each job getting 8 cores allocated.  You
can speed up your personal CI builds (and marge-bot merges) by using a
faster personal machine as a runner.  You can find the gitlab-runner
package in Debian, or use GitLab's own builds.

To do so, follow `GitLab's instructions
<https://docs.gitlab.com/ce/ci/runners/#create-a-specific-runner>`__ to
register your personal GitLab runner in your Mesa fork.  Then, tell
Mesa how many jobs it should serve (``concurrent=``) and how many
cores those jobs should use (``FDO_CI_CONCURRENT=``) by editing these
lines in ``/etc/gitlab-runner/config.toml``, for example::

  concurrent = 2

  [[runners]]
    environment = ["FDO_CI_CONCURRENT=16"]


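
One possible way to choose the two values is to split the machine's
cores evenly between the concurrent jobs.  That policy is an
assumption made for this sketch, not a project recommendation, and
``runner_settings`` is a hypothetical helper.

.. code-block:: python

    import os

    def runner_settings(total_cores, concurrent_jobs):
        """Split `total_cores` evenly across `concurrent_jobs` jobs."""
        if concurrent_jobs < 1 or total_cores < concurrent_jobs:
            raise ValueError("need at least one core per concurrent job")
        return {
            "concurrent": concurrent_jobs,
            "FDO_CI_CONCURRENT": total_cores // concurrent_jobs,
        }

    # os.cpu_count() may return None, so fall back to a single core.
    print(runner_settings(os.cpu_count() or 1, 1))
    # A hypothetical 32-core machine serving 2 jobs at once.
    print(runner_settings(32, 2))

For the hypothetical 32-core machine this yields the same numbers as
the ``config.toml`` example above.
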
Docker caching
--------------

The CI system uses Docker images extensively to cache
infrequently-updated build content like the CTS.  The `freedesktop.org
CI templates
<https://gitlab.freedesktop.org/freedesktop/ci-templates/>`_ help us
manage the building of the images to reduce how frequently rebuilds
happen, and trim down the images (stripping out manpages, cleaning the
apt cache, and other such common pitfalls of building Docker images).

When running a container job, the templates will look for an existing
build of that image in the container registry under
``MESA_IMAGE_TAG``.  If it's found it will be reused, and if
not, the associated ``.gitlab-ci/containers/<jobname>.sh`` will be run
to build it.  So, when developing any change to container build
scripts, you need to update the associated ``MESA_IMAGE_TAG`` to
a new unique string.  We recommend using the current date plus some
string related to your branch (so that if you rebase on someone else's
container update from the same day, you will get a Git conflict
instead of silently reusing their container).

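A tag of that shape could be generated with something like the sketch
below.  The exact format (ISO date, a hyphen, then a sanitized branch
name) is an assumption for illustration, since the only recommendation
is a date plus a branch-related string; ``make_image_tag`` and the
branch name are hypothetical.

.. code-block:: python

    import datetime
    import re

    def make_image_tag(branch, today=None):
        """Build a tag such as '2020-09-11-my-branch' from date and branch."""
        today = today or datetime.date.today()
        # Keep only letters, digits, '_', '.' and '-', which Docker tags allow.
        suffix = re.sub(r"[^A-Za-z0-9_.-]+", "-", branch).strip("-")
        return f"{today.isoformat()}-{suffix}"

    print(make_image_tag("wip/android-build", datetime.date(2020, 9, 11)))

Because the branch name is part of the tag, two people updating the
same container on the same day end up with different strings, and Git
reports the conflict on rebase.
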
When developing a given change to your Docker image, you would have to
bump the tag on each ``git commit --amend`` to your development
branch, which can get tedious.  Instead, you can navigate to the
`container registry
<https://gitlab.freedesktop.org/mesa/mesa/container_registry>`_ for
your repository and delete the tag to force a rebuild.  When your code
is eventually merged to main, a full image rebuild will occur again
(forks inherit images from the main repo, but MRs don't propagate
images from the fork into the main repo's registry).

Building locally using CI docker images
---------------------------------------

It can be frustrating to debug build failures on an environment you
don't personally have.  If you're experiencing this with the CI
builds, you can use Docker to use their build environment locally.  Go
to your job log, and at the top you'll see a line like::

    Pulling docker image registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11

We'll use a volume mount to make our current Mesa tree be what the
Docker container uses, so they'll share everything (their build will
go in ``_build``, according to ``meson-build.sh``).  We're going to be
using the image non-interactively so we use ``run --rm $IMAGE
command`` instead of ``run -it $IMAGE bash`` (which you may also find
useful for debugging).  Extract your build setup variables from
``.gitlab-ci.yml`` and run the CI meson build script:

.. code-block:: console

    IMAGE=registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11
    sudo docker pull $IMAGE
    sudo docker run --rm -v `pwd`:/mesa -w /mesa $IMAGE env PKG_CONFIG_PATH=/usr/local/lib/aarch64-linux-android/pkgconfig/:/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android/pkgconfig/ GALLIUM_DRIVERS=freedreno UNWIND=disabled EXTRA_OPTION="-D android-stub=true -D llvm=disabled" DRI_LOADERS="-D glx=disabled -D gbm=disabled -D egl=enabled -D platforms=android" CROSS=aarch64-linux-android ./.gitlab-ci/meson-build.sh

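
Since the one-line command above gets unwieldy, the same invocation
can be assembled from a dictionary of variables.  This is only a
sketch: ``docker_build_command`` is a hypothetical helper, the tree
path ``/home/user/mesa`` is a placeholder, only a subset of the
variables is shown, and ``sudo`` is left out.

.. code-block:: python

    import os
    import shlex

    def docker_build_command(image, variables,
                             script="./.gitlab-ci/meson-build.sh", tree=None):
        """Return the argv that runs `script` in `image` with `variables` set."""
        tree = tree or os.getcwd()
        env_args = [f"{name}={value}" for name, value in variables.items()]
        return ["docker", "run", "--rm", "-v", f"{tree}:/mesa", "-w", "/mesa",
                image, "env", *env_args, script]

    variables = {
        "GALLIUM_DRIVERS": "freedreno",
        "UNWIND": "disabled",
        "EXTRA_OPTION": "-D android-stub=true -D llvm=disabled",
        "CROSS": "aarch64-linux-android",
    }
    cmd = docker_build_command(
        "registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11",
        variables,
        tree="/home/user/mesa",  # placeholder path to the Mesa checkout
    )
    # Print a shell-quoted version for inspection before running anything.
    print(shlex.join(cmd))

The resulting list can be passed to ``subprocess.run(cmd, check=True)``
once the printed command looks right.
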
All you have left over from the build is its output, and a ``_build``
directory.  You can hack on Mesa and iterate testing the build with:

.. code-block:: console

    sudo docker run --rm -v `pwd`:/mesa $IMAGE ninja -C /mesa/_build