<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  <title>llvmpipe</title>
  <link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>

<div class="header">
  <h1>The Mesa 3D Graphics Library</h1>
</div>

<iframe src="contents.html"></iframe>
<div class="content">

<h1>Introduction</h1>

<p>
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR which is translated to x86, x86-64, or ppc64le machine
code.
Also, the driver is multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
It's the fastest software rasterizer for Mesa.
</p>


<h1>Requirements</h1>

<ul>
<li>
   <p>
   For x86 or amd64 processors, 64-bit mode is recommended.
   Support for SSE2 is strongly encouraged.  Support for SSE3 and SSE4.1 will
   yield the most efficient code.  The fewer features the CPU has the more
   likely it is that you will run into underperforming, buggy, or incomplete code.
   </p>
   <p>
   For ppc64le processors, use of the Altivec feature (the Vector
   Facility) is recommended if supported; use of the VSX feature (the
   Vector-Scalar Facility) is recommended if supported AND Mesa is
   built with LLVM version 4.0 or later.
   </p>
   <p>
   Check /proc/cpuinfo to see which features your CPU supports.
   </p>
</li>
<li>
   <p>Unless otherwise stated, LLVM version 3.4 is recommended; 3.3 or later is required.</p>
   <p>
   On Linux, on a recent Debian-based distribution, run:
   </p>
<pre>
  aptitude install llvm-dev
</pre>
   <p>
   If you want development snapshot builds of LLVM for Debian and derived
   distributions like Ubuntu, you can use the APT repository at <a
   href="https://apt.llvm.org/" title="Debian Development packages for LLVM"
   >apt.llvm.org</a>, which is maintained by Debian's LLVM maintainer.
   </p>
   <p>
   On an RPM-based distribution, run:
   </p>
<pre>
  yum install llvm-devel
</pre>

   <p>
   For Windows you will need to build LLVM from source with MSVC or MinGW
   (either natively or through cross compilers) and CMake, and set the LLVM
   environment variable to the directory you installed it to.

   LLVM will be statically linked, so when building on MSVC it needs to be
   built with the same CRT as Mesa, and you'll need to pass
   <code>-DLLVM_USE_CRT_xxx=yyy</code> as described below.
   </p>

   <table border="1">
     <tr>
       <th rowspan="2">LLVM build-type</th>
       <th colspan="2" align="center">Mesa build-type</th>
     </tr>
     <tr>
       <th>debug,checked</th>
       <th>release,profile</th>
     </tr>
     <tr>
       <th>Debug</th>
       <td><code>-DLLVM_USE_CRT_DEBUG=MTd</code></td>
       <td><code>-DLLVM_USE_CRT_DEBUG=MT</code></td>
     </tr>
     <tr>
       <th>Release</th>
       <td><code>-DLLVM_USE_CRT_RELEASE=MTd</code></td>
       <td><code>-DLLVM_USE_CRT_RELEASE=MT</code></td>
     </tr>
   </table>

   <p>
   You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86
   to cmake.
   </p>
</li>

<li>
   <p>scons (optional)</p>
</li>
</ul>


<h1>Building</h1>

To build everything on Linux, invoke scons as:

<pre>
  scons build=debug libgl-xlib
</pre>

Alternatively, you can build it with meson:
<pre>
  mkdir build
  cd build
  meson -D glx=gallium-xlib -D gallium-drivers=swrast
  ninja
</pre>

but the rest of these instructions assume that scons is used.
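<p>
Before starting either build on Linux, it can help to confirm that the
requirements listed earlier are met. The following is a minimal sketch and
not part of Mesa. It assumes a POSIX shell, that <code>llvm-config</code>
(normally shipped with the LLVM development packages) is on the PATH, and
that <code>sort -V</code> (version sort, a GNU coreutils extension) is
available. The <code>version_ge</code> helper is a hypothetical name
introduced here for illustration.
</p>
<pre>
#!/bin/sh
# version_ge A B: succeeds when version A is at least version B.
# Hypothetical helper; relies on "sort -V", a GNU coreutils extension.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# llvmpipe requires LLVM 3.3 or later.
if command -v llvm-config >/dev/null; then
    llvm_version=$(llvm-config --version)
    if version_ge "$llvm_version" 3.3; then
        echo "LLVM $llvm_version: ok"
    else
        echo "LLVM $llvm_version: too old, 3.3 or later is required"
    fi
else
    echo "llvm-config not found; install the LLVM development package"
fi

# On x86 Linux, list which of the recommended SIMD features the CPU reports.
# /proc/cpuinfo lists SSE3 under the name "pni".
if [ -r /proc/cpuinfo ]; then
    grep -m 1 '^flags' /proc/cpuinfo | tr ' ' '\n' \
        | grep -E -x 'sse2|pni|sse4_1' \
        || echo "no sse2/pni/sse4_1 flags reported"
fi
</pre>
<p>
Run the script with <code>sh</code> before invoking scons or meson. It only
prints what it finds and does not change the build configuration.
</p>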

For Windows the procedure is similar, except for the target:

<pre>
  scons platform=windows build=debug libgl-gdi
</pre>


<h1>Using</h1>

<h2>Linux</h2>

<p>On Linux, building will create a drop-in alternative for libGL.so in</p>

<pre>
  build/foo/gallium/targets/libgl-xlib/libGL.so
</pre>
or
<pre>
  lib/gallium/libGL.so
</pre>

<p>To use it, set the LD_LIBRARY_PATH environment variable accordingly.</p>

<p>For performance evaluation pass build=release to scons, and use the corresponding
lib directory without the "-debug" suffix.</p>


<h2>Windows</h2>

<p>
On Windows, building will create
<code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>
which is a drop-in alternative for the system's <code>opengl32.dll</code>.  To use
it, put it in the same directory as your application.  It can also be used by
replacing the native ICD driver, but it's quite an advanced usage, so if you
need to ask, don't even try it.
</p>

<p>
There is however an easy way to replace the OpenGL software renderer that comes
with Microsoft Windows 7 (or later) with llvmpipe (that is, on systems without
any OpenGL drivers):
</p>

<ul>
  <li><p>copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>
  <li><p>load these registry settings:</p>
  <pre>REGEDIT4

; https://technet.microsoft.com/en-us/library/cc749368.aspx
; https://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
"DLL"="mesadrv.dll"
"DriverVersion"=dword:00000001
"Flags"=dword:00000001
"Version"=dword:00000002
</pre>
  </li>
  <li>Ditto for 64-bit drivers if you need them.</li>
</ul>


<h1>Profiling</h1>

<p>
To profile llvmpipe you should build as
</p>
<pre>
  scons build=profile &lt;same-as-before&gt;
</pre>

<p>
This will ensure that frame pointers are used both in C and JIT functions, and
that no tail call optimizations are done by gcc.
</p>

<h2>Linux perf integration</h2>

<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="https://perf.wiki.kernel.org/">Linux perf</a>:
</p>

<pre>
  perf record -g /my/application
  perf report
</pre>

<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
a symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit.py script to produce disassembly of
the generated code annotated with the samples.
</p>

<p>You can obtain a call graph via
<a href="https://github.com/jrfonseca/gprof2dot#linux-perf">Gprof2Dot</a>.</p>


<h1>Unit testing</h1>

<p>
Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:
</p>

<ul>
<li> lp_test_blend: blending
<li> lp_test_conv: SIMD vector conversion
<li> lp_test_format: pixel unpacking/packing
</ul>

<p>
Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
</pre>


<h1>Development Notes</h1>

<ul>
<li>
  When looking at this code for the first time, start in lp_state_fs.c, and
  then skim through the lp_bld_* functions called there, and the comments
  at the top of the lp_bld_*.c functions.
</li>
<li>
  The driver-independent parts of the LLVM / Gallium code are found in
  src/gallium/auxiliary/gallivm/.  The filenames and function prefixes
  need to be renamed from "lp_bld_" to something else though.
</li>
<li>
  We use LLVM-C bindings for now. They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation. See
  <a href="https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.
</li>
</ul>

<h1 id="recommended_reading">Recommended Reading</h1>

<ul>
  <li>
    <p>Rasterization</p>
    <ul>
      <li><a href="https://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
      <li><a href="https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
    </ul>
  </li>
  <li>
    <p>Texture sampling</p>
    <ul>
      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
      <li><a href="https://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>
    </ul>
  </li>
  <li>
    <p>SIMD</p>
    <ul>
      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>
    </ul>
  </li>
  <li>
    <p>Optimization</p>
    <ul>
      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
      <li><a href="https://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
    </ul>
  </li>
  <li>
    <p>LLVM</p>
    <ul>
      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
      <li><a href="https://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
    </ul>
  </li>
  <li>
    <p>General</p>
    <ul>
      <li><a href="https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
      <li><a href="https://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
    </ul>
  </li>
</ul>

</div>
</body>
</html>