<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  <title>llvmpipe</title>
  <link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>

<div class="header">
  <h1>The Mesa 3D Graphics Library</h1>
</div>

<iframe src="contents.html"></iframe>
<div class="content">

<h1>Introduction</h1>

<p>
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR which is translated to x86 or x86-64 machine
code.
Also, the driver is multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
It's the fastest software rasterizer for Mesa.
</p>


<h1>Requirements</h1>

<ul>
<li>
  <p>An x86 or amd64 processor; 64-bit mode recommended.</p>
  <p>
  Support for SSE2 is strongly encouraged.  Support for SSSE3 and SSE4.1 will
  yield the most efficient code.  The fewer features the CPU has, the more
  likely it is that you will run into underperforming, buggy, or incomplete code.
  </p>
  <p>
  See /proc/cpuinfo to find out what your CPU supports.
  </p>
</li>
<li>
  <p>LLVM: version 3.4 recommended; 3.1 or later required.</p>
  <p>
  For Linux, on a recent Debian-based distribution do:
  </p>
<pre>
  aptitude install llvm-dev
</pre>
  <p>
  For an RPM-based distribution do:
  </p>
<pre>
  yum install llvm-devel
</pre>

  <p>
  For Windows you will need to build LLVM from source with MSVC or MINGW
  (either natively or through cross compilers) and CMake, and set the LLVM
  environment variable to the directory you installed it to.

  LLVM will be statically linked, so when building on MSVC it needs to be
  built with the same CRT as Mesa, and you'll need to pass
  -DLLVM_USE_CRT_RELEASE=MTd for debug and checked builds,
  -DLLVM_USE_CRT_RELEASE=MT for profile and release builds.

  You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86
  to cmake.
  </p>
</li>

<li>
  <p>scons (optional)</p>
</li>
</ul>


<h1>Building</h1>

To build everything on Linux invoke scons as:

<pre>
  scons build=debug libgl-xlib
</pre>

Alternatively, you can build it with GNU make, if you prefer, by invoking it as

<pre>
  make linux-llvm
</pre>

but the rest of these instructions assume that scons is used.
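
<p>
Before invoking either build command on Linux, it can help to confirm the
requirements listed earlier.  The following is a minimal sketch, not part of
Mesa: the function names <code>has_cpu_flag</code>, <code>version_ge</code> and
<code>report_requirements</code> are hypothetical, and it assumes a Linux host
where <code>grep -w</code> and <code>sort -V</code> are available.  It looks
for the SSE2, SSSE3 and SSE4.1 flags in /proc/cpuinfo and compares the output
of <code>llvm-config --version</code> against the 3.1 minimum.
</p>

<pre>
```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of Mesa or LLVM.

# has_cpu_flag FLAG [FILE]: succeed if FLAG appears as a whole word in FILE.
# FILE defaults to /proc/cpuinfo, which exists only on Linux.
has_cpu_flag() {
    grep -qw -- "$1" "${2:-/proc/cpuinfo}"
}

# version_ge HAVE WANT: succeed if version HAVE is at least WANT.
# Assumes "sort -V" (version sort) is supported by the host's sort.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# report_requirements [FILE]: print one line per SIMD flag, then one line
# saying whether an installed LLVM meets the 3.1 minimum.
report_requirements() {
    for flag in sse2 ssse3 sse4_1; do
        if has_cpu_flag "$flag" "$1"; then
            echo "$flag: yes"
        else
            echo "$flag: no"
        fi
    done
    if llvm_config=$(command -v llvm-config); then
        llvm_version=$("$llvm_config" --version)
        if version_ge "$llvm_version" 3.1; then
            echo "LLVM $llvm_version: ok"
        else
            echo "LLVM $llvm_version: older than 3.1"
        fi
    else
        echo "llvm-config: not found"
    fi
}
```
</pre>

<p>
After sourcing these definitions, running <code>report_requirements</code>
with no argument reads /proc/cpuinfo.  Passing a file name instead lets you
check a saved copy of another machine's cpuinfo.
</p>
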

For Windows the procedure is similar, except for the target:

<pre>
  scons platform=windows build=debug libgl-gdi
</pre>


<h1>Using</h1>

<h2>Linux</h2>

<p>On Linux, building will create a drop-in alternative for libGL.so in</p>

<pre>
  build/foo/gallium/targets/libgl-xlib/libGL.so
</pre>
or
<pre>
  lib/gallium/libGL.so
</pre>

<p>To use it set the LD_LIBRARY_PATH environment variable accordingly.</p>

<p>For performance evaluation pass build=release to scons, and use the corresponding
lib directory without the "-debug" suffix.</p>


<h2>Windows</h2>

<p>
On Windows, building will create
<code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>
which is a drop-in alternative for the system's <code>opengl32.dll</code>.  To use
it put it in the same directory as your application.  It can also be used by
replacing the native ICD driver, but it's quite an advanced usage, so if you
need to ask, don't even try it.
</p>

<p>
There is however an easy way to replace the OpenGL software renderer that comes
with Microsoft Windows 7 (or later) with llvmpipe (that is, on systems without
any OpenGL drivers):
</p>

<ul>
  <li><p>Copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>
  <li><p>Load these registry settings:</p>
  <pre>REGEDIT4

; http://technet.microsoft.com/en-us/library/cc749368.aspx
; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
"DLL"="mesadrv.dll"
"DriverVersion"=dword:00000001
"Flags"=dword:00000001
"Version"=dword:00000002
</pre>
  </li>
  <li>Do the same for the 64-bit drivers if you need them.</li>
</ul>


<h1>Profiling</h1>

<p>
To profile llvmpipe you should build as
</p>
<pre>
  scons build=profile &lt;same-as-before&gt;
</pre>

<p>
This will ensure that frame pointers are used both in C and JIT functions, and
that no tail call optimizations are done by gcc.
</p>

<h2>Linux perf integration</h2>

<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
</p>

<pre>
  perf record -g /my/application
  perf report
</pre>

<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
a symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce disassembly of
the generated code annotated with the samples.
</p>

<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>


<h1>Unit testing</h1>

<p>
Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:
</p>

<ul>
<li> lp_test_blend: blending</li>
<li> lp_test_conv: SIMD vector conversion</li>
<li> lp_test_format: pixel unpacking/packing</li>
</ul>

<p>
Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
</pre>


<h1>Development Notes</h1>

<ul>
<li>
  When looking at this code for the first time, start in lp_state_fs.c, and
  then skim through the lp_bld_* functions called in there, and the comments
  at the top of the lp_bld_*.c files.
</li>
<li>
  The driver-independent parts of the LLVM / Gallium code are found in
  src/gallium/auxiliary/gallivm/.  The filenames and function prefixes
  need to be renamed from "lp_bld_" to something else though.
</li>
<li>
  We use LLVM-C bindings for now.  They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation.  See
  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.
</li>
</ul>

<h1 id="recommended_reading">Recommended Reading</h1>

<ul>
  <li>
    <p>Rasterization</p>
    <ul>
      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
    </ul>
  </li>
  <li>
    <p>Texture
sampling</p>
    <ul>
      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>
    </ul>
  </li>
  <li>
    <p>SIMD</p>
    <ul>
      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>
    </ul>
  </li>
  <li>
    <p>Optimization</p>
    <ul>
      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
    </ul>
  </li>
  <li>
    <p>LLVM</p>
    <ul>
      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
    </ul>
  </li>
  <li>
    <p>General</p>
    <ul>
      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
    </ul>
  </li>
</ul>

</div>
</body>
</html>