<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  <title>llvmpipe</title>
  <link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>

<div class="header">
  <h1>The Mesa 3D Graphics Library</h1>
</div>

<iframe src="contents.html"></iframe>
<div class="content">

<h1>Introduction</h1>

<p>
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR which is translated to x86 or x86-64 machine
code.
Also, the driver is multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
It's the fastest software rasterizer for Mesa.
</p>


<h1>Requirements</h1>

<ul>
<li>
   <p>An x86 or amd64 processor; 64-bit mode recommended.</p>
   <p>
   Support for SSE2 is strongly encouraged.  Support for SSSE3 and SSE4.1 will
   yield the most efficient code.  The fewer features the CPU has, the more
   likely it is that you will run into underperforming, buggy, or incomplete
   code.
   </p>
   <p>
   See /proc/cpuinfo to find out what your CPU supports.
   </p>
</li>
<li>
   <p>LLVM: version 3.4 recommended; 3.1 or later required.</p>
   <p>
   On Linux, on a recent Debian-based distribution, run:
   </p>
<pre>
     aptitude install llvm-dev
</pre>
   <p>
   On an RPM-based distribution, run:
   </p>
<pre>
     yum install llvm-devel
</pre>

   <p>
   On Windows you will need to build LLVM from source with MSVC or MinGW
   (either natively or through cross compilers) and CMake, and set the LLVM
   environment variable to the directory you installed it to.
   </p>
   <p>
   LLVM will be statically linked, so when building with MSVC it needs to be
   built with the same CRT as Mesa: pass
   -DLLVM_USE_CRT_DEBUG=MTd for debug and checked builds, and
   -DLLVM_USE_CRT_RELEASE=MT for profile and release builds.
   </p>
   <p>
   You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86
   to cmake.
   </p>
</li>

<li>
   <p>scons (optional)</p>
</li>
</ul>
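<p>
The CPU feature check mentioned in the requirements above can be scripted.
The following is a minimal sketch, assuming a Linux-style cpuinfo file with a
"flags" line; <code>check_simd</code> is a hypothetical helper name and is not
part of Mesa.
</p>

```shell
# Hypothetical helper: report whether each SIMD extension named in the
# requirements appears in a cpuinfo-style file.
# $1: file to read; defaults to /proc/cpuinfo (an assumption about the host).
check_simd() {
    cpuinfo="${1:-/proc/cpuinfo}"
    for flag in sse2 ssse3 sse4_1; do
        # Only the first "flags" line is inspected; -w matches whole words,
        # so a plain "sse" entry is not mistaken for "sse2".
        if grep '^flags' "$cpuinfo" | head -n 1 | grep -qw "$flag"; then
            echo "$flag: yes"
        else
            echo "$flag: no"
        fi
    done
}
```

<p>
Calling <code>check_simd</code> with no argument reads /proc/cpuinfo.  On
systems whose cpuinfo has no "flags" line every extension is reported as "no",
which says nothing about the CPU itself.
</p>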


<h1>Building</h1>

<p>To build everything on Linux, invoke scons as:</p>

<pre>
  scons build=debug libgl-xlib
</pre>

<p>Alternatively, you can build it with GNU make by invoking it as</p>

<pre>
  make linux-llvm
</pre>

<p>but the rest of these instructions assume that scons is used.</p>

<p>On Windows the procedure is similar, except for the target:</p>

<pre>
  scons platform=windows build=debug libgl-gdi
</pre>


<h1>Using</h1>

<h2>Linux</h2>

<p>On Linux, building will create a drop-in alternative for libGL.so in</p>

<pre>
  build/foo/gallium/targets/libgl-xlib/libGL.so
</pre>
<p>or</p>
<pre>
  lib/gallium/libGL.so
</pre>

<p>To use it, set the LD_LIBRARY_PATH environment variable accordingly.</p>

<p>For performance evaluation, pass build=release to scons, and use the corresponding
lib directory without the "-debug" suffix.</p>
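<p>
As an illustration of setting LD_LIBRARY_PATH for a single command, here is a
minimal sketch.  Both function names are hypothetical, and the
"build/linux-MACHINE-BUILD" layout is an assumption inferred from the paths
shown elsewhere on this page; adjust it to whatever your build produced.
</p>

```shell
# Hypothetical helper: directory assumed to hold the scons-built libGL.so.
# $1: machine as it appears in the build path (for example x86_64)
# $2: scons build type (debug, release, ...)
libgl_dir() {
    suffix="-$2"
    # Release builds are described above as having no build-type suffix.
    if [ "$2" = release ]; then
        suffix=""
    fi
    echo "build/linux-$1$suffix/gallium/targets/libgl-xlib"
}

# Hypothetical helper: run a command with the given directory placed first in
# LD_LIBRARY_PATH, keeping any existing entries after it.
# $1: library directory; remaining arguments: the command to run.
with_llvmpipe() {
    libdir="$1"
    shift
    env LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}
```

<p>
For example, running <code>with_llvmpipe "$(libgl_dir x86_64 debug)" glxinfo</code>
from the top of the Mesa tree should mention llvmpipe in the renderer string
if the library was picked up; this assumes glxinfo is installed.
</p>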


<h2>Windows</h2>

<p>
On Windows, building will create
<code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>,
which is a drop-in alternative for the system's <code>opengl32.dll</code>.  To
use it, put it in the same directory as your application.  It can also be used
by replacing the native ICD driver, but that is an advanced use; if you need to
ask how, don't try it.
</p>

<p>
There is, however, an easy way to replace the OpenGL software renderer that comes
with Microsoft Windows 7 (or later) with llvmpipe (that is, on systems without
any OpenGL drivers):
</p>

<ul>
  <li><p>Copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>
  <li><p>Load these registry settings:</p>
  <pre>REGEDIT4

; http://technet.microsoft.com/en-us/library/cc749368.aspx
; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
"DLL"="mesadrv.dll"
"DriverVersion"=dword:00000001
"Flags"=dword:00000001
"Version"=dword:00000002
</pre>
  </li>
  <li><p>Do the same for the 64-bit drivers if you need them.</p></li>
</ul>


<h1>Profiling</h1>

<p>
To profile llvmpipe, build with
</p>
<pre>
  scons build=profile &lt;same-as-before&gt;
</pre>

<p>
This will ensure that frame pointers are used both in C and JIT functions, and
that no tail call optimizations are done by gcc.
</p>

<h2>Linux perf integration</h2>

<p>
On Linux, <a href="http://perf.wiki.kernel.org/">Linux perf</a> can resolve symbols in JIT code:
</p>

<pre>
  perf record -g /my/application
  perf report
</pre>

<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
a symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce disassembly of
the generated code annotated with the samples.
</p>
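<p>
To show how the symbol address table can be read by hand, here is a minimal
sketch.  It assumes the usual perf map layout, one "START SIZE NAME" entry per
line with START and SIZE in hexadecimal without a 0x prefix;
<code>jit_symbol_at</code> is a hypothetical helper, not something shipped
with Mesa or perf.
</p>

```shell
# Hypothetical helper: print the name of the JIT symbol whose address range
# contains a given address.
# $1: path to a perf map file; $2: address, for example 0x7f0000001010.
# Returns non-zero if no entry covers the address.
jit_symbol_at() {
    map="$1"
    addr=$(($2))
    while read -r start size name; do
        # Skip blank or incomplete lines.
        [ -n "$size" ] || continue
        s=$((0x$start))
        e=$((s + 0x$size))
        # Each entry covers the half-open range [START, START + SIZE).
        if [ "$addr" -ge "$s" ] && [ "$addr" -lt "$e" ]; then
            echo "$name"
            return 0
        fi
    done < "$map"
    return 1
}
```

<p>
This is a linear scan, which is fine for occasional lookups; perf itself
performs the same mapping when it resolves samples for the report.
</p>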

<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>


<h1>Unit testing</h1>

<p>
Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:
</p>

<ul>
<li> lp_test_blend: blending</li>
<li> lp_test_conv: SIMD vector conversion</li>
<li> lp_test_format: pixel unpacking/packing</li>
</ul>

<p>
Some of these tests can write results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
</pre>
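<p>
The three tests listed above can be run in one go.  The sketch below assumes
each test program signals failure through a non-zero exit status, which is not
stated on this page; <code>run_lp_tests</code> is a hypothetical helper name.
</p>

```shell
# Hypothetical helper: run the llvmpipe unit tests found in a build directory
# and print one result line per test.
# $1: directory containing the lp_test_* programs.
# Returns non-zero if any test failed or could not be run.
run_lp_tests() {
    dir="$1"
    failed=0
    for t in lp_test_blend lp_test_conv lp_test_format; do
        # Test output is discarded; only the exit status is used.
        if "$dir/$t" >/dev/null 2>&1; then
            echo "$t: ok"
        else
            echo "$t: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

<p>
For example, <code>run_lp_tests build/linux-x86_64-debug/gallium/drivers/llvmpipe</code>
uses the same directory as the command shown above.
</p>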


<h1>Development Notes</h1>

<ul>
<li>
  When looking at this code for the first time, start in lp_state_fs.c, and
  then skim through the lp_bld_* functions called in there, and the comments
  at the top of the lp_bld_*.c files.
</li>
<li>
  The driver-independent parts of the LLVM / Gallium code are found in
  src/gallium/auxiliary/gallivm/.  The file names and function prefixes
  there still use "lp_bld_" and need to be renamed to something else.
</li>
<li>
  We use LLVM-C bindings for now. They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation. See
  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.
</li>
</ul>

<h1 id="recommended_reading">Recommended Reading</h1>

<ul>
  <li>
    <p>Rasterization</p>
    <ul>
      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
    </ul>
  </li>
  <li>
    <p>Texture sampling</p>
    <ul>
      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>
    </ul>
  </li>
  <li>
    <p>SIMD</p>
    <ul>
      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>
    </ul>
  </li>
  <li>
    <p>Optimization</p>
    <ul>
      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
    </ul>
  </li>
  <li>
    <p>LLVM</p>
    <ul>
      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
    </ul>
  </li>
  <li>
    <p>General</p>
    <ul>
      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
    </ul>
  </li>
</ul>

</div>
</body>
</html>