=======
ThinLTO
=======

.. contents::
   :local:

Introduction
============

*ThinLTO* compilation is a new type of LTO that is both scalable and
incremental. *LTO* (Link Time Optimization) achieves better
runtime performance through whole-program analysis and cross-module
optimization. However, monolithic LTO implements this by merging all
input into a single module, which is not scalable
in time or memory, and also prevents fast incremental compiles.

In ThinLTO mode, as with regular LTO, clang emits LLVM bitcode after the
compile phase. The ThinLTO bitcode is augmented with a compact summary
of the module. During the link step, only the summaries are read and
merged into a combined summary index, which includes an index of function
locations for later cross-module function importing. Fast and efficient
whole-program analysis is then performed on the combined summary index.

However, all transformations, including function importing, occur
later when the modules are optimized in fully parallel backends.
By default, linkers_ that support ThinLTO are set up to launch
the ThinLTO backends in threads. So the usage model is not affected
as the distinction between the fast serial thin link step and the backends
is transparent to the user.

For more information on the ThinLTO design and current performance,
see the LLVM blog post `ThinLTO: Scalable and Incremental LTO
<http://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html>`_.
While tuning is still in progress, results in the blog post show that
ThinLTO already performs well compared to LTO, in many cases matching
the performance improvement.

Current Status
==============

Clang/LLVM
----------
.. _compiler:

The 3.9 release of clang includes ThinLTO support. However, ThinLTO
is under active development, and new features, improvements and bugfixes
are being added for the next release. For the latest ThinLTO support,
`build a recent version of clang and LLVM
<https://llvm.org/docs/CMake.html>`_.

Linkers
-------
.. _linkers:
.. _linker:

ThinLTO is currently supported for the following linkers:

- **gold (via the gold-plugin)**:
  Similar to monolithic LTO, this requires using
  a `gold linker configured with plugins enabled
  <https://llvm.org/docs/GoldPlugin.html>`_.
- **ld64**:
  Starting with `Xcode 8 <https://developer.apple.com/xcode/>`_.
- **lld**:
  Starting with r284050 for ELF, r298942 for COFF.

Usage
=====

Basic
-----

To use ThinLTO, add the ``-flto=thin`` option to both the compile and link
steps. For example:

.. code-block:: console

  % clang -flto=thin -O2 file1.c file2.c -c
  % clang -flto=thin -O2 file1.o file2.o -o a.out

When using lld-link, the ``-flto`` option need only be added to the compile
step:

.. code-block:: console

  % clang-cl -flto=thin -O2 -c file1.c file2.c
  % lld-link /out:a.exe file1.obj file2.obj

As mentioned earlier, by default the linkers will launch the ThinLTO backend
threads in parallel, passing the resulting native object files back to the
linker for the final native link. As such, the usage model is the same as
non-LTO.

With gold, if you see an error during the link of the form:

.. code-block:: console

  /usr/bin/ld: error: /path/to/clang/bin/../lib/LLVMgold.so: could not load plugin library: /path/to/clang/bin/../lib/LLVMgold.so: cannot open shared object file: No such file or directory

Then either gold was not configured with plugins enabled, or clang
was not built with ``-DLLVM_BINUTILS_INCDIR`` set properly. See
the instructions for the
`LLVM gold plugin <https://llvm.org/docs/GoldPlugin.html#how-to-build-it>`_.

Controlling Backend Parallelism
-------------------------------
.. _parallelism:

By default, the ThinLTO link step will launch as many
threads in parallel as there are cores. If the number of
cores can't be computed for the architecture, then it will launch
``std::thread::hardware_concurrency`` number of threads in parallel.
For machines with hyper-threading, this is the total number of
virtual cores.
For some applications and machine configurations this
may be too aggressive, in which case the amount of parallelism can
be reduced to ``N`` via:

- gold:
  ``-Wl,-plugin-opt,jobs=N``
- ld64:
  ``-Wl,-mllvm,-threads=N``
- lld:
  ``-Wl,--thinlto-jobs=N``
- lld-link:
  ``/opt:lldltojobs=N``

Other possible values for ``N`` are:

- 0:
  Use one thread per physical core (default)
- 1:
  Use a single thread only (disable multi-threading)
- all:
  Use one thread per logical core (uses all hyper-threads)

Incremental
-----------
.. _incremental:

ThinLTO supports fast incremental builds through the use of a cache,
which currently must be enabled through a linker option.

- gold (as of LLVM 4.0):
  ``-Wl,-plugin-opt,cache-dir=/path/to/cache``
- ld64 (support in clang 3.9 and Xcode 8):
  ``-Wl,-cache_path_lto,/path/to/cache``
- ELF lld (as of LLVM 5.0):
  ``-Wl,--thinlto-cache-dir=/path/to/cache``
- COFF lld-link (as of LLVM 6.0):
  ``/lldltocache:/path/to/cache``

Cache Pruning
-------------

To help keep the size of the cache under control, ThinLTO supports cache
pruning. Cache pruning is supported with gold, ld64 and ELF and COFF lld, but
currently only gold, ELF and COFF lld allow you to control the policy with a
policy string. The cache policy must be specified with a linker option.


- gold (as of LLVM 6.0):
  ``-Wl,-plugin-opt,cache-policy=POLICY``
- ELF lld (as of LLVM 5.0):
  ``-Wl,--thinlto-cache-policy,POLICY``
- COFF lld-link (as of LLVM 6.0):
  ``/lldltocachepolicy:POLICY``

A policy string is a series of key-value pairs separated by ``:`` characters.
Possible key-value pairs are:

- ``cache_size=X%``: The maximum size for the cache directory is ``X`` percent
  of the available space on the disk. Set to 100 to indicate no limit,
  50 to indicate that the cache may not grow beyond half the available
  disk space. A value over 100 is invalid. A value of 0 disables the percentage
  size-based pruning. The default is 75%.

- ``cache_size_bytes=X``, ``cache_size_bytes=Xk``, ``cache_size_bytes=Xm``,
  ``cache_size_bytes=Xg``:
  Sets the maximum size for the cache directory to ``X`` bytes (or KB, MB,
  GB respectively). A value over the amount of available space on the disk
  will be reduced to the amount of available space. A value of 0 disables
  the byte size-based pruning. The default is no byte size-based pruning.

  Note that ThinLTO will apply both size-based pruning policies simultaneously,
  and changing one does not affect the other. For example, a policy of
  ``cache_size_bytes=1g`` on its own will cause both the 1GB and default 75%
  policies to be applied unless the default ``cache_size`` is overridden.

- ``cache_size_files=X``:
  Set the maximum number of files in the cache directory. Set to 0 to indicate
  no limit. The default is 1000000 files.


- ``prune_after=Xs``, ``prune_after=Xm``, ``prune_after=Xh``: Sets the
  expiration time for cache files to ``X`` seconds (or minutes, hours
  respectively). When a file hasn't been accessed for ``prune_after`` seconds,
  it is removed from the cache. A value of 0 disables the expiration-based
  pruning. The default is 1 week.

- ``prune_interval=Xs``, ``prune_interval=Xm``, ``prune_interval=Xh``:
  Sets the pruning interval to ``X`` seconds (or minutes, hours
  respectively). This is intended to be used to avoid scanning the directory
  too often. It does not impact the decision of which files to prune. A
  value of 0 forces the scan to occur. The default is every 20 minutes.

Clang Bootstrap
---------------

To `bootstrap clang/LLVM <https://llvm.org/docs/AdvancedBuilds.html#bootstrap-builds>`_
with ThinLTO, follow these steps:

1. The host compiler_ must be a version of clang that supports ThinLTO.
#. The host linker_ must support ThinLTO (and in the case of gold, must be
   `configured with plugins enabled <https://llvm.org/docs/GoldPlugin.html>`_).
#. Use the following additional `CMake variables
   <https://llvm.org/docs/CMake.html#options-and-variables>`_
   when configuring the bootstrap compiler build:

   * ``-DLLVM_ENABLE_LTO=Thin``
   * ``-DCMAKE_C_COMPILER=/path/to/host/clang``
   * ``-DCMAKE_CXX_COMPILER=/path/to/host/clang++``
   * ``-DCMAKE_RANLIB=/path/to/host/llvm-ranlib``
   * ``-DCMAKE_AR=/path/to/host/llvm-ar``

   Or, on Windows:

   * ``-DLLVM_ENABLE_LTO=Thin``
   * ``-DCMAKE_C_COMPILER=/path/to/host/clang-cl.exe``
   * ``-DCMAKE_CXX_COMPILER=/path/to/host/clang-cl.exe``
   * ``-DCMAKE_LINKER=/path/to/host/lld-link.exe``
   * ``-DCMAKE_RANLIB=/path/to/host/llvm-ranlib.exe``
   * ``-DCMAKE_AR=/path/to/host/llvm-ar.exe``

#. To use additional linker arguments for controlling the backend
   parallelism_ or enabling incremental_ builds of the bootstrap compiler,
   after configuring the build, modify the resulting CMakeCache.txt file in the
   build directory. Specify any additional linker options after
   ``CMAKE_EXE_LINKER_FLAGS:STRING=``. Note that the configure step may fail if
   linker plugin options are instead specified directly in the previous step.

Setting ``BOOTSTRAP_LLVM_ENABLE_LTO=Thin`` enables ThinLTO for stage 2 and
stage 3 in case the compiler used for stage 1 does not support the ThinLTO
option.

More Information
================

* From LLVM project blog:
  `ThinLTO: Scalable and Incremental LTO
  <http://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html>`_
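
As a rough illustration of the policy string format described under Cache
Pruning, the sketch below splits a policy string into its key-value pairs and
converts a ``prune_after`` or ``prune_interval`` value to seconds. It is not
the parser used by LLVM or any linker. The helper names
``parse_cache_policy`` and ``duration_to_seconds`` are hypothetical, and the
sketch assumes every duration carries an explicit ``s``, ``m`` or ``h``
suffix.

.. code-block:: python

  # Hypothetical helpers for illustration only; not part of LLVM.

  # Keys documented in the Cache Pruning section.
  KNOWN_KEYS = {
      "cache_size",
      "cache_size_bytes",
      "cache_size_files",
      "prune_after",
      "prune_interval",
  }

  # Seconds per documented duration suffix.
  UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


  def parse_cache_policy(policy):
      """Split a ':'-separated policy string into a dict of raw strings."""
      pairs = {}
      for item in policy.split(":"):
          key, sep, value = item.partition("=")
          if not sep or not value or key not in KNOWN_KEYS:
              raise ValueError("malformed policy entry: %r" % item)
          pairs[key] = value
      return pairs


  def duration_to_seconds(value):
      """Convert 'Xs', 'Xm' or 'Xh' to seconds (suffix assumed required)."""
      if len(value) < 2:
          raise ValueError("malformed duration: %r" % value)
      number, unit = value[:-1], value[-1]
      if unit not in UNIT_SECONDS or not number.isdigit():
          raise ValueError("malformed duration: %r" % value)
      return int(number) * UNIT_SECONDS[unit]


  policy = parse_cache_policy("cache_size=50%:cache_size_bytes=1g:prune_after=2h")
  print(policy["cache_size"], duration_to_seconds(policy["prune_after"]))

The same string would be handed to the linker unchanged, for example as
``POLICY`` in ``-Wl,--thinlto-cache-policy,POLICY`` with ELF lld.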