.. Copyright (C) 2015-2022 Free Software Foundation, Inc.
   Originally contributed by David Malcolm <dmalcolm@redhat.com>

   This is free software: you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see
   <https://www.gnu.org/licenses/>.

Tutorial part 5: Implementing an Ahead-of-Time compiler
-------------------------------------------------------

If you have a pre-existing language frontend that's compatible with
libgccjit's license, it's possible to hook it up to libgccjit as a
backend.  In the previous example we showed
how to do that for in-memory JIT-compilation, but libgccjit can also
compile code directly to a file, allowing you to implement a more
traditional ahead-of-time compiler ("JIT" is something of a misnomer
for this use-case).

The essential difference is to compile the context using
:c:func:`gcc_jit_context_compile_to_file` rather than
:c:func:`gcc_jit_context_compile`.

The "brainf" language
*********************

In this example we use libgccjit to construct an ahead-of-time compiler
for an esoteric programming language that we shall refer to as "brainf".

brainf scripts operate on an array of bytes, with a notional data pointer
within the array.

brainf is hard for humans to read, but it's trivial to write a parser for
it, as there is no lexing; just a stream of bytes.  The operations are:

====================== =============================
Character              Meaning
====================== =============================
``>``                  ``idx += 1``
``<``                  ``idx -= 1``
``+``                  ``data[idx] += 1``
``-``                  ``data[idx] -= 1``
``.``                  ``output (data[idx])``
``,``                  ``data[idx] = input ()``
``[``                  loop until ``data[idx] == 0``
``]``                  end of loop
Anything else          ignored
====================== =============================

Unlike the previous example, we'll implement an ahead-of-time compiler,
which reads ``.bf`` scripts and outputs executables (though it would
be trivial to have it run them JIT-compiled in-process).

Here's what a simple ``.bf`` script looks like:

   .. literalinclude:: ../examples/emit-alphabet.bf
    :lines: 1-

.. note::

   This example makes use of whitespace and comments for legibility, but
   could have been written as::

     ++++++++++++++++++++++++++
     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
     [>.+<-]

   It's not a particularly useful language, except for providing
   compiler-writers with a test case that's easy to parse.  The point
   is that you can use :c:func:`gcc_jit_context_compile_to_file`
   to use libgccjit as a backend for a pre-existing language frontend
   (provided that the pre-existing frontend is compatible with libgccjit's
   license).

Converting a brainf script to libgccjit IR
******************************************

As before we write simple code to populate a :c:type:`gcc_jit_context *`.

   .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #define MAX_OPEN_PARENS 16
    :end-before: /* Entrypoint to the compiler.  */
    :language: c

Compiling a context to a file
*****************************

Unlike the previous tutorial, this time we'll compile the context
directly to an executable, using :c:func:`gcc_jit_context_compile_to_file`:

.. code-block:: c

    gcc_jit_context_compile_to_file (ctxt,
                                     GCC_JIT_OUTPUT_KIND_EXECUTABLE,
                                     output_file);

Here's the top-level of the compiler, which is what actually calls into
:c:func:`gcc_jit_context_compile_to_file`:

 .. literalinclude:: ../examples/tut05-bf.c
    :start-after: /* Entrypoint to the compiler.  */
    :end-before: /* Use the built compiler to compile the example to an executable:
    :language: c

Note how once the context is populated you could trivially instead compile
it to memory using :c:func:`gcc_jit_context_compile` and run it in-process
as in the previous tutorial.

To create an executable, we need to export a ``main`` function.  Here's
how to create one from the JIT API:

 .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #include "libgccjit.h"
    :end-before: #define MAX_OPEN_PARENS 16
    :language: c

.. note::

   The above implementation ignores ``argc`` and ``argv``, but you could
   make use of them by exposing ``param_argc`` and ``param_argv`` to the
   caller.

Upon compiling this C code, we obtain a bf-to-machine-code compiler;
let's call it ``bfc``:

.. code-block:: console

  $ gcc \
      tut05-bf.c \
      -o bfc \
      -lgccjit

We can now use ``bfc`` to compile .bf files into machine code executables:

.. code-block:: console

  $ ./bfc \
       emit-alphabet.bf \
       a.out

which we can run directly:

.. code-block:: console

  $ ./a.out
  ABCDEFGHIJKLMNOPQRSTUVWXYZ

Success!

We can also inspect the generated executable using standard tools:

.. code-block:: console

  $ objdump -d a.out |less

which shows that libgccjit has managed to optimize the function
somewhat (for example, the runs of 26 and 65 increment operations
have become integer constants 0x1a and 0x41):

.. code-block:: console

  0000000000400620 <main>:
    400620:     80 3d 39 0a 20 00 00    cmpb   $0x0,0x200a39(%rip)        # 601060 <data
    400627:     74 07                   je     400630 <main
    400629:     eb fe                   jmp    400629 <main+0x9>
    40062b:     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    400630:     48 83 ec 08             sub    $0x8,%rsp
    400634:     0f b6 05 26 0a 20 00    movzbl 0x200a26(%rip),%eax        # 601061 <data_cells+0x1>
    40063b:     c6 05 1e 0a 20 00 1a    movb   $0x1a,0x200a1e(%rip)       # 601060 <data_cells>
    400642:     8d 78 41                lea    0x41(%rax),%edi
    400645:     40 88 3d 15 0a 20 00    mov    %dil,0x200a15(%rip)        # 601061 <data_cells+0x1>
    40064c:     0f 1f 40 00             nopl   0x0(%rax)
    400650:     40 0f b6 ff             movzbl %dil,%edi
    400654:     e8 87 fe ff ff          callq  4004e0 <putchar@plt>
    400659:     0f b6 05 01 0a 20 00    movzbl 0x200a01(%rip),%eax        # 601061 <data_cells+0x1>
    400660:     80 2d f9 09 20 00 01    subb   $0x1,0x2009f9(%rip)        # 601060 <data_cells>
    400667:     8d 78 01                lea    0x1(%rax),%edi
    40066a:     40 88 3d f0 09 20 00    mov    %dil,0x2009f0(%rip)        # 601061 <data_cells+0x1>
    400671:     75 dd                   jne    400650 <main+0x30>
    400673:     31 c0                   xor    %eax,%eax
    400675:     48 83 c4 08             add    $0x8,%rsp
    400679:     c3                      retq
    40067a:     66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

We also set up debugging information (via
:c:func:`gcc_jit_context_new_location` and
:c:macro:`GCC_JIT_BOOL_OPTION_DEBUGINFO`), so it's possible to use ``gdb``
to singlestep through the generated binary and inspect the internal
state ``idx`` and ``data_cells``:

.. code-block:: console

  (gdb) break main
  Breakpoint 1 at 0x400790
  (gdb) run
  Starting program: a.out

  Breakpoint 1, 0x0000000000400790 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x0000000000400797 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x00000000004007a0 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13     >
  (gdb) n
  6     ++++++++++++++++++++++++++
  (gdb) n
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) p idx
  $1 = 1
  (gdb) p data_cells
  $2 = "\032", '\000' <repeats 29998 times>
  (gdb) p data_cells[0]
  $3 = 26 '\032'
  (gdb) p data_cells[1]
  $4 = 0 '\000'
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13     >


Other forms of ahead-of-time-compilation
****************************************

The above demonstrates compiling a :c:type:`gcc_jit_context *` directly
to an executable.  It's also possible to compile it to an object file,
and to a dynamic library.  See the documentation of
:c:func:`gcc_jit_context_compile_to_file` for more information.