The following text is a brief overview of the key
principles that are useful to know when generating code
with SLJIT. Further details can be found in sljitLir.h.

----------------------------------------------------------------
  What is SLJIT?
----------------------------------------------------------------

SLJIT is a platform independent assembler which
  - provides access to common CPU features
  - can be easily ported to widespread CPU
    architectures (e.g. x86, ARM, POWER, MIPS, SPARC)

The key challenge of this project is finding a common
subset of CPU features which
  - covers traditional assembly level programming
  - can be translated to machine code efficiently

This aim is achieved by selecting those instructions / CPU
features which are either available on all platforms or
can be simulated with a low performance overhead.

For example, some SLJIT instructions support base register
pre-update when the [base+offs] memory addressing mode is
used. Although this feature is only available on ARM and
POWER CPUs, the simulation overhead is low on other CPUs.

----------------------------------------------------------------
  The generic CPU model of SLJIT
----------------------------------------------------------------

The CPU has
  - integer registers, which can store either an
    int32_t (4 byte) or intptr_t (4 or 8 byte) value
  - floating point registers, which can store either a
    single (4 byte) or double (8 byte) precision value
  - boolean status flags

*** Integer registers:

The most important rule is: when a source operand of
an instruction is a register, the data type of the
register must match the data type expected by the
instruction.

For example, the following code snippet
is a valid instruction sequence:

    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An int32_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // The int32_t value in SLJIT_R0 is negated
    // and the type of the result is still int32_t

The next code snippet is not allowed:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // The result of the SLJIT_INEG instruction
    // is undefined. Even a crash is possible
    // (e.g. on MIPS-64).

However, it is always allowed to overwrite a
register regardless of its previous value:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R2), 0);
    // From now on SLJIT_R0 contains an int32_t
    // value. The previous value is discarded.

Type conversion instructions are provided to convert an
int32_t value to an intptr_t value and vice versa. On
certain architectures these conversions are nops (no
instructions are emitted).

Memory accessing:

Register arguments of the SLJIT_MEM1 / SLJIT_MEM2
addressing modes must contain intptr_t data.

Signed / unsigned values:

Most operations are executed in the same way regardless
of whether the value is signed or unsigned. These
operations have only one instruction form (e.g. SLJIT_ADD /
SLJIT_MUL). Instructions where the result depends on the
sign have two forms (e.g. integer division, long multiply).

*** Floating point registers:

Floating point registers can either contain a single
or double precision value.
Similar to integer registers, the data type of the value
stored in a source register must match the data type
expected by the instruction. Otherwise the result is
undefined (even a crash is possible).

Rounding:

Similar to standard C, floating point computation
results are rounded toward zero.

*** Boolean status flags:

Conditional branches usually depend on the value
of CPU status flags. These status flags are boolean
values and can be set by certain instructions.

To achieve maximum efficiency and portability, the
following rules were introduced:
  - Most instructions can freely modify these status
    flags unless SLJIT_KEEP_FLAGS is passed.
  - The SLJIT_KEEP_FLAGS option may have a performance
    overhead, so it should only be used when necessary.
  - The SLJIT_SET_E, SLJIT_SET_U, etc. options can
    force an instruction to correctly set the
    specified status flags. However, all other
    status flags are undefined. This rule must
    always be kept in mind!
  - Status flags cannot be controlled directly
    (there are no set/clear/invert operations).

The last two rules allow efficient mapping of status flags.
For example, the arithmetic and multiply overflow flags are
mapped to the same overflow flag bit on x86. This is allowed,
since no instruction can set both of these flags. When
either of them is set by an instruction, the other can
have any value (this satisfies the "all other flags are
undefined" rule). Therefore mapping two SLJIT flags to the
same CPU flag is possible. Even though SLJIT supports
a dozen status flags, they can be efficiently mapped
to CPUs with only 4 status flags (e.g. ARM or SPARC).
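
The flag sharing argument above can be illustrated with a
small, self-contained C model. This is only a sketch and not
SLJIT code: the enum names, the flag_state structure and the
helper functions are all hypothetical and were invented for
this illustration.

```c
#include <stddef.h>

/* Hypothetical abstract flags, named for this sketch only. */
enum abstract_flag {
    FLAG_EQUAL,
    FLAG_CARRY,
    FLAG_OVERFLOW,      /* arithmetic overflow */
    FLAG_MUL_OVERFLOW,  /* multiply overflow */
    FLAG_COUNT
};

/* Hypothetical status bits of an x86-like host CPU. */
enum host_bit {
    HOST_ZERO     = 1 << 0,
    HOST_CARRY    = 1 << 1,
    HOST_OVERFLOW = 1 << 2
};

/* Each abstract flag maps to one host bit. The two
   overflow flags deliberately share the same bit. */
static const unsigned host_bit_of[FLAG_COUNT] = {
    HOST_ZERO,      /* FLAG_EQUAL */
    HOST_CARRY,     /* FLAG_CARRY */
    HOST_OVERFLOW,  /* FLAG_OVERFLOW */
    HOST_OVERFLOW   /* FLAG_MUL_OVERFLOW */
};

struct flag_state {
    unsigned host_bits;          /* raw host status bits */
    enum abstract_flag defined;  /* the only defined flag */
};

/* Models an instruction that was asked to set one flag:
   that flag becomes defined, all others are undefined. */
static void set_flag(struct flag_state *st,
    enum abstract_flag flag, int value)
{
    if (value)
        st->host_bits |= host_bit_of[flag];
    else
        st->host_bits &= ~host_bit_of[flag];
    st->defined = flag;
}

/* Returns 0 or 1 for the defined flag, or -1 when the
   requested flag is undefined under the rule above. */
static int read_flag(const struct flag_state *st,
    enum abstract_flag flag)
{
    if (st->defined != flag)
        return -1;
    return (st->host_bits & host_bit_of[flag]) != 0;
}

/* Counts the distinct host bits used by the mapping. */
static int host_bits_used(void)
{
    unsigned all = 0;
    int count = 0;
    size_t i;

    for (i = 0; i < FLAG_COUNT; i++)
        all |= host_bit_of[i];
    while (all) {
        count += (int)(all & 1u);
        all >>= 1;
    }
    return count;
}
```

In this model four abstract flags need only three host bits.
Sharing is safe because a reader may only rely on the flag
that the last instruction was asked to set.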

----------------------------------------------------------------
  Complex instructions
----------------------------------------------------------------

We noticed that introducing complex instructions for common
tasks can improve performance. For example, compare and
branch instruction sequences can be optimized if certain
conditions apply, but these conditions depend on the target
CPU. SLJIT can do these optimizations, but it needs to
understand the "purpose" of the generated code. However,
static instruction analysis has a large performance
overhead, so we chose another approach: we introduced
complex instruction forms for certain non-atomic tasks.
SLJIT can optimize these "instructions" more efficiently
since the "purpose" is known to the compiler. These complex
instruction forms can often be assembled from other SLJIT
instructions, but we recommend using them since the
compiler can optimize them on certain CPUs.

----------------------------------------------------------------
  Generating functions
----------------------------------------------------------------

SLJIT is often used for generating function bodies which are
called from C. SLJIT provides two complex instructions for
generating function entry and return: sljit_emit_enter and
sljit_emit_return. sljit_emit_enter also initializes the
"compiling context", which specifies the current register
mapping, local space size and other settings.
sljit_set_context can also set this context without emitting
any machine instructions.

This context is important since it affects the compiler, so
the first instruction after a compiler is created must be
either sljit_emit_enter or sljit_set_context. The context can
be changed by calling sljit_emit_enter or sljit_set_context
again.
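
A minimal sketch of a generated function follows. It rests
on several assumptions that should be checked against the
sljitLir.h in use: that sljit_create_compiler takes no
arguments (some revisions take an allocator data pointer),
that sljit_emit_enter takes the arguments options, args,
scratches, saveds, fscratches, fsaveds and local_size in
this order, and that incoming arguments arrive in SLJIT_S0
and SLJIT_S1. The names add2_func and build_add2 are
hypothetical.

```c
#include <stddef.h>
#include "sljitLir.h"

/* Signature of the generated function (hypothetical name). */
typedef sljit_sw SLJIT_CALL (*add2_func)(sljit_sw a, sljit_sw b);

/* Returns executable code computing "a + b", or NULL on
   failure. The caller releases it with sljit_free_code(). */
static void *build_add2(void)
{
    void *code;
    /* Assumption: no-argument form of sljit_create_compiler. */
    struct sljit_compiler *compiler = sljit_create_compiler();

    if (!compiler)
        return NULL;

    /* Must come first: emits the function entry and sets the
       compiling context. Assumed argument order: options,
       args, scratches, saveds, fscratches, fsaveds,
       local_size. */
    sljit_emit_enter(compiler, 0, 2, 1, 2, 0, 0, 0);

    /* Assumption: the two arguments are in SLJIT_S0 / S1. */
    sljit_emit_op2(compiler, SLJIT_ADD,
        SLJIT_R0, 0, SLJIT_S0, 0, SLJIT_S1, 0);

    /* Emits the function return with SLJIT_R0 as the result. */
    sljit_emit_return(compiler, SLJIT_MOV, SLJIT_R0, 0);

    /* On error sljit_generate_code is expected to yield NULL. */
    code = sljit_generate_code(compiler);
    sljit_free_compiler(compiler);
    return code;
}
```

The returned pointer would be cast to add2_func before it is
called. The example needs the SLJIT sources to build.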

----------------------------------------------------------------
  All-in-one building
----------------------------------------------------------------

Instead of using a separate library, the whole SLJIT
compiler infrastructure can be directly included:

    #define SLJIT_CONFIG_STATIC 1
    #include "sljitLir.c"

This approach is useful for single file compilers.

Advantages:
  - Everything provided by SLJIT is available
    (no need to include anything else).
  - Configuring SLJIT is easy
    (e.g. redefining SLJIT_MALLOC / SLJIT_FREE).
  - The SLJIT compiler API is hidden from the
    world, which improves security.
  - The C compiler can optimize the SLJIT code
    generator (e.g. removing unused functions).

----------------------------------------------------------------
  Types and macros
----------------------------------------------------------------

sljitConfig.h contains the defines which control
the compiler. The beginning of sljitConfigInternal.h
lists architecture specific types and macros provided
by SLJIT. Some of these macros:

  SLJIT_DEBUG : enabled by default
    Enables assertions. Should be disabled in release mode.

  SLJIT_VERBOSE : enabled by default
    When this macro is enabled, the sljit_compiler_verbose
    function can be used to dump SLJIT instructions.
    Otherwise this function is not available. Should be
    disabled in release mode.

  SLJIT_SINGLE_THREADED : disabled by default
    Single threaded programs can define this flag, which
    eliminates the pthread dependency.

  sljit_sw, sljit_uw, etc. :
    It is recommended to use these types instead of long,
    intptr_t, etc. This improves the readability and
    portability of the code.
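
As a configuration sketch, a release-style all-in-one build
of a single threaded program might combine the macros above
as follows. It assumes that sljitConfig.h only supplies a
default when a macro is not already defined, so that values
set before the include take effect; check the header to
confirm this.

```c
/* Release-style settings, defined before SLJIT is included. */
#define SLJIT_CONFIG_STATIC 1
#define SLJIT_DEBUG 0            /* assertions disabled */
#define SLJIT_VERBOSE 0          /* no sljit_compiler_verbose */
#define SLJIT_SINGLE_THREADED 1  /* no pthread dependency */
#include "sljitLir.c"
```

With SLJIT_VERBOSE set to 0, any call to
sljit_compiler_verbose has to be removed or guarded.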