The following text is a brief overview of the key
principles that are useful to know when generating code
with SLJIT. Further details can be found in sljitLir.h.

----------------------------------------------------------------
  What is SLJIT?
----------------------------------------------------------------

SLJIT is a platform-independent assembler which
  - provides access to common CPU features
  - can be easily ported to widespread CPU
    architectures (e.g. x86, ARM, POWER, MIPS, SPARC)

The key challenge of this project is finding a common
subset of CPU features which
  - covers traditional assembly-level programming
  - can be translated to machine code efficiently

This aim is achieved by selecting those instructions / CPU
features which are either available on all platforms or
can be simulated with a low performance overhead.

For example, some SLJIT instructions support base register
pre-update when the [base+offs] memory addressing mode is
used. Although this feature is only available on ARM and
POWER CPUs, the simulation overhead is low on other CPUs.

----------------------------------------------------------------
  The generic CPU model of SLJIT
----------------------------------------------------------------

The CPU has
  - integer registers, which can store either an
    int32_t (4 byte) or intptr_t (4 or 8 byte) value
  - floating point registers, which can store either a
    single (4 byte) or double (8 byte) precision value
  - boolean status flags

*** Integer registers:

The most important rule is: when a source operand of
an instruction is a register, the data type of the
register must match the data type expected by the
instruction.

For example, the following code snippet
is a valid instruction sequence:

    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An int32_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // The int32_t value in SLJIT_R0 is negated
    // and the type of the result is still int32_t

The next code snippet is not allowed:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // SLJIT_INEG expects an int32_t source, so its
    // result is undefined. Even a crash is possible
    // (e.g. on MIPS-64).

However, it is always allowed to overwrite a
register regardless of its previous value:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R2), 0);
    // From now on SLJIT_R0 contains an int32_t
    // value. The previous value is discarded.

Type conversion instructions are provided to convert an
int32_t value to an intptr_t value and vice versa. On
certain architectures these conversions are nops (no
instructions are emitted).
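
The effect of these conversions can be modeled in plain C.
The sketch below is not SLJIT code: widen_int32,
narrow_to_int32 and conversion_demo are hypothetical names
used only for illustration. Widening sign-extends the value
when intptr_t is 8 bytes wide. When intptr_t is 4 bytes wide
both directions leave the bits unchanged, which is consistent
with the conversions being nops on such targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: int32_t -> intptr_t. The numeric value,
   including its sign, is preserved. */
intptr_t widen_int32(int32_t value)
{
    return (intptr_t)value;
}

/* Hypothetical helper: intptr_t -> int32_t, keeping only the
   low 32 bits of the wider value. */
int32_t narrow_to_int32(intptr_t value)
{
    uint32_t low = (uint32_t)(uintptr_t)value;
    int32_t result;

    /* Copy the bit pattern instead of casting, because casting an
       out-of-range unsigned value is implementation-defined. */
    memcpy(&result, &low, sizeof(result));
    return result;
}

void conversion_demo(void)
{
    /* Widening keeps the numeric value. */
    assert(widen_int32(-5) == (intptr_t)-5);
    /* Narrowing a widened value gives the original back. */
    assert(narrow_to_int32(widen_int32(INT32_MIN)) == INT32_MIN);
}
```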

Memory accessing:

Register arguments of the SLJIT_MEM1 / SLJIT_MEM2 addressing
modes must contain intptr_t data.

Signed / unsigned values:

Most operations are executed in the same way regardless
of whether the value is signed or unsigned. These operations
have only one instruction form (e.g. SLJIT_ADD / SLJIT_MUL).
Instructions where the result depends on the sign have
two forms (e.g. integer division, long multiply).
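
This distinction can be checked in plain C. The sketch below
is independent of SLJIT, and its function names are made up
for the example. Addition yields the same bit pattern whether
the operands are read as signed or unsigned, while division
does not, which is why only the latter needs two forms.

```c
#include <assert.h>
#include <stdint.h>

/* Addition on the raw 32 bit patterns (wraps around on overflow). */
uint32_t add_bits(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Division that treats both bit patterns as unsigned values. */
uint32_t div_unsigned(uint32_t a, uint32_t b)
{
    return a / b;
}

/* Division that treats both operands as signed values. */
int32_t div_signed(int32_t a, int32_t b)
{
    return a / b;
}

void sign_demo(void)
{
    /* 0xFFFFFFF8 is -8 as int32_t and 4294967288 as uint32_t. */
    uint32_t bits = 0xFFFFFFF8u;

    /* -8 + 3 == -5, and -5 has the bit pattern 0xFFFFFFFB. */
    assert(add_bits(bits, 3u) == 0xFFFFFFFBu);
    /* Signed view: -8 / 2 == -4. */
    assert(div_signed(-8, 2) == -4);
    /* Unsigned view: 4294967288 / 2 == 2147483644 (0x7FFFFFFC). */
    assert(div_unsigned(bits, 2u) == 0x7FFFFFFCu);
}
```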

*** Floating point registers:

Floating point registers can contain either a single
or a double precision value. Similar to integer registers,
the data type of the value stored in a source register
must match the data type expected by the instruction.
Otherwise the result is undefined (even a crash is possible).

Rounding:

Similar to standard C, floating point computation
results are rounded toward zero.
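
In standard C, rounding toward zero is the rule for converting
a floating point value to an integer type. Assuming this is
the behavior the comparison refers to (the text above does not
say so explicitly), a minimal plain C illustration looks like
this. The names truncate_to_int and rounding_demo are invented
for the example.

```c
#include <assert.h>

/* Converts a double to int. C discards the fractional part,
   so the result moves toward zero for both signs. */
int truncate_to_int(double value)
{
    return (int)value;
}

void rounding_demo(void)
{
    assert(truncate_to_int(2.7) == 2);
    /* Toward zero, not toward negative infinity (-3). */
    assert(truncate_to_int(-2.7) == -2);
}
```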

*** Boolean status flags:

Conditional branches usually depend on the value
of CPU status flags. These status flags are boolean
values and can be set by certain instructions.

To achieve maximum efficiency and portability, the
following rules were introduced:
  - Most instructions can freely modify these status
    flags unless SLJIT_KEEP_FLAGS is passed.
  - The SLJIT_KEEP_FLAGS option may have a performance
    overhead, so it should only be used when necessary.
  - The SLJIT_SET_E, SLJIT_SET_U, etc. options can
    force an instruction to correctly set the
    specified status flags. However, all other
    status flags are undefined. This rule must
    always be kept in mind!
  - Status flags cannot be controlled directly
    (there are no set/clear/invert operations).

The last two rules allow efficient mapping of status flags.
For example, the arithmetic and multiply overflow flags are
mapped to the same overflow flag bit on x86. This is allowed,
since no instruction can set both of these flags. When
either of them is set by an instruction, the other can
have any value (this satisfies the "all other flags are
undefined" rule). Therefore mapping two SLJIT flags to the
same CPU flag is possible. Even though SLJIT supports
a dozen status flags, they can be efficiently mapped
to CPUs with only 4 status flags (e.g. ARM or SPARC).
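
The sharing described above can be modeled with a small lookup
table. The sketch below is a toy model in plain C and not how
SLJIT is implemented. All enum names, the bit layout and the
functions are hypothetical and exist only to show two abstract
flags resolving to one physical bit.

```c
#include <assert.h>

/* Hypothetical abstract flags, loosely modeled on the text. */
enum abstract_flag {
    FLAG_EQUAL,
    FLAG_CARRY,
    FLAG_OVERFLOW,      /* arithmetic overflow */
    FLAG_MUL_OVERFLOW,  /* multiply overflow */
    FLAG_COUNT
};

/* Hypothetical physical flag bits of a CPU with few flags. */
enum cpu_bit { CPU_Z = 1 << 0, CPU_C = 1 << 1, CPU_V = 1 << 2 };

/* Both overflow flags share CPU_V: an instruction sets at most
   one of them, and the other is undefined afterwards anyway. */
static const int flag_to_cpu_bit[FLAG_COUNT] = {
    CPU_Z, CPU_C, CPU_V, CPU_V
};

/* Reads an abstract flag from a physical status word. */
int test_flag(unsigned status_word, enum abstract_flag flag)
{
    return (status_word & (unsigned)flag_to_cpu_bit[flag]) != 0;
}

void flag_demo(void)
{
    unsigned status = CPU_V;  /* as if a multiply had overflowed */

    assert(test_flag(status, FLAG_MUL_OVERFLOW) == 1);
    assert(test_flag(status, FLAG_EQUAL) == 0);
}
```

In this model, reading FLAG_OVERFLOW after a multiply would
also return 1. That is acceptable only because the rules
declare every flag that was not explicitly requested to be
undefined.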

----------------------------------------------------------------
  Complex instructions
----------------------------------------------------------------

We noticed that introducing complex instructions for common
tasks can improve performance. For example, compare and
branch instruction sequences can be optimized if certain
conditions apply, but these conditions depend on the target
CPU. SLJIT can do these optimizations, but it needs to
understand the "purpose" of the generated code. Static
instruction analysis has a large performance overhead,
however, so we chose another approach: we introduced
complex instruction forms for certain non-atomic tasks.
SLJIT can optimize these "instructions" more efficiently
since the "purpose" is known to the compiler. These complex
instruction forms can often be assembled from other SLJIT
instructions, but we recommend using them since the
compiler can optimize them on certain CPUs.

----------------------------------------------------------------
  Generating functions
----------------------------------------------------------------

SLJIT is often used for generating function bodies which are
called from C. SLJIT provides two complex instructions for
generating function entry and return: sljit_emit_enter and
sljit_emit_return. The sljit_emit_enter function also
initializes the "compiling context", which specifies the
current register mapping, local space size and other
settings. The sljit_set_context function can also set this
context without emitting any machine instructions.

This context is important since it affects the compiler, so
the first instruction after a compiler is created must be
either sljit_emit_enter or sljit_set_context. The context can
be changed by calling sljit_emit_enter or sljit_set_context
again.

----------------------------------------------------------------
  All-in-one building
----------------------------------------------------------------

Instead of using a separate library, the whole SLJIT
compiler infrastructure can be directly included:

#define SLJIT_CONFIG_STATIC 1
#include "sljitLir.c"

This approach is useful for single file compilers.

Advantages:
  - Everything provided by SLJIT is available
    (no need to include anything else).
  - Configuring SLJIT is easy
    (e.g. redefining SLJIT_MALLOC / SLJIT_FREE).
  - The SLJIT compiler API is hidden from the
    world, which improves security.
  - The C compiler can optimize the SLJIT code
    generator (e.g. removing unused functions).

----------------------------------------------------------------
  Types and macros
----------------------------------------------------------------

The sljitConfig.h file contains the defines which control
the compiler. The beginning of sljitConfigInternal.h
lists architecture specific types and macros provided
by SLJIT. Some of these macros:

SLJIT_DEBUG : enabled by default
  Enables assertions. Should be disabled in release mode.

SLJIT_VERBOSE : enabled by default
  When this macro is enabled, the sljit_compiler_verbose
  function can be used to dump SLJIT instructions.
  Otherwise this function is not available. Should be
  disabled in release mode.

SLJIT_SINGLE_THREADED : disabled by default
  Single threaded programs can define this flag, which
  eliminates the pthread dependency.

sljit_sw, sljit_uw, etc. :
  It is recommended to use these types instead of long,
  intptr_t, etc. This improves the readability and
  portability of the code.
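
Combined with the all-in-one approach, a release configuration
of a single threaded program might look like the fragment
below. It assumes that each macro is switched off with 0 and
on with 1, following the SLJIT_CONFIG_STATIC 1 convention, and
that the definitions must precede the include. Both points are
assumptions, so check sljitConfig.h before relying on them.

```c
/* Assumed release setup; verify the values in sljitConfig.h. */
#define SLJIT_CONFIG_STATIC 1
#define SLJIT_DEBUG 0            /* no assertions */
#define SLJIT_VERBOSE 0          /* no sljit_compiler_verbose */
#define SLJIT_SINGLE_THREADED 1  /* no pthread dependency */
#include "sljitLir.c"
```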