      1 
      2 <html>
      3 
      4 <head>
      5   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      6   <title>SLJIT tutorial</title>
      7 
      8   <style type="text/css">
      9     body {
     10       background-color: #707070;
     11       color: #000000;
     12       font-family: "garamond"
     13     }
     14     td.main {
     15       background-color: #ffffff;
     16       color: #000000;
     17       font-family: "garamond"
     18     }
     19   </style>
     20 </head>
     21 
     22 <body>
     23 
     24 <center>
     25 <table width="760" cellspacing=0 cellpadding=0>
     26 <tr height=20><td width=20 class="main"></td><td width=720 class="main"></td><td width=20 class="main"></td></tr>
     27 <tr><td width=20 class="main"></td><td width=720 class="main">
     28 
     29 <center>
     30 <a href="http://sourceforge.net"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=248047&type=2" width="125" height="37" border="0" alt="SourceForge.net Logo" /></a>
     31 </center>
     32 <h1><center>SLJIT tutorial</center></h1>
     33 
<h2>Before you start</h2>
     35 
     36 <a href="">Download the tutorial sources</a><br>
     37 <br>
SLJIT is a lightweight, platform-independent JIT compiler that is easy to
embed in your own project. Because it is 'stack-less', SLJIT places
some limits on register usage.<br>
<br>
Here are some other JIT compilers I have looked into recently, listed in case you are interested:<br>
     43 
     44 <ul>
  <b>Libjit/liblightning:</b> - the backend of GNU.net<br>
  <b>Libgccjit:</b> - introduced in GCC 5. It differs from the other JIT libraries:
                    using it feels like constructing C code, and it uses the GCC backend.<br>
  <b>AsmJIT:</b> - branched from the famous V8 project (the JavaScript engine in Chrome),
                   supports only x86/x86_64.<br>
  <b>DynASM:</b> - used in LuaJIT.<br>
     51 </ul>
     52 
     53 <br>
AsmJIT and DynASM work at the instruction level, so using them looks like writing assembly.
SLJIT also looks like assembly, but it hides the details of the specific CPU, which makes it
more general and portable. Libjit works at a higher level, and with libgccjit, as mentioned,
you are really constructing C code.<br>
     58 
     59 <h2>First program</h2>
     60 
     61 Usage of SLJIT:
     62 <ul>
1. #include "sljitLir.h" at the top of your C/C++ program<br>
2. Compile your program together with sljit_src/sljitLir.c<br>
     65 </ul>
     66 
All examples can be compiled like this:
     68 <ul>
     69 gcc -Wall -Ipath/to/sljit_src -DSLJIT_CONFIG_AUTO=1 \<br>
     70   <ul><b>xxx.c</b> path/to/sljit_src/sljitLir.c -o program</ul>
     71 </ul>
     72 
OK, let's take a look at the first program. In it we create a function that
returns the sum of its 3 arguments.<br>
     75 <br>
     76 <div style='font-family:Courier New;font-size:11px'>
     77 <ul>
     78 #include "sljitLir.h"<br>
     79  <br>
     80 #include &lt;stdio.h&gt;<br>
     81 #include &lt;stdlib.h&gt;<br>
     82  <br>
     83 typedef sljit_sw (*func3_t)(sljit_sw a, sljit_sw b, sljit_sw c);<br>
     84  <br>
     85 static int add3(sljit_sw a, sljit_sw b, sljit_sw c)<br>
     86 {<br>
     87    <ul>
     88     void *code;<br>
    sljit_uw len;<br>
     90     func3_t func;<br>
     91    <br>
     92     /* Create a SLJIT compiler */<br>
     93     struct sljit_compiler *C = sljit_create_compiler();<br>
     94    <br>
    /* Start a context (function entry) with 3 arguments; the parameters are discussed later */<br>
    sljit_emit_enter(C, 0,  3,  1, 3, 0, 0, 0);<br>
   <br>
    /* The first function argument is in register SLJIT_S0, the second in SLJIT_S1, etc. */<br>
     99     /* R0 = first */<br>
    100     sljit_emit_op1(C, SLJIT_MOV, SLJIT_R0, 0, SLJIT_S0, 0);<br>
    101    <br>
    102     /* R0 = R0 + second */<br>
    103     sljit_emit_op2(C, SLJIT_ADD, SLJIT_R0, 0, SLJIT_R0, 0, SLJIT_S1, 0);<br>
    104    <br>
    105     /* R0 = R0 + third */<br>
    106     sljit_emit_op2(C, SLJIT_ADD, SLJIT_R0, 0, SLJIT_R0, 0, SLJIT_S2, 0);<br>
    107    <br>
    /* This statement moves R0 to the return register and returns; */<br>
    /* in fact, R0 is the return register itself */<br>
    110     sljit_emit_return(C, SLJIT_MOV, SLJIT_R0, 0);<br>
    111    <br>
    112     /* Generate machine code */<br>
    113     code = sljit_generate_code(C);<br>
    114     len = sljit_get_generated_code_size(C);<br>
    115    <br>
    116     /* Execute code */<br>
    117     func = (func3_t)code;<br>
    printf("func return %ld\n", (long)func(a, b, c));<br>
    119    <br>
    120     /* dump_code(code, len); */<br>
    121    <br>
    122     /* Clean up */<br>
    123     sljit_free_compiler(C);<br>
    124     sljit_free_code(code);<br>
    125     return 0;<br>
    126    </ul>
    127 }<br>
    128  <br>
    129 int main()<br>
    130 {<br>
    131    <ul>
    132     return add3(4, 5, 6);<br>
    133    </ul>
    134 }<br>
    135 </ul>
    136 </div>
    137 
    138 <br>
The function sljit_emit_enter creates a context: it saves some registers to the stack
and sets up a call frame. sljit_emit_return restores the saved registers and cleans up
the frame. SLJIT is designed to be embedded into other applications, so the code it generates
has to follow some basic rules.<br>
<br>
These rules are called the Application Binary Interface, or ABI for short. Here is the
document for x86_64 CPUs (<a href="http://www.x86-64.org/documentation/abi.pdf">ABI.pdf</a>);
almost all Linux/Unix systems follow this standard. MS Windows has its own; read this for more:
<a href="http://en.wikipedia.org/wiki/X86_calling_conventions">X86_calling_conventions</a><br>
    148 <br>
When I first read the documentation of sljit_emit_enter, the parameters 'saveds' and 'scratches'
confused me. The fact is that CPU registers have different roles in the ABI spec:
some are used to pass arguments, some are 'callee-saved', and some are
'temporaries'. Take x86_64 for example: RAX, R10 and R11 are temporaries, which means
they may be changed by a call instruction. RBX and R12-R15 are callee-saved, so they
keep their values across a call. The rule is that every function must save
those registers before using them.<br>
    156 <br>
Fortunately, SLJIT does most of this for us: SLJIT_S[0-9] represent the 'safe' (saved)
registers, while SLJIT_R[0-9] are only temporaries.<br>
<br>
When a function starts, SLJIT moves the function arguments to the S0, S1 and S2 registers, so
function arguments are always 'safe' within the context. Because of the limits on using the
stack for storing arguments, SLJIT supports at most 3 arguments.<br>
<br>
sljit_emit_opX is easy to understand. In SLJIT a data value is represented by 2
parameters; it can be a register, in-memory data, or an immediate number.<br>
    166 <br>
    167 
    168 <table align="center" cellspacing="0">
    169 <tr><td>First parameter</td> 	<td>Second parameter</td>	<td>Meaning</td></tr>
    170 <tr><td>SLJIT_R*, SLJIT_S*</td>	<td>0</td>			<td>Temp/saved registers</td></tr>
    171 <tr><td>SLJIT_IMM</td>			<td>Number</td>		<td>Immediate number</td></tr>
<tr><td>SLJIT_MEM</td>			<td>Address</td>	<td>In-memory data at an absolute address</td></tr>
<tr><td>SLJIT_MEM1(r)</td>		<td>Offset</td>		<td>In-memory data at [r + offset]</td></tr>
<tr><td>SLJIT_MEM2(r1, r2)</td>	<td>Shift (size)</td>		<td>In-memory array, r1 as base address, r2 as index, <br>
								shift as element size (0 for bytes, 1 for shorts, 2 for <br>
								4 bytes, 3 for 8 bytes)</td></tr>
    177 </table>
    178 
    179 <h2>Branch</h2>
    180 <div style='font-family:Courier New;font-size:11px'>
    181 <ul>
    182 #include "sljitLir.h"<br>
    183  <br>
    184 #include &lt;stdio.h&gt;<br>
    185 #include &lt;stdlib.h&gt;<br>
    186  <br>
    187 typedef sljit_sw (*func3_t)(sljit_sw a, sljit_sw b, sljit_sw c);<br>
    188  <br>
    189 /*<br>
 In this example, we generate a function like this:<br>
    191  <br>
    192 sljit_sw func(sljit_sw a, sljit_sw b, sljit_sw c)<br>
    193 {<br>
    194     <ul>
    195     if ((a & 1) == 0)<br>
    196     <ul>
    197         return c;<br>
    198     </ul>
    199     return b;<br>
    200 </ul>
    201 }<br>
    202  <br>
    203  */<br>
    204 static int branch(sljit_sw a, sljit_sw b, sljit_sw c)<br>
    205 {<br>
    206    <ul>
    207     void *code;<br>
    208     sljit_uw len;<br>
    209     func3_t func;<br>
    210    <br>
    211     struct sljit_jump *ret_c;<br>
    212     struct sljit_jump *out;<br>
    213    <br>
    214     /* Create a SLJIT compiler */<br>
    215     struct sljit_compiler *C = sljit_create_compiler();<br>
    216    <br>
    217     /* 3 arg, 1 temp reg, 3 save reg */<br>
    218     sljit_emit_enter(C, 0,  3,  1, 3, 0, 0, 0);<br>
    219    <br>
    220     /* R0 = a & 1, S0 is argument a */<br>
    221     sljit_emit_op2(C, SLJIT_AND, SLJIT_R0, 0, SLJIT_S0, 0, SLJIT_IMM, 1);<br>
    222    <br>
    /* if R0 == 0 then jump to ret_c; where is ret_c? we assign it later */<br>
    224     ret_c = sljit_emit_cmp(C, SLJIT_EQUAL, SLJIT_R0, 0, SLJIT_IMM, 0);<br>
    225    <br>
    226     /* R0 = b, S1 is argument b */<br>
    227     sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_S1, 0);<br>
    228    <br>
    229     /* jump to out */<br>
    230     out = sljit_emit_jump(C, SLJIT_JUMP);<br>
    231    <br>
    /* this is where 'ret_c' should jump to: we emit a label and set it as the target of ret_c */<br>
    233     sljit_set_label(ret_c, sljit_emit_label(C));<br>
    234    <br>
    235     /* R0 = c, S2 is argument c */<br>
    236     sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_S2, 0);<br>
    237    <br>
    /* this is where 'out' should jump to */<br>
    239     sljit_set_label(out, sljit_emit_label(C));<br>
    240    <br>
    241     /* end of function */<br>
    242     sljit_emit_return(C, SLJIT_MOV, SLJIT_RETURN_REG, 0);<br>
    243    <br>
    244     /* Generate machine code */<br>
    245     code = sljit_generate_code(C);<br>
    246     len = sljit_get_generated_code_size(C);<br>
    247    <br>
    248     /* Execute code */<br>
    249     func = (func3_t)code;<br>
    printf("func return %ld\n", (long)func(a, b, c));<br>
    251    <br>
    252     /* dump_code(code, len); */<br>
    253    <br>
    254     /* Clean up */<br>
    255     sljit_free_compiler(C);<br>
    256     sljit_free_code(code);<br>
    257     return 0;<br>
    258 </ul>
    259 }<br>
    260  <br>
    261 int main()<br>
    262 {<br>
    263 <ul>
    264     return branch(4, 5, 6);<br>
    265 </ul>
    266 }<br>
    267 </ul>
    268 </div>
    269 
The keys to implementing a branch are 'struct sljit_jump' and 'struct sljit_label'.
A 'jump' holds a jump instruction, but it does not know where to jump until
you set a label on it. A 'label' is a code address, just like a label in assembly
language.<br>
<br>
sljit_emit_cmp and sljit_emit_jump generate a conditional and an unconditional jump, respectively.
Take this statement for example:<br>
<ul>
ret_c = sljit_emit_cmp(C, SLJIT_EQUAL, SLJIT_R0, 0, SLJIT_IMM, 0);<br>
</ul>
It creates a jump instruction whose condition is that R0 equals 0; the
jump target is assigned later by the sljit_set_label call.<br>
<br>
The example as a whole creates a branch like this:<br>
    284 <ul>
    285     <ul>
    286     R0 = a & 1;<br>
    287     if R0 == 0 then goto ret_c;<br>
    288     R0 = b;<br>
    289     goto out;<br>
    290     </ul>
    291 ret_c:<br>
    292     <ul>
    293     R0 = c;<br>
    294     </ul>
    295 out:<br>
    296     <ul>
    297     return R0;<br>
    298     </ul>
    299 </ul>
    300 <br>
This is how a high-level-language compiler handles branches.<br>
    302 <br>
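As a rough cross-check, the same control flow can be written directly in C with goto. This is only a sketch: branch_sketch and word_t are hypothetical names, and intptr_t stands in for sljit_sw so the snippet builds without SLJIT.<br>

```c
#include <stdint.h>

/* Stand-in for sljit_sw (assumption: a pointer-sized signed integer). */
typedef intptr_t word_t;

/* Same shape as the generated code: one conditional jump, one
   unconditional jump, and two labels. */
static word_t branch_sketch(word_t a, word_t b, word_t c)
{
    word_t r0;

    r0 = a & 1;
    if (r0 == 0)
        goto ret_c;   /* the jump created by sljit_emit_cmp */
    r0 = b;
    goto out;         /* the jump created by sljit_emit_jump */
ret_c:
    r0 = c;
out:
    return r0;
}
```

With the tutorial's inputs (4, 5, 6), a is even, so the function takes the ret_c path and returns c.<br>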
    303 
    304 <h2>Loop</h2>
    305 
The loop example is similar to the branch example.
    307 
    308 <div style='font-family:Courier New;font-size:11px'>
    309 <ul>
/*<br>
 In this example, we generate a function like this:<br>
    312  <br>
    313 sljit_sw func(sljit_sw a, sljit_sw b)<br>
    314 {<br>
    315 <ul>
    316     sljit_sw i;<br>
    317     sljit_sw ret = 0;<br>
    318     for (i = 0; i &lt; a; ++i) {<br>
    319     <ul>
    320         ret += b;<br>
    321     </ul>
    322     }<br>
    323     return ret;<br>
    324 </ul>
    325 }<br>
    326 */<br>
    327 <br>
    328 <ul>
    329     /* 2 arg, 2 temp reg, 2 saved reg */<br>
    330     sljit_emit_enter(C, 0, 2, 2, 2, 0, 0, 0);<br>
    331     <br>
    /* R1 = 0 */<br>
    333     sljit_emit_op2(C, SLJIT_XOR, SLJIT_R1, 0, SLJIT_R1, 0, SLJIT_R1, 0);<br>
    334     /* RET = 0 */<br>
    335     sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_IMM, 0);<br>
    336     /* loopstart: */<br>
    337     loopstart = sljit_emit_label(C);<br>
    /* R1 &gt;= a --&gt; jump out (signed compare, matching the C code above) */<br>
    out = sljit_emit_cmp(C, SLJIT_SIG_GREATER_EQUAL, SLJIT_R1, 0, SLJIT_S0, 0);<br>
    340     /* RET += b */<br>
    341     sljit_emit_op2(C, SLJIT_ADD, SLJIT_RETURN_REG, 0, SLJIT_RETURN_REG, 0, SLJIT_S1, 0);<br>
    342     /* R1 += 1 */<br>
    343     sljit_emit_op2(C, SLJIT_ADD, SLJIT_R1, 0, SLJIT_R1, 0, SLJIT_IMM, 1);<br>
    344     /* jump loopstart */<br>
    345     sljit_set_label(sljit_emit_jump(C, SLJIT_JUMP), loopstart);<br>
    346     /* out: */<br>
    347     sljit_set_label(out, sljit_emit_label(C));<br>
    348     <br>
    349     /* return RET */<br>
    350     sljit_emit_return(C, SLJIT_MOV, SLJIT_RETURN_REG, 0);<br>
    351 </ul>
    352 </ul>
    353 </div>
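For comparison, here is a sketch of the same loop in plain C, written with labels and goto so that it lines up with the emitted instructions one by one. loop_sketch and word_t are hypothetical names, and intptr_t stands in for sljit_sw; the comparison is signed, as in the C function shown in the comment.<br>

```c
#include <stdint.h>

/* Stand-in for sljit_sw (assumption: a pointer-sized signed integer). */
typedef intptr_t word_t;

static word_t loop_sketch(word_t a, word_t b)
{
    word_t i = 0;     /* plays the role of R1 */
    word_t ret = 0;   /* plays the role of SLJIT_RETURN_REG */

loopstart:
    if (i >= a)
        goto out;     /* conditional jump whose label is set later */
    ret += b;
    i += 1;
    goto loopstart;   /* unconditional jump back to an existing label */
out:
    return ret;
}
```

Note that the backward jump targets a label that already exists, while the forward jump to 'out' gets its label only after the loop body has been emitted.<br>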
    354 
After this example, you are ready to construct any program that contains complex branches
and loops.<br>
<br>
Here is an interesting fact: 'xor reg, reg' is better than 'mov reg, 0', because it saves
2 bytes on x86 machines.<br>
<br>
From here on I will give only the key code; the full source of each
chapter can be found in the attachment.<br>
    363 
    364 
    365 <h2>Call external function</h2>
    366 
It's easy to call an external function in SLJIT: we use sljit_emit_ijump with a SLJIT_CALL*
operation to do so.<br>
<br>
SLJIT_CALL[N] is used to call a function with N arguments. SLJIT has only SLJIT_CALL0,
CALL1, CALL2 and CALL3, which means you can call a function with at most 3 arguments (that
disappointed me: no chance to call fwrite from SLJIT). The arguments for the callee
are passed in SLJIT_R0, R1 and R2. Keep in mind that these are 'temp registers', so you have
to preserve their values yourself if you need them after the call.<br>
    374 <br>
    375 Assume that we have an external function:<br>
    376 <ul>
    377     sljit_sw print_num(sljit_sw a);
    378 </ul>
    379 
    380 JIT code to call print_num(S1):
    381 
    382 <div style='font-family:Courier New;font-size:11px'>
    383 <ul>
    384     /* R0 = S1; */<br>
    385     sljit_emit_op1(C, SLJIT_MOV, SLJIT_R0, 0, SLJIT_S1, 0);<br>
    386     /* print_num(R0) */<br>
    387     sljit_emit_ijump(C, SLJIT_CALL1, SLJIT_IMM, SLJIT_FUNC_OFFSET(print_num));<br>
    388 </ul>
    389 </div>
    390 <br>
This code calls an immediate value (the address of print_num), which is resolved properly when the
program is loaded. There is no problem if you compile and execute in one run, but if you plan
to save the generated code to a file and load and execute it later, that address may not be what
you expect: on platforms that support PIC, print_num may be relocated to another
address at run time. See:
    396 <a href="http://en.wikipedia.org/wiki/Position-independent_code">PIC</a><br>
    397 <br>
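The idea of 'calling an immediate' can be sketched in plain C: a function's address is turned into an integer and later called through that integer. This is an illustration only, not SLJIT code; add_one, func_to_imm and call_imm are hypothetical names, and the conversion between function pointers and integers is assumed to work as it does on common platforms.<br>

```c
#include <stdint.h>

/* Stand-in for sljit_sw (assumption: a pointer-sized signed integer). */
typedef intptr_t word_t;
typedef word_t (*func1_t)(word_t);

/* Hypothetical callee, standing in for print_num. */
static word_t add_one(word_t a)
{
    return a + 1;
}

/* Turn a function address into a plain integer, the kind of value
   that would be baked into the generated code as an immediate. */
static word_t func_to_imm(func1_t f)
{
    return (word_t)(uintptr_t)f;
}

/* Call through the integer, as the generated call instruction would. */
static word_t call_imm(word_t imm, word_t arg)
{
    func1_t f = (func1_t)(uintptr_t)imm;
    return f(arg);
}
```

The integer is only meaningful inside the process that produced it, which is why writing it to a file and reusing it in a later run is fragile.<br>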
    398 
    399 <h2>Structure access</h2>
    400 
SLJIT uses SLJIT_MEM1 to implement [Reg + offset] memory access.<br>
    402 <div style='font-family:Courier New;font-size:11px'>
    403 <ul>
    404 struct point_st {<br>
    405     <ul>
    406     sljit_sw x;<br>
    407     int y;<br>
    408     short z;<br>
    409     char d;<br>
    410     char e;<br>
    411     </ul>
    412 };<br>
    413 <br>
    414 sljit_emit_op1(C, SLJIT_MOV_SI, SLJIT_R0, 0, SLJIT_MEM1(SLJIT_S0),<br>
    415 <ul>
    416 SLJIT_OFFSETOF(struct point_st, y));<br>
    417 </ul>
    418 </ul>
    419 </div>
    420 
In this case, SLJIT_S0 holds the address of the point_st structure, and the offset of member 'y'
is determined at compile time. Note that for sized accesses the MOV operation comes with a
'signedness/size' postfix; here _SI means 'signed 32-bit integer'. The list of
postfixes:<br>
    425 <ul>
    426    <b>UB</b> = unsigned byte (8 bit)<br>
    427    <b>SB</b> = signed byte (8 bit)<br>
    428    <b>UH</b> = unsigned half (16 bit)<br>
    429    <b>SH</b> = signed half (16 bit)<br>
    430    <b>UI</b> = unsigned int (32 bit)<br>
    431    <b>SI</b> = signed int (32 bit)<br>
    432    <b>P</b>  = pointer (sljit_p) size<br>
    433 </ul>
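What a signed 32-bit load from [base + offset] amounts to can be sketched in plain C. This is an illustration under assumptions, not SLJIT code: load_int_at and word_t are hypothetical names, intptr_t stands in for sljit_sw, and int is assumed to be 32 bits wide.<br>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for sljit_sw (assumption: a pointer-sized signed integer). */
typedef intptr_t word_t;

struct point_st {
    word_t x;
    int y;
    short z;
    char d;
    char e;
};

/* Read a signed int stored at [base + offset] and widen it to a word.
   The widening is the sign extension a signed 32-bit MOV implies. */
static word_t load_int_at(const void *base, size_t offset)
{
    int value;

    memcpy(&value, (const char *)base + offset, sizeof value);
    return (word_t)value;
}
```

A call such as load_int_at(&amp;p, offsetof(struct point_st, y)) reads member y of p; a negative y stays negative after widening.<br>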
    434 
    435 <h2>Array accessing</h2>
    436 
SLJIT uses SLJIT_MEM2 to access arrays, like this:<br>
    438 
    439 <div style='font-family:Courier New;font-size:11px'>
    440 <ul>
    441 sljit_emit_op1(C, SLJIT_MOV, SLJIT_R0, 0, SLJIT_MEM2(SLJIT_S0, SLJIT_S2),<br>
    442 <ul>
    443 SLJIT_WORD_SHIFT);
    444 </ul>
    445 </ul>
    446 </div>
    447 
This statement generates code like this:<br>
    449 <ul>
    450 WORD S0[];<br>
    451 R0 = S0[S2]<br>
    452 </ul>
    453 <br>
The array S0 is declared as WORD, so each element is sizeof(sljit_sw) bytes long.
SLJIT uses a 'shift' to represent the element size (0 for a single byte, 1 for 2
bytes, 2 for 4 bytes, 3 for 8 bytes).<br>
<br>
The file array_access.c demonstrates an array-printing example, which should be easy
to understand.<br>
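The address arithmetic behind the shift can be sketched in plain C: the element address is base + (index &lt;&lt; shift). This is an illustration only; load_indexed, word_t and WORD_SHIFT_SKETCH are hypothetical names, and intptr_t stands in for sljit_sw.<br>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for sljit_sw (assumption: a pointer-sized signed integer). */
typedef intptr_t word_t;

/* log2 of the word size: 3 when words are 8 bytes, 2 when they are 4 bytes. */
#define WORD_SHIFT_SKETCH (sizeof(word_t) == 8 ? 3 : 2)

/* Load one word from base + (index << shift). */
static word_t load_indexed(const void *base, size_t index, unsigned shift)
{
    word_t value;

    memcpy(&value, (const char *)base + (index << shift), sizeof value);
    return value;
}
```

With a word-sized shift, load_indexed(arr, 2, WORD_SHIFT_SKETCH) reads the same element as arr[2].<br>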
    460 
    461 <h2>Local variables</h2>
    462 
SLJIT provides SLJIT_MEM1(SLJIT_SP) to access the space reserved by
sljit_emit_enter's last parameter.<br>
In this example we have to pass the address of an array to print_arr, so a local variable
is the only choice.<br>
    467 
    468 <div style='font-family:Courier New;font-size:11px'>
    469 <ul>
    /* reserve space on the stack for sljit_sw arr[3] */<br>
    sljit_emit_enter(C, 0,  3,  2, 3, 0, 0, 3 * sizeof(sljit_sw));<br>
    /*                  opt arg R  S  FR FS local_size */<br>
   <br>
    /* arr[0] = S0; SLJIT_SP is the start address of the local variables */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), 0, SLJIT_S0, 0);<br>
    /* arr[1] = S1 */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), 1 * sizeof(sljit_sw), SLJIT_S1, 0);<br>
    /* arr[2] = S2 */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), 2 * sizeof(sljit_sw), SLJIT_S2, 0);<br>
   <br>
    /* R0 = arr; SLJIT_SP is in fact the address of arr, but SLJIT cannot move it directly */<br>
    482     sljit_get_local_base(C, SLJIT_R0, 0, 0);   /* get the address of local variables */<br>
    483     sljit_emit_op1(C, SLJIT_MOV, SLJIT_R1, 0, SLJIT_IMM, 3);   /* R1 = 3; */<br>
    484     sljit_emit_ijump(C, SLJIT_CALL2, SLJIT_IMM, SLJIT_FUNC_OFFSET(print_arr));<br>
    485     sljit_emit_return(C, SLJIT_MOV, SLJIT_R0, 0);<br>
    486 </ul>
    487 </div>
    488 <br>
SLJIT_SP can only be used as SLJIT_MEM1(SLJIT_SP). In this case, SP is the
address of 'arr', but we cannot assign it to a register with the SLJIT_MOV operation.
Instead, we use sljit_get_local_base, which loads the address of the local area plus an
offset into the target.<br>
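In plain C, the generated function behaves roughly like the sketch below. The names are hypothetical: sum_arr stands in for print_arr (it returns the sum instead of printing, so the result is easy to check), and intptr_t stands in for sljit_sw.<br>

```c
#include <stdint.h>

/* Stand-in for sljit_sw (assumption: a pointer-sized signed integer). */
typedef intptr_t word_t;

/* Hypothetical stand-in for print_arr: adds the elements up. */
static word_t sum_arr(const word_t *arr, word_t n)
{
    word_t i;
    word_t sum = 0;

    for (i = 0; i < n; ++i)
        sum += arr[i];
    return sum;
}

static word_t locals_sketch(word_t a, word_t b, word_t c)
{
    /* Corresponds to the 3 * sizeof(sljit_sw) bytes of local space. */
    word_t arr[3];

    arr[0] = a;   /* store to [SP + 0] */
    arr[1] = b;   /* store to [SP + 1 * word size] */
    arr[2] = c;   /* store to [SP + 2 * word size] */

    /* First argument: address of the local area; second argument: 3. */
    return sum_arr(arr, 3);
}
```

The callee needs a real address, which is why the array has to live in memory (the local area) rather than in registers.<br>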
    493 
    494 <h2>Brainfuck compiler</h2>
    495 
OK, the basic usage of SLJIT ends here. For more detail, I suggest reading
sljitLir.h directly. Have fun hacking on the wonders of SLJIT!<br>
    498 <br>
    499 The brainfuck machine introduction can be found here:
    500 <a href="http://en.wikipedia.org/wiki/Brainfuck">Brainfuck</a><br>
    501 <br>
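To make the eight Brainfuck commands concrete, here is a minimal interpreter sketch in plain C. It is not the tutorial's JIT compiler, only a reference for the behavior a compiler has to reproduce: pointer moves, cell increments, I/O, and the two loop brackets (which map naturally onto the jumps and labels shown earlier). bf_run, BF_TAPE_SIZE and the buffer-based I/O are assumptions made for this illustration.<br>

```c
#include <stddef.h>
#include <string.h>

#define BF_TAPE_SIZE 30000

/* Runs prog, taking input bytes from in and writing output bytes to out
   (at most out_cap - 1 bytes, NUL-terminated). Returns the number of
   output bytes, or -1 if the brackets are unbalanced. */
static int bf_run(const char *prog, const char *in, char *out, size_t out_cap)
{
    unsigned char tape[BF_TAPE_SIZE];
    size_t len = strlen(prog);
    size_t ptr = 0, pc, n_out = 0;
    int depth;

    memset(tape, 0, sizeof tape);
    for (pc = 0; pc < len; ++pc) {
        switch (prog[pc]) {
        case '>': ptr = (ptr + 1) % BF_TAPE_SIZE; break;
        case '<': ptr = (ptr + BF_TAPE_SIZE - 1) % BF_TAPE_SIZE; break;
        case '+': ++tape[ptr]; break;
        case '-': --tape[ptr]; break;
        case '.':
            if (n_out + 1 < out_cap)
                out[n_out++] = (char)tape[ptr];
            break;
        case ',':
            tape[ptr] = *in ? (unsigned char)*in++ : 0;
            break;
        case '[':
            if (tape[ptr] == 0) {          /* skip forward to the matching ']' */
                for (depth = 1; depth > 0; ) {
                    if (++pc >= len)
                        return -1;
                    if (prog[pc] == '[') ++depth;
                    else if (prog[pc] == ']') --depth;
                }
            }
            break;
        case ']':
            if (tape[ptr] != 0) {          /* jump back to the matching '[' */
                for (depth = 1; depth > 0; ) {
                    if (pc == 0)
                        return -1;
                    --pc;
                    if (prog[pc] == ']') ++depth;
                    else if (prog[pc] == '[') --depth;
                }
            }
            break;
        default:
            break;                         /* any other character is a comment */
        }
    }
    if (out_cap > 0)
        out[n_out] = '\0';
    return (int)n_out;
}
```

For example, the program ++++++++[&gt;++++++++&lt;-]&gt;+. builds 8 * 8 + 1 = 65 in the second cell and outputs it as a single byte.<br>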
    502 
    503 <h2>Extra</h2>
    504 
1. dump_code function<br>
SLJIT does not provide a disassembler; here is a simple function that does the job (x86 only)<br>
    507 <br>
    508 
    509 <div style='font-family:Courier New;font-size:11px'>
    510 <ul>
    511 static void dump_code(void *code, sljit_uw len)<br>
    512 {<br>
    513 <ul>
    514     FILE *fp = fopen("/tmp/slj_dump", "wb");<br>
    515     if (!fp)<br>
    516     <ul>
    517         return;<br>
    518     </ul>
    519     fwrite(code, len, 1, fp);<br>
    520     fclose(fp);<br>
    521 </ul>
    522 #if defined(SLJIT_CONFIG_X86_64)<br>
    523 <ul>
    system("objdump -b binary -m i386:x86-64 -D /tmp/slj_dump");<br>
    525 </ul>
    526 #elif defined(SLJIT_CONFIG_X86_32)<br>
    527 <ul>
    528     system("objdump -b binary -m i386 -D /tmp/slj_dump");<br>
    529 </ul>
    530 #endif<br>
    531 }
    532 </ul>
    533 </div>
    534 
Disassembly of the branch example:<br>
    536  <br>
    537 0000000000000000 &lt;.data&gt;:<br>
    538 <ul>
    539 <table>
    540 <tr><td>0:</td><td>53</td><td>push   %rbx</td></tr>
    541 <tr><td>1:</td><td>41 57</td><td>push   %r15</td></tr>
    542 <tr><td>3:</td><td>41 56</td><td>push   %r14</td></tr>
    543 <tr><td>5:</td><td>48 8b df</td><td>mov    %rdi,%rbx</td></tr>
    544 <tr><td>8:</td><td>4c 8b fe</td><td>mov    %rsi,%r15</td></tr>
    545 <tr><td>b:</td><td>4c 8b f2</td><td>mov    %rdx,%r14</td></tr>
    546 <tr><td>e:</td><td>48 83 ec 10</td><td>sub    $0x10,%rsp</td></tr>
    547 <tr><td>12:</td><td>48 89 d8</td><td>mov    %rbx,%rax</td></tr>
    548 <tr><td>15:</td><td>48 83 e0 01</td><td>and    $0x1,%rax</td></tr>
    549 <tr><td>19:</td><td>48 83 f8 00</td><td>cmp    $0x0,%rax</td></tr>
    550 <tr><td>1d:</td><td>74 05</td><td>je     0x24</td></tr>
    551 <tr><td>1f:</td><td>4c 89 f8</td><td>mov    %r15,%rax</td></tr>
    552 <tr><td>22:</td><td>eb 03</td><td>jmp    0x27</td></tr>
    553 <tr><td>24:</td><td>4c 89 f0</td><td>mov    %r14,%rax</td></tr>
    554 <tr><td>27:</td><td>48 83 c4 10</td><td>add    $0x10,%rsp</td></tr>
    555 <tr><td>2b:</td><td>41 5e</td><td>pop    %r14</td></tr>
    556 <tr><td>2d:</td><td>41 5f</td><td>pop    %r15</td></tr>
    557 <tr><td>2f:</td><td>5b</td><td>pop    %rbx</td></tr>
    558 <tr><td>30:</td><td>c3</td><td>retq</td></tr>
    559 </table>
    560 </ul>
    561 <br>
The same function compiled with GCC -O2:<br>
    563 0000000000000000 &lt;func&gt;:<br>
    564 <ul>
    565 <table>
    566 <tr><td>0:</td><td>48 89 d0</td><td>mov    %rdx,%rax</td></tr>
    567 <tr><td>3:</td><td>83 e7 01</td><td>and    $0x1,%edi</td></tr>
    568 <tr><td>6:</td><td>48 0f 45 c6</td><td>cmovne %rsi,%rax</td></tr>
    569 <tr><td>a:</td><td>c3</td><td>retq</td></tr>
    570 </table>
    571 </ul>
    572 <br>
Err... OK, either the optimization here is weak, or the optimization there is crazy... :-)<br>
    574 
    575 <table width="100%" cellspacing=0 cellpadding=0>
    576 <tr><td align=right>By wenxichang#163.com, 2015.5.10</td></tr></table>
    577 
    578 </td><td width=20 class="main"></td></tr>
    579 <tr height=20><td width=20 class="main"></td><td width=720 class="main"></td><td width=20 class="main"></td></tr>
    580 </table>
    581 </center>
    582 
    583 </body>
    584 </html>
    585