==========================
UndefinedBehaviorSanitizer
==========================

.. contents::
   :local:

Introduction
============

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector.
UBSan modifies the program at compile-time to catch various kinds of undefined
behavior during program execution, for example:

* Using a misaligned or null pointer
* Signed integer overflow
* Conversion to, from, or between floating-point types which would
  overflow the destination

See the full list of available :ref:`checks <ubsan-checks>` below.

UBSan has an optional run-time library which provides better error reporting.
The checks have a small runtime cost and no impact on address space layout or ABI.

How to build
============

Build LLVM/Clang with `CMake <https://llvm.org/docs/CMake.html>`_.

Usage
=====

Use ``clang++`` to compile and link your program with the ``-fsanitize=undefined``
flag. Make sure to use ``clang++`` (not ``ld``) as the linker, so that your
executable is linked with the proper UBSan runtime libraries. You can use ``clang``
instead of ``clang++`` if you're compiling/linking C code.

.. code-block:: console

  % cat test.cc
  int main(int argc, char **argv) {
    int k = 0x7fffffff;
    k += argc;
    return 0;
  }
  % clang++ -fsanitize=undefined test.cc
  % ./a.out
  test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

You can enable only a subset of :ref:`checks <ubsan-checks>` offered by UBSan,
and define the desired behavior for each kind of check:

* ``-fsanitize=...``: print a verbose error report and continue execution (default);
* ``-fno-sanitize-recover=...``: print a verbose error report and exit the program;
* ``-fsanitize-trap=...``: execute a trap instruction (doesn't require UBSan run-time support).

Note that the ``trap`` / ``recover`` options do not enable the corresponding
sanitizer, and in general need to be accompanied by a suitable ``-fsanitize=``
flag.

For example, if you compile/link your program as:

.. code-block:: console

  % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment

the program will continue execution after signed integer overflows, exit after
the first invalid use of a null pointer, and trap after the first use of a
misaligned pointer.
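
As a rough illustration of where each of these three checks would be
instrumented, consider the sketch below. The function names ``increment`` and
``load`` are hypothetical and exist only for this example. The calls in
``main`` are well-defined, so no check fires; the comments describe what the
flag combination above would be expected to do for bad inputs.

.. code-block:: c++

  #include <cstdio>

  // signed-integer-overflow is checked at the addition. With the flags above,
  // calling this with the largest int value would print a report and continue.
  int increment(int k) { return k + 1; }

  // null and alignment are checked at the dereference. With the flags above,
  // a null p would print a report and exit, and a misaligned p would trap.
  int load(const int *p) { return *p; }

  int main() {
    int value = 41;
    // Both calls are well-defined here, so nothing is reported.
    std::printf("%d\n", increment(load(&value)));
    return 0;
  }

Keeping the risky operation in a small function like this makes it easy to see
which source location a report refers to.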

.. _ubsan-checks:

Available checks
================

Available checks are:

  -  ``-fsanitize=alignment``: Use of a misaligned pointer or creation
     of a misaligned reference. Also sanitizes assume_aligned-like attributes.
  -  ``-fsanitize=bool``: Load of a ``bool`` value which is neither
     ``true`` nor ``false``.
  -  ``-fsanitize=builtin``: Passing invalid values to compiler builtins.
  -  ``-fsanitize=bounds``: Out of bounds array indexing, in cases
     where the array bound can be statically determined. The check includes
     ``-fsanitize=array-bounds`` and ``-fsanitize=local-bounds``. Note that
     ``-fsanitize=local-bounds`` is not included in ``-fsanitize=undefined``.
  -  ``-fsanitize=enum``: Load of a value of an enumerated type which
     is not in the range of representable values for that enumerated
     type.
  -  ``-fsanitize=float-cast-overflow``: Conversion to, from, or
     between floating-point types which would overflow the
     destination. Because the range of representable values for all
     floating-point types supported by Clang is [-inf, +inf], the only
     cases detected are conversions from floating point to integer types.
  -  ``-fsanitize=float-divide-by-zero``: Floating point division by
     zero. This is undefined per the C and C++ standards, but is defined
     by Clang (and by ISO/IEC/IEEE 60559 / IEEE 754) as producing either an
     infinity or NaN value, so is not included in ``-fsanitize=undefined``.
  -  ``-fsanitize=function``: Indirect call of a function through a
     function pointer of the wrong type (Darwin/Linux, C++ and x86/x86_64
     only).
  -  ``-fsanitize=implicit-unsigned-integer-truncation``,
     ``-fsanitize=implicit-signed-integer-truncation``: Implicit conversion from
     an integer of larger bit width to smaller bit width, if that results in data
     loss. That is, if the demoted value, after casting back to the original
     width, is not equal to the original value before the downcast.
     The ``-fsanitize=implicit-unsigned-integer-truncation`` check handles
     conversions between two ``unsigned`` types, while
     ``-fsanitize=implicit-signed-integer-truncation`` handles the rest of the
     conversions, that is, when either one or both of the types are signed.
     Issues caught by these sanitizers are not undefined behavior,
     but are often unintentional.
  -  ``-fsanitize=implicit-integer-sign-change``: Implicit conversion between
     integer types, if that changes the sign of the value. That is, if the
     original value was negative and the new value is positive (or zero),
     or the original value was positive, and the new value is negative.
     Issues caught by this sanitizer are not undefined behavior,
     but are often unintentional.
  -  ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
  -  ``-fsanitize=nonnull-attribute``: Passing a null pointer as a function
     parameter which is declared to never be null.
  -  ``-fsanitize=null``: Use of a null pointer or creation of a null
     reference.
  -  ``-fsanitize=nullability-arg``: Passing null as a function parameter
     which is annotated with ``_Nonnull``.
  -  ``-fsanitize=nullability-assign``: Assigning null to an lvalue which
     is annotated with ``_Nonnull``.
  -  ``-fsanitize=nullability-return``: Returning null from a function with
     a return type annotated with ``_Nonnull``.
  -  ``-fsanitize=objc-cast``: Invalid implicit cast of an ObjC object pointer
     to an incompatible type. This is often unintentional, but is not undefined
     behavior, therefore the check is not a part of the ``undefined`` group.
     Currently only supported on Darwin.
  -  ``-fsanitize=object-size``: An attempt to potentially use bytes which
     the optimizer can determine are not part of the object being accessed.
     This will also detect some types of undefined behavior that may not
     directly access memory, but are provably incorrect given the size of
     the objects involved, such as invalid downcasts and calling methods on
     invalid pointers. These checks are made in terms of
     ``__builtin_object_size``, and consequently may be able to detect more
     problems at higher optimization levels.
  -  ``-fsanitize=pointer-overflow``: Performing pointer arithmetic which
     overflows, or where either the old or new pointer value is a null pointer
     (or in C, when they both are).
  -  ``-fsanitize=return``: In C++, reaching the end of a
     value-returning function without returning a value.
  -  ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer
     from a function which is declared to never return null.
  -  ``-fsanitize=shift``: Shift operators where the amount shifted is
     greater than or equal to the promoted bit-width of the left hand side
     or less than zero, or where the left hand side is negative. For a
     signed left shift, also checks for signed overflow in C, and for
     unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
     ``-fsanitize=shift-exponent`` to check only the left-hand side or
     the right-hand side of the shift operation, respectively.
  -  ``-fsanitize=unsigned-shift-base``: Check that an unsigned left-hand side of
     a left shift operation doesn't overflow.
  -  ``-fsanitize=signed-integer-overflow``: Signed integer overflow, where the
     result of a signed integer computation cannot be represented in its type.
     This includes all the checks covered by ``-ftrapv``, as well as checks for
     signed division overflow (``INT_MIN/-1``), but not checks for
     lossy implicit conversions performed before the computation, which are
     handled by the ``-fsanitize=implicit-conversion`` group of checks.
  -  ``-fsanitize=unreachable``: If control flow reaches an unreachable
     program point.
  -  ``-fsanitize=unsigned-integer-overflow``: Unsigned integer overflow, where
     the result of an unsigned integer computation cannot be represented in its
     type. Unlike signed integer overflow, this is not undefined behavior, but
     it is often unintentional. This sanitizer does not check for lossy implicit
     conversions performed before such a computation
     (see ``-fsanitize=implicit-conversion``).
  -  ``-fsanitize=vla-bound``: A variable-length array whose bound
     does not evaluate to a positive value.
  -  ``-fsanitize=vptr``: Use of an object whose vptr indicates that it is of
     the wrong dynamic type, or that its lifetime has not begun or has ended.
     Incompatible with ``-fno-rtti``. Link must be performed by ``clang++``, not
     ``clang``, to make sure C++-specific parts of the runtime library and C++
     standard libraries are present.
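
To make a few of these conditions concrete, the sketch below spells out the
preconditions that the ``shift``, ``integer-divide-by-zero``, and
``signed-integer-overflow`` descriptions above imply for ``int`` operands. The
helpers ``checked_shl`` and ``checked_div`` are hypothetical names, not part of
UBSan. The sketch assumes C++17 (for ``std::optional``) and deliberately uses
the stricter C rule for signed left shifts, so it may reject some shifts that
UBSan accepts in C++.

.. code-block:: c++

  #include <climits>
  #include <cstdio>
  #include <limits>
  #include <optional>

  // Returns lhs << amount, or nullopt when the shift checks described above
  // would be expected to fire (conservative: follows the C rule).
  std::optional<int> checked_shl(int lhs, int amount) {
    const int width = std::numeric_limits<int>::digits + 1;  // bit width of int
    if (amount < 0 || amount >= width) return std::nullopt;  // shift-exponent
    if (lhs < 0) return std::nullopt;                        // shift-base: negative
    if (lhs > (INT_MAX >> amount)) return std::nullopt;      // shift-base: overflow
    return lhs << amount;
  }

  // Returns lhs / rhs, or nullopt for the two undefined integer divisions.
  std::optional<int> checked_div(int lhs, int rhs) {
    if (rhs == 0) return std::nullopt;                     // integer-divide-by-zero
    if (lhs == INT_MIN && rhs == -1) return std::nullopt;  // signed-integer-overflow
    return lhs / rhs;
  }

  int main() {
    std::printf("%d\n", checked_shl(1, 4).value_or(-1));         // 16
    std::printf("%d\n", checked_shl(1, 32).has_value());         // 0 on targets with 32-bit int
    std::printf("%d\n", checked_div(7, 2).value_or(-1));         // 3
    std::printf("%d\n", checked_div(INT_MIN, -1).has_value());   // 0
    return 0;
  }

Guards like these are an alternative to instrumentation in code that must
handle untrusted operands; UBSan remains useful for finding the places where
such guards are missing.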

You can also use the following check groups:

  -  ``-fsanitize=undefined``: All of the checks listed above other than
     ``float-divide-by-zero``, ``unsigned-integer-overflow``,
     ``implicit-conversion``, ``local-bounds`` and the ``nullability-*`` group
     of checks.
  -  ``-fsanitize=undefined-trap``: Deprecated alias of
     ``-fsanitize=undefined``.
  -  ``-fsanitize=implicit-integer-truncation``: Catches lossy integral
     conversions. Enables ``implicit-signed-integer-truncation`` and
     ``implicit-unsigned-integer-truncation``.
  -  ``-fsanitize=implicit-integer-arithmetic-value-change``: Catches implicit
     conversions that change the arithmetic value of the integer. Enables
     ``implicit-signed-integer-truncation`` and ``implicit-integer-sign-change``.
  -  ``-fsanitize=implicit-conversion``: Checks for suspicious
     behavior of implicit conversions. Enables
     ``implicit-unsigned-integer-truncation``,
     ``implicit-signed-integer-truncation``, and
     ``implicit-integer-sign-change``.
  -  ``-fsanitize=integer``: Checks for undefined or suspicious integer
     behavior (e.g. unsigned integer overflow).
     Enables ``signed-integer-overflow``, ``unsigned-integer-overflow``,
     ``shift``, ``integer-divide-by-zero``,
     ``implicit-unsigned-integer-truncation``,
     ``implicit-signed-integer-truncation``, and
     ``implicit-integer-sign-change``.
  -  ``-fsanitize=nullability``: Enables ``nullability-arg``,
     ``nullability-assign``, and ``nullability-return``. While violating
     nullability does not have undefined behavior, it is often unintentional,
     so UBSan offers to catch it.
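
The three checks in the ``implicit-conversion`` group can be illustrated with
conversions whose results are well-defined but probably unintended. The
function names below are hypothetical and chosen only for this sketch. The
comments name the check that would be expected to report each call when the
program is built with ``-fsanitize=implicit-conversion``.

.. code-block:: c++

  #include <cstdint>
  #include <cstdio>

  // Unsigned to narrower unsigned: implicit-unsigned-integer-truncation
  // applies when v does not fit in 8 bits.
  std::uint8_t narrow_unsigned(std::uint32_t v) { return v; }

  // Signed source or destination: implicit-signed-integer-truncation
  // applies when v is outside the range of int8_t.
  std::int8_t narrow_signed(std::int32_t v) { return v; }

  // Same width, different signedness: implicit-integer-sign-change
  // applies when v is negative.
  std::uint32_t to_unsigned(std::int32_t v) { return v; }

  int main() {
    // 0x1FF loses its ninth bit and becomes 255.
    std::printf("%u\n", static_cast<unsigned>(narrow_unsigned(0x1FFu)));
    // 300 wraps modulo 256 to 44 (guaranteed since C++20; Clang also wraps
    // in earlier language modes).
    std::printf("%d\n", static_cast<int>(narrow_signed(300)));
    // -1 becomes 4294967295.
    std::printf("%u\n", static_cast<unsigned>(to_unsigned(-1)));
    return 0;
  }

Writing the conversion as an explicit ``static_cast`` states the intent in the
source; these checks only concern implicit conversions.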

Volatile
--------

The ``null``, ``alignment``, ``object-size``, ``local-bounds``, and ``vptr`` checks do not apply
to pointers to types with the ``volatile`` qualifier.

Minimal Runtime
===============

A minimal UBSan runtime is available that is suitable for use in production
environments. This runtime has a small attack surface. It only provides very
basic issue logging and deduplication, and does not support
``-fsanitize=function`` and ``-fsanitize=vptr`` checking.

To use the minimal runtime, add ``-fsanitize-minimal-runtime`` to the clang
command line options. For example, if you're used to compiling with
``-fsanitize=undefined``, you could enable the minimal runtime with
``-fsanitize=undefined -fsanitize-minimal-runtime``.

Stack traces and report symbolization
=====================================

If you want UBSan to print a symbolized stack trace for each error report, you
will need to:

#. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug
   information in your binary.
#. Run your program with the environment variable
   ``UBSAN_OPTIONS=print_stacktrace=1``.
#. Make sure the ``llvm-symbolizer`` binary is in ``PATH``.

Logging
=======

The default log file for diagnostics is "stderr". To log diagnostics to another
file, you can set ``UBSAN_OPTIONS=log_path=...``.

Silencing Unsigned Integer Overflow
===================================

To silence reports from unsigned integer overflow, you can set
``UBSAN_OPTIONS=silence_unsigned_overflow=1``.  This feature, combined with
``-fsanitize-recover=unsigned-integer-overflow``, is particularly useful for
providing fuzzing signal without blowing up logs.

Issue Suppression
=================

UndefinedBehaviorSanitizer is not expected to produce false positives.
If you see one, look again; most likely it is a true positive!

Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))``
----------------------------------------------------------------------------

You can disable UBSan checks for particular functions with
``__attribute__((no_sanitize("undefined")))``. You can use all values of
the ``-fsanitize=`` flag in this attribute, e.g. if your function deliberately
contains possible signed integer overflow, you can use
``__attribute__((no_sanitize("signed-integer-overflow")))``.

This attribute may not be supported by other compilers, so consider using it
together with ``#if defined(__clang__)``.
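
One way to combine the attribute with the ``__clang__`` guard is a small macro,
sketched below. The macro name ``NO_SANITIZE_UNSIGNED_OVERFLOW`` and the
function ``fnv1a`` are hypothetical. The example uses a 32-bit FNV-1a hash
because its multiplication is meant to wrap, which is well-defined for unsigned
types but would be reported under ``-fsanitize=unsigned-integer-overflow``.

.. code-block:: c++

  #include <cstdint>
  #include <cstdio>
  #include <string>

  // Expands to the attribute on Clang and to nothing elsewhere.
  #if defined(__clang__)
  #define NO_SANITIZE_UNSIGNED_OVERFLOW \
    __attribute__((no_sanitize("unsigned-integer-overflow")))
  #else
  #define NO_SANITIZE_UNSIGNED_OVERFLOW
  #endif

  // 32-bit FNV-1a; the multiplication wraps modulo 2^32 by design.
  NO_SANITIZE_UNSIGNED_OVERFLOW
  std::uint32_t fnv1a(const std::string &text) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char byte : text) {
      hash ^= byte;
      hash *= 16777619u;               // FNV prime
    }
    return hash;
  }

  int main() {
    std::printf("%08x\n", static_cast<unsigned>(fnv1a("a")));
    return 0;
  }

Scoping the attribute to the one function that wraps on purpose keeps the check
active for the rest of the program.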

Suppressing Errors in Recompiled Code (Ignorelist)
--------------------------------------------------

UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in
:doc:`SanitizerSpecialCaseList`, that can be used to suppress error reports
in the specified source files or functions.

Runtime suppressions
--------------------

Sometimes you can suppress UBSan error reports for specific files, functions,
or libraries without recompiling the code. You need to pass the path to a
suppression file in the ``UBSAN_OPTIONS`` environment variable.

.. code-block:: bash

    UBSAN_OPTIONS=suppressions=MyUBSan.supp

You need to specify a :ref:`check <ubsan-checks>` you are suppressing and the
bug location. For example:

.. code-block:: bash

  signed-integer-overflow:file-with-known-overflow.cpp
  alignment:function_doing_unaligned_access
  vptr:shared_object_with_vptr_failures.so

There are several limitations:

* Sometimes your binary must have enough debug info and/or symbol table, so
  that the runtime can figure out the source file or function name to match
  against the suppression.
* It is only possible to suppress recoverable checks. For the example above,
  you can additionally pass
  ``-fsanitize-recover=signed-integer-overflow,alignment,vptr``, although
  most UBSan checks are recoverable by default.
* Check groups (like ``undefined``) can't be used in a suppressions file, only
  fine-grained checks are supported.

Supported Platforms
===================

UndefinedBehaviorSanitizer is supported on the following operating systems:

* Android
* Linux
* NetBSD
* FreeBSD
* OpenBSD
* macOS
* Windows

The runtime library is relatively portable and platform independent. If the OS
you need is not listed above, UndefinedBehaviorSanitizer may already work for
it, or could be made to work with a minor porting effort.

Current Status
==============

UndefinedBehaviorSanitizer is available on selected platforms starting from LLVM
3.3. The test suite is integrated into the CMake build and can be run with
the ``check-ubsan`` command.

Additional Configuration
========================

UndefinedBehaviorSanitizer adds static check data for each check unless it is
in trap mode. This check data includes the full file name. The option
``-fsanitize-undefined-strip-path-components=N`` can be used to trim this
information. If ``N`` is positive, file information emitted by
UndefinedBehaviorSanitizer will drop the first ``N`` components from the file
path. If ``N`` is negative, only the last ``-N`` components will be kept.

Example
-------

For a file called ``/code/library/file.cpp``, here is what would be emitted:

* Default (No flag, or ``-fsanitize-undefined-strip-path-components=0``): ``/code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=1``: ``code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=2``: ``library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=-1``: ``file.cpp``
* ``-fsanitize-undefined-strip-path-components=-2``: ``library/file.cpp``
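
The rule behind these examples can be modeled in a few lines. The sketch below
is not Clang's implementation; ``strip_path_components`` and its helpers are
hypothetical names, and it only handles ``/``-separated paths. It treats the
leading ``/`` as the first component, which is what the examples above imply.
What happens when ``N`` exceeds the number of components is an assumption of
this sketch: it keeps the file name for positive ``N`` and the whole path for
negative ``N``.

.. code-block:: c++

  #include <cstddef>
  #include <cstdio>
  #include <string>
  #include <vector>

  // Splits a path into components; a leading '/' counts as its own component.
  std::vector<std::string> split_components(const std::string &path) {
    std::vector<std::string> parts;
    if (!path.empty() && path[0] == '/') parts.push_back("/");
    std::string current;
    for (char c : path) {
      if (c == '/') {
        if (!current.empty()) parts.push_back(current);
        current.clear();
      } else {
        current += c;
      }
    }
    if (!current.empty()) parts.push_back(current);
    return parts;
  }

  // Joins parts[first..] back into a path.
  std::string join_from(const std::vector<std::string> &parts, std::size_t first) {
    std::string out;
    for (std::size_t i = first; i < parts.size(); ++i) {
      if (parts[i] == "/") { out += '/'; continue; }  // root component
      if (!out.empty() && out.back() != '/') out += '/';
      out += parts[i];
    }
    return out;
  }

  // Models -fsanitize-undefined-strip-path-components=n for the cases above.
  std::string strip_path_components(const std::string &path, int n) {
    const std::vector<std::string> parts = split_components(path);
    if (n == 0 || parts.empty()) return path;
    const std::size_t count = parts.size();
    std::size_t first = 0;
    if (n > 0) {
      const std::size_t drop = static_cast<std::size_t>(n);
      first = drop < count ? drop : count - 1;  // assumption: keep the file name
    } else {
      const std::size_t keep = static_cast<std::size_t>(-static_cast<long long>(n));
      first = keep < count ? count - keep : 0;  // assumption: keep everything
    }
    return join_from(parts, first);
  }

  int main() {
    const std::string path = "/code/library/file.cpp";
    const int values[] = {0, 1, 2, -1, -2};
    for (int n : values) {
      std::printf("%2d: %s\n", n, strip_path_components(path, n).c_str());
    }
    return 0;
  }

Running the sketch prints the same five results as the list above, which makes
it a quick way to predict the effect of a given ``N`` on a build tree layout.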

More Information
================

* From LLVM project blog:
  `What Every C Programmer Should Know About Undefined Behavior
  <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
* From John Regehr's *Embedded in Academia* blog:
  `A Guide to Undefined Behavior in C and C++
  <https://blog.regehr.org/archives/213>`_