Home | History | Annotate | Line # | Download | only in doc
      1  1.1.1.2  mrg @c Copyright (C) 2019-2022 Free Software Foundation, Inc.
      2      1.1  mrg @c This is part of the GCC manual.
      3      1.1  mrg @c For copying conditions, see the file gcc.texi.
      4      1.1  mrg @c Contributed by David Malcolm <dmalcolm (a] redhat.com>.
      5      1.1  mrg 
      6      1.1  mrg @node Static Analyzer
      7      1.1  mrg @chapter Static Analyzer
      8      1.1  mrg @cindex analyzer
      9      1.1  mrg @cindex static analysis
     10      1.1  mrg @cindex static analyzer
     11      1.1  mrg 
     12      1.1  mrg @menu
     13      1.1  mrg * Analyzer Internals::       Analyzer Internals
     14      1.1  mrg * Debugging the Analyzer::   Useful debugging tips
     15      1.1  mrg @end menu
     16      1.1  mrg 
     17      1.1  mrg @node Analyzer Internals
     18      1.1  mrg @section Analyzer Internals
     19      1.1  mrg @cindex analyzer, internals
     20      1.1  mrg @cindex static analyzer, internals
     21      1.1  mrg 
     22      1.1  mrg @subsection Overview
     23      1.1  mrg 
     24      1.1  mrg The analyzer implementation works on the gimple-SSA representation.
     25      1.1  mrg (I chose this in the hopes of making it easy to work with LTO to
     26      1.1  mrg do whole-program analysis).
     27      1.1  mrg 
     28      1.1  mrg The implementation is read-only: it doesn't attempt to change anything,
     29      1.1  mrg just emit warnings.
     30      1.1  mrg 
     31      1.1  mrg The gimple representation can be seen using @option{-fdump-ipa-analyzer}.
     32  1.1.1.2  mrg @quotation Tip
     33  1.1.1.2  mrg If the analyzer ICEs before this is written out, one workaround is to use
     34  1.1.1.2  mrg @option{--param=analyzer-bb-explosion-factor=0} to force the analyzer
     35  1.1.1.2  mrg to bail out after analyzing the first basic block.
     36  1.1.1.2  mrg @end quotation
     37      1.1  mrg 
     38      1.1  mrg First, we build a @code{supergraph} which combines the callgraph and all
     39      1.1  mrg of the CFGs into a single directed graph, with both interprocedural and
     40      1.1  mrg intraprocedural edges.  The nodes and edges in the supergraph are called
     41      1.1  mrg ``supernodes'' and ``superedges'', and often referred to in code as
     42      1.1  mrg @code{snodes} and @code{sedges}.  Basic blocks in the CFGs are split at
     43      1.1  mrg interprocedural calls, so there can be more than one supernode per
     44      1.1  mrg basic block.  Most statements will be in just one supernode, but a call
     45      1.1  mrg statement can appear in two supernodes: at the end of one for the call,
     46      1.1  mrg and again at the start of another for the return.
     47      1.1  mrg 
     48      1.1  mrg The supergraph can be seen using @option{-fdump-analyzer-supergraph}.
     49      1.1  mrg 
     50      1.1  mrg We then build an @code{analysis_plan} which walks the callgraph to
     51      1.1  mrg determine which calls might be suitable for being summarized (rather
     52      1.1  mrg than fully explored) and thus in what order to explore the functions.
     53      1.1  mrg 
     54      1.1  mrg Next is the heart of the analyzer: we use a worklist to explore state
     55      1.1  mrg within the supergraph, building an "exploded graph".
     56      1.1  mrg Nodes in the exploded graph correspond to <point,@w{ }state> pairs, as in
     57      1.1  mrg      "Precise Interprocedural Dataflow Analysis via Graph Reachability"
     58      1.1  mrg      (Thomas Reps, Susan Horwitz and Mooly Sagiv).
     59      1.1  mrg 
     60      1.1  mrg We reuse nodes for <point, state> pairs we've already seen, and avoid
     61      1.1  mrg tracking state too closely, so that (hopefully) we rapidly converge
     62      1.1  mrg on a final exploded graph, and terminate the analysis.  We also bail
     63      1.1  mrg out if the number of exploded <end-of-basic-block, state> nodes gets
     64      1.1  mrg larger than a particular multiple of the total number of basic blocks
     65      1.1  mrg (to ensure termination in the face of pathological state-explosion
     66      1.1  mrg cases, or bugs).  We also stop exploring a point once we hit a limit
     67      1.1  mrg of states for that point.
     68      1.1  mrg 
     69      1.1  mrg We can identify problems directly when processing a <point,@w{ }state>
     70      1.1  mrg instance.  For example, if we're finding the successors of
     71      1.1  mrg 
     72      1.1  mrg @smallexample
     73      1.1  mrg    <point: before-stmt: "free (ptr);",
     74      1.1  mrg     state: @{"ptr": freed@}>
     75      1.1  mrg @end smallexample
     76      1.1  mrg 
     77      1.1  mrg then we can detect a double-free of "ptr".  We can then emit a path
     78      1.1  mrg to reach the problem by finding the simplest route through the graph.
     79      1.1  mrg 
     80      1.1  mrg Program points in the analysis are much more fine-grained than in the
     81      1.1  mrg CFG and supergraph, with points (and thus potentially exploded nodes)
     82      1.1  mrg for various events, including before individual statements.
     83      1.1  mrg By default the exploded graph merges multiple consecutive statements
     84      1.1  mrg in a supernode into one exploded edge to minimize the size of the
     85      1.1  mrg exploded graph.  This can be suppressed via
     86      1.1  mrg @option{-fanalyzer-fine-grained}.
     87      1.1  mrg The fine-grained approach seems to make things simpler and more debuggable
     88      1.1  mrg that other approaches I tried, in that each point is responsible for one
     89      1.1  mrg thing.
     90      1.1  mrg 
     91      1.1  mrg Program points in the analysis also have a "call string" identifying the
     92      1.1  mrg stack of callsites below them, so that paths in the exploded graph
     93      1.1  mrg correspond to interprocedurally valid paths: we always return to the
     94      1.1  mrg correct call site, propagating state information accordingly.
     95      1.1  mrg We avoid infinite recursion by stopping the analysis if a callsite
     96      1.1  mrg appears more than @code{analyzer-max-recursion-depth} in a callstring
     97      1.1  mrg (defaulting to 2).
     98      1.1  mrg 
     99      1.1  mrg @subsection Graphs
    100      1.1  mrg 
    101      1.1  mrg Nodes and edges in the exploded graph are called ``exploded nodes'' and
    102      1.1  mrg ``exploded edges'' and often referred to in the code as
    103      1.1  mrg @code{enodes} and @code{eedges} (especially when distinguishing them
    104      1.1  mrg from the @code{snodes} and @code{sedges} in the supergraph).
    105      1.1  mrg 
    106      1.1  mrg Each graph numbers its nodes, giving unique identifiers - supernodes
    107      1.1  mrg are referred to throughout dumps in the form @samp{SN': @var{index}} and
    108      1.1  mrg exploded nodes in the form @samp{EN: @var{index}} (e.g. @samp{SN: 2} and
    109      1.1  mrg @samp{EN:29}).
    110      1.1  mrg 
    111      1.1  mrg The supergraph can be seen using @option{-fdump-analyzer-supergraph-graph}.
    112      1.1  mrg 
    113      1.1  mrg The exploded graph can be seen using @option{-fdump-analyzer-exploded-graph}
    114      1.1  mrg and other dump options.  Exploded nodes are color-coded in the .dot output
    115      1.1  mrg based on state-machine states to make it easier to see state changes at
    116      1.1  mrg a glance.
    117      1.1  mrg 
    118      1.1  mrg @subsection State Tracking
    119      1.1  mrg 
    120      1.1  mrg There's a tension between:
    121      1.1  mrg @itemize @bullet
    122      1.1  mrg @item
    123      1.1  mrg precision of analysis in the straight-line case, vs
    124      1.1  mrg @item
    125      1.1  mrg exponential blow-up in the face of control flow.
    126      1.1  mrg @end itemize
    127      1.1  mrg 
    128      1.1  mrg For example, in general, given this CFG:
    129      1.1  mrg 
    130      1.1  mrg @smallexample
    131      1.1  mrg       A
    132      1.1  mrg      / \
    133      1.1  mrg     B   C
    134      1.1  mrg      \ /
    135      1.1  mrg       D
    136      1.1  mrg      / \
    137      1.1  mrg     E   F
    138      1.1  mrg      \ /
    139      1.1  mrg       G
    140      1.1  mrg @end smallexample
    141      1.1  mrg 
    142      1.1  mrg we want to avoid differences in state-tracking in B and C from
    143      1.1  mrg leading to blow-up.  If we don't prevent state blowup, we end up
    144      1.1  mrg with exponential growth of the exploded graph like this:
    145      1.1  mrg 
    146      1.1  mrg @smallexample
    147      1.1  mrg 
    148      1.1  mrg            1:A
    149      1.1  mrg           /   \
    150      1.1  mrg          /     \
    151      1.1  mrg         /       \
    152      1.1  mrg       2:B       3:C
    153      1.1  mrg        |         |
    154      1.1  mrg       4:D       5:D        (2 exploded nodes for D)
    155      1.1  mrg      /   \     /   \
    156      1.1  mrg    6:E   7:F 8:E   9:F
    157      1.1  mrg     |     |   |     |
    158      1.1  mrg    10:G 11:G 12:G  13:G    (4 exploded nodes for G)
    159      1.1  mrg 
    160      1.1  mrg @end smallexample
    161      1.1  mrg 
    162      1.1  mrg Similar issues arise with loops.
    163      1.1  mrg 
    164      1.1  mrg To prevent this, we follow various approaches:
    165      1.1  mrg 
    166      1.1  mrg @enumerate a
    167      1.1  mrg @item
    168      1.1  mrg state pruning: which tries to discard state that won't be relevant
    169      1.1  mrg later on withing the function.
    170      1.1  mrg This can be disabled via @option{-fno-analyzer-state-purge}.
    171      1.1  mrg 
    172      1.1  mrg @item
    173      1.1  mrg state merging.  We can try to find the commonality between two
    174      1.1  mrg program_state instances to make a third, simpler program_state.
    175      1.1  mrg We have two strategies here:
    176      1.1  mrg 
    177      1.1  mrg   @enumerate
    178      1.1  mrg   @item
    179      1.1  mrg      the worklist keeps new nodes for the same program_point together,
    180      1.1  mrg      and tries to merge them before processing, and thus before they have
    181      1.1  mrg      successors.  Hence, in the above, the two nodes for D (4 and 5) reach
    182      1.1  mrg      the front of the worklist together, and we create a node for D with
    183      1.1  mrg      the merger of the incoming states.
    184      1.1  mrg 
    185      1.1  mrg   @item
    186      1.1  mrg      try merging with the state of existing enodes for the program_point
    187      1.1  mrg      (which may have already been explored).  There will be duplication,
    188      1.1  mrg      but only one set of duplication; subsequent duplicates are more likely
    189      1.1  mrg      to hit the cache.  In particular, (hopefully) all merger chains are
    190      1.1  mrg      finite, and so we guarantee termination.
    191      1.1  mrg      This is intended to help with loops: we ought to explore the first
    192      1.1  mrg      iteration, and then have a "subsequent iterations" exploration,
    193      1.1  mrg      which uses a state merged from that of the first, to be more abstract.
    194      1.1  mrg   @end enumerate
    195      1.1  mrg 
    196      1.1  mrg We avoid merging pairs of states that have state-machine differences,
    197      1.1  mrg as these are the kinds of differences that are likely to be most
    198      1.1  mrg interesting.  So, for example, given:
    199      1.1  mrg 
    200      1.1  mrg @smallexample
    201      1.1  mrg       if (condition)
    202      1.1  mrg         ptr = malloc (size);
    203      1.1  mrg       else
    204      1.1  mrg         ptr = local_buf;
    205      1.1  mrg 
    206      1.1  mrg       .... do things with 'ptr'
    207      1.1  mrg 
    208      1.1  mrg       if (condition)
    209      1.1  mrg         free (ptr);
    210      1.1  mrg 
    211      1.1  mrg       ...etc
    212      1.1  mrg @end smallexample
    213      1.1  mrg 
    214      1.1  mrg then we end up with an exploded graph that looks like this:
    215      1.1  mrg 
    216      1.1  mrg @smallexample
    217      1.1  mrg 
    218      1.1  mrg                    if (condition)
    219      1.1  mrg                      / T      \ F
    220      1.1  mrg             ---------          ----------
    221      1.1  mrg            /                             \
    222      1.1  mrg       ptr = malloc (size)             ptr = local_buf
    223      1.1  mrg           |                               |
    224      1.1  mrg       copy of                         copy of
    225      1.1  mrg         "do things with 'ptr'"          "do things with 'ptr'"
    226      1.1  mrg       with ptr: heap-allocated        with ptr: stack-allocated
    227      1.1  mrg           |                               |
    228      1.1  mrg       if (condition)                  if (condition)
    229      1.1  mrg           | known to be T                 | known to be F
    230      1.1  mrg       free (ptr);                         |
    231      1.1  mrg            \                             /
    232      1.1  mrg             -----------------------------
    233      1.1  mrg                          | ('ptr' is pruned, so states can be merged)
    234      1.1  mrg                         etc
    235      1.1  mrg 
    236      1.1  mrg @end smallexample
    237      1.1  mrg 
    238      1.1  mrg where some duplication has occurred, but only for the places where the
    239      1.1  mrg the different paths are worth exploringly separately.
    240      1.1  mrg 
    241      1.1  mrg Merging can be disabled via @option{-fno-analyzer-state-merge}.
    242      1.1  mrg @end enumerate
    243      1.1  mrg 
    244      1.1  mrg @subsection Region Model
    245      1.1  mrg 
    246      1.1  mrg Part of the state stored at a @code{exploded_node} is a @code{region_model}.
    247      1.1  mrg This is an implementation of the region-based ternary model described in
    248  1.1.1.2  mrg @url{https://www.researchgate.net/publication/221430855_A_Memory_Model_for_Static_Analysis_of_C_Programs,
    249      1.1  mrg "A Memory Model for Static Analysis of C Programs"}
    250      1.1  mrg (Zhongxing Xu, Ted Kremenek, and Jian Zhang).
    251      1.1  mrg 
    252      1.1  mrg A @code{region_model} encapsulates a representation of the state of
    253  1.1.1.2  mrg memory, with a @code{store} recording a binding between @code{region}
    254  1.1.1.2  mrg instances, to @code{svalue} instances.  The bindings are organized into
    255  1.1.1.2  mrg clusters, where regions accessible via well-defined pointer arithmetic
    256  1.1.1.2  mrg are in the same cluster.  The representation is graph-like because values
    257  1.1.1.2  mrg can be pointers to regions.  It also stores a constraint_manager,
    258  1.1.1.2  mrg capturing relationships between the values.
    259      1.1  mrg 
    260      1.1  mrg Because each node in the @code{exploded_graph} has a @code{region_model},
    261      1.1  mrg and each of the latter is graph-like, the @code{exploded_graph} is in some
    262      1.1  mrg ways a graph of graphs.
    263      1.1  mrg 
    264  1.1.1.2  mrg Here's an example of printing a @code{program_state}, showing the
    265  1.1.1.2  mrg @code{region_model} within it, along with state for the @code{malloc}
    266  1.1.1.2  mrg state machine.
    267      1.1  mrg 
    268      1.1  mrg @smallexample
    269      1.1  mrg (gdb) call debug (*this)
    270  1.1.1.2  mrg rmodel:
    271  1.1.1.2  mrg stack depth: 1
    272  1.1.1.2  mrg   frame (index 0): frame: test@@1
    273  1.1.1.2  mrg clusters within frame: test@@1
    274  1.1.1.2  mrg   cluster for: ptr_3: &HEAP_ALLOCATED_REGION(12)
    275  1.1.1.2  mrg m_called_unknown_fn: FALSE
    276  1.1.1.2  mrg constraint_manager:
    277      1.1  mrg   equiv classes:
    278      1.1  mrg   constraints:
    279  1.1.1.2  mrg malloc:
    280  1.1.1.2  mrg   0x2e89590: &HEAP_ALLOCATED_REGION(12): unchecked ('ptr_3')
    281      1.1  mrg @end smallexample
    282      1.1  mrg 
    283      1.1  mrg This is the state at the point of returning from @code{calls_malloc} back
    284      1.1  mrg to @code{test} in the following:
    285      1.1  mrg 
    286      1.1  mrg @smallexample
    287      1.1  mrg void *
    288      1.1  mrg calls_malloc (void)
    289      1.1  mrg @{
    290      1.1  mrg   void *result = malloc (1024);
    291      1.1  mrg   return result;
    292      1.1  mrg @}
    293      1.1  mrg 
    294      1.1  mrg void test (void)
    295      1.1  mrg @{
    296      1.1  mrg   void *ptr = calls_malloc ();
    297      1.1  mrg   /* etc.  */
    298      1.1  mrg @}
    299      1.1  mrg @end smallexample
    300      1.1  mrg 
    301  1.1.1.2  mrg Within the store, there is the cluster for @code{ptr_3} within the frame
    302  1.1.1.2  mrg for @code{test}, where the whole cluster is bound to a pointer value,
    303  1.1.1.2  mrg pointing at @code{HEAP_ALLOCATED_REGION(12)}.  Additionally, this pointer
    304  1.1.1.2  mrg has the @code{unchecked} state for the @code{malloc} state machine
    305  1.1.1.2  mrg indicating it hasn't yet been checked against NULL since the allocation
    306  1.1.1.2  mrg call.
    307      1.1  mrg 
    308      1.1  mrg @subsection Analyzer Paths
    309      1.1  mrg 
    310      1.1  mrg We need to explain to the user what the problem is, and to persuade them
    311      1.1  mrg that there really is a problem.  Hence having a @code{diagnostic_path}
    312      1.1  mrg isn't just an incidental detail of the analyzer; it's required.
    313      1.1  mrg 
    314      1.1  mrg Paths ought to be:
    315      1.1  mrg @itemize @bullet
    316      1.1  mrg @item
    317      1.1  mrg interprocedurally-valid
    318      1.1  mrg @item
    319      1.1  mrg feasible
    320      1.1  mrg @end itemize
    321      1.1  mrg 
    322      1.1  mrg Without state-merging, all paths in the exploded graph are feasible
    323  1.1.1.2  mrg (in terms of constraints being satisfied).
    324      1.1  mrg With state-merging, paths in the exploded graph can be infeasible.
    325      1.1  mrg 
    326      1.1  mrg We collate warnings and only emit them for the simplest path
    327      1.1  mrg e.g. for a bug in a utility function, with lots of routes to calling it,
    328      1.1  mrg we only emit the simplest path (which could be intraprocedural, if
    329  1.1.1.2  mrg it can be reproduced without a caller).
    330  1.1.1.2  mrg 
    331  1.1.1.2  mrg We thus want to find the shortest feasible path through the exploded
    332  1.1.1.2  mrg graph from the origin to the exploded node at which the diagnostic was
    333  1.1.1.2  mrg saved.  Unfortunately, if we simply find the shortest such path and
    334  1.1.1.2  mrg check if it's feasible we might falsely reject the diagnostic, as there
    335  1.1.1.2  mrg might be a longer path that is feasible.  Examples include the cases
    336  1.1.1.2  mrg where the diagnostic requires us to go at least once around a loop for a
    337  1.1.1.2  mrg later condition to be satisfied, or where for a later condition to be
    338  1.1.1.2  mrg satisfied we need to enter a suite of code that the simpler path skips.
    339  1.1.1.2  mrg 
    340  1.1.1.2  mrg We attempt to find the shortest feasible path to each diagnostic by
    341  1.1.1.2  mrg first constructing a ``trimmed graph'' from the exploded graph,
    342  1.1.1.2  mrg containing only those nodes and edges from which there are paths to
    343  1.1.1.2  mrg the target node, and using Dijkstra's algorithm to order the trimmed
    344  1.1.1.2  mrg nodes by minimal distance to the target.
    345  1.1.1.2  mrg 
    346  1.1.1.2  mrg We then use a worklist to iteratively build a ``feasible graph''
    347  1.1.1.2  mrg (actually a tree), capturing the pertinent state along each path, in
    348  1.1.1.2  mrg which every path to a ``feasible node'' is feasible by construction,
    349  1.1.1.2  mrg restricting ourselves to the trimmed graph to ensure we stay on target,
    350  1.1.1.2  mrg and ordering the worklist so that the first feasible path we find to the
    351  1.1.1.2  mrg target node is the shortest possible path.  Hence we start by trying the
    352  1.1.1.2  mrg shortest possible path, but if that fails, we explore progressively
    353  1.1.1.2  mrg longer paths, eventually trying iterations through loops.  The
    354  1.1.1.2  mrg exploration is captured in the feasible_graph, which can be dumped as a
    355  1.1.1.2  mrg .dot file via @option{-fdump-analyzer-feasibility} to visualize the
    356  1.1.1.2  mrg exploration.  The indices of the feasible nodes show the order in which
    357  1.1.1.2  mrg they were created.  We effectively explore the tree of feasible paths in
    358  1.1.1.2  mrg order of shortest path until we either find a feasible path to the
    359  1.1.1.2  mrg target node, or hit a limit and give up.
    360  1.1.1.2  mrg 
    361  1.1.1.2  mrg This is something of a brute-force approach, but the trimmed graph
    362  1.1.1.2  mrg hopefully keeps the complexity manageable.
    363  1.1.1.2  mrg 
    364  1.1.1.2  mrg This algorithm can be disabled (for debugging purposes) via
    365  1.1.1.2  mrg @option{-fno-analyzer-feasibility}, which simply uses the shortest path,
    366  1.1.1.2  mrg and notes if it is infeasible.
    367  1.1.1.2  mrg 
    368  1.1.1.2  mrg The above gives us a shortest feasible @code{exploded_path} through the
    369  1.1.1.2  mrg @code{exploded_graph} (a list of @code{exploded_edge *}).  We use this
    370  1.1.1.2  mrg @code{exploded_path} to build a @code{diagnostic_path} (a list of
    371  1.1.1.2  mrg @strong{events} for the diagnostic subsystem) - specifically a
    372  1.1.1.2  mrg @code{checker_path}.
    373      1.1  mrg 
    374      1.1  mrg Having built the @code{checker_path}, we prune it to try to eliminate
    375      1.1  mrg events that aren't relevant, to minimize how much the user has to read.
    376      1.1  mrg 
    377      1.1  mrg After pruning, we notify each event in the path of its ID and record the
    378      1.1  mrg IDs of interesting events, allowing for events to refer to other events
    379      1.1  mrg in their descriptions.  The @code{pending_diagnostic} class has various
    380      1.1  mrg vfuncs to support emitting more precise descriptions, so that e.g.
    381      1.1  mrg 
    382      1.1  mrg @itemize @bullet
    383      1.1  mrg @item
    384      1.1  mrg a deref-of-unchecked-malloc diagnostic might use:
    385      1.1  mrg @smallexample
    386      1.1  mrg   returning possibly-NULL pointer to 'make_obj' from 'allocator'
    387      1.1  mrg @end smallexample
    388      1.1  mrg for a @code{return_event} to make it clearer how the unchecked value moves
    389      1.1  mrg from callee back to caller
    390      1.1  mrg @item
    391      1.1  mrg a double-free diagnostic might use:
    392      1.1  mrg @smallexample
    393      1.1  mrg   second 'free' here; first 'free' was at (3)
    394      1.1  mrg @end smallexample
    395      1.1  mrg and a use-after-free might use
    396      1.1  mrg @smallexample
    397      1.1  mrg   use after 'free' here; memory was freed at (2)
    398      1.1  mrg @end smallexample
    399      1.1  mrg @end itemize
    400      1.1  mrg 
    401      1.1  mrg At this point we can emit the diagnostic.
    402      1.1  mrg 
    403      1.1  mrg @subsection Limitations
    404      1.1  mrg 
    405      1.1  mrg @itemize @bullet
    406      1.1  mrg @item
    407      1.1  mrg Only for C so far
    408      1.1  mrg @item
    409      1.1  mrg The implementation of call summaries is currently very simplistic.
    410      1.1  mrg @item
    411      1.1  mrg Lack of function pointer analysis
    412      1.1  mrg @item
    413      1.1  mrg The constraint-handling code assumes reflexivity in some places
    414      1.1  mrg (that values are equal to themselves), which is not the case for NaN.
    415      1.1  mrg As a simple workaround, constraints on floating-point values are
    416      1.1  mrg currently ignored.
    417      1.1  mrg @item
    418      1.1  mrg There are various other limitations in the region model (grep for TODO/xfail
    419      1.1  mrg in the testsuite).
    420      1.1  mrg @item
    421      1.1  mrg The constraint_manager's implementation of transitivity is currently too
    422      1.1  mrg expensive to enable by default and so must be manually enabled via
    423      1.1  mrg @option{-fanalyzer-transitivity}).
    424      1.1  mrg @item
    425      1.1  mrg The checkers are currently hardcoded and don't allow for user extensibility
    426      1.1  mrg (e.g. adding allocate/release pairs).
    427      1.1  mrg @item
    428      1.1  mrg Although the analyzer's test suite has a proof-of-concept test case for
    429      1.1  mrg LTO, LTO support hasn't had extensive testing.  There are various
    430      1.1  mrg lang-specific things in the analyzer that assume C rather than LTO.
    431      1.1  mrg For example, SSA names are printed to the user in ``raw'' form, rather
    432      1.1  mrg than printing the underlying variable name.
    433      1.1  mrg @end itemize
    434      1.1  mrg 
    435      1.1  mrg @node Debugging the Analyzer
    436      1.1  mrg @section Debugging the Analyzer
    437      1.1  mrg @cindex analyzer, debugging
    438      1.1  mrg @cindex static analyzer, debugging
    439      1.1  mrg 
    440      1.1  mrg @subsection Special Functions for Debugging the Analyzer
    441      1.1  mrg 
    442      1.1  mrg The analyzer recognizes various special functions by name, for use
    443      1.1  mrg in debugging the analyzer.  Declarations can be seen in the testsuite
    444      1.1  mrg in @file{analyzer-decls.h}.  None of these functions are actually
    445      1.1  mrg implemented.
    446      1.1  mrg 
    447      1.1  mrg Add:
    448      1.1  mrg @smallexample
    449      1.1  mrg   __analyzer_break ();
    450      1.1  mrg @end smallexample
    451      1.1  mrg to the source being analyzed to trigger a breakpoint in the analyzer when
    452      1.1  mrg that source is reached.  By putting a series of these in the source, it's
    453      1.1  mrg much easier to effectively step through the program state as it's analyzed.
    454      1.1  mrg 
    455  1.1.1.2  mrg The analyzer handles:
    456  1.1.1.2  mrg 
    457  1.1.1.2  mrg @smallexample
    458  1.1.1.2  mrg __analyzer_describe (0, expr);
    459  1.1.1.2  mrg @end smallexample
    460  1.1.1.2  mrg 
    461  1.1.1.2  mrg by emitting a warning describing the 2nd argument (which can be of any
    462  1.1.1.2  mrg type), at a verbosity level given by the 1st argument.  This is for use when
    463  1.1.1.2  mrg debugging, and may be of use in DejaGnu tests.
    464  1.1.1.2  mrg 
    465      1.1  mrg @smallexample
    466      1.1  mrg __analyzer_dump ();
    467      1.1  mrg @end smallexample
    468      1.1  mrg 
    469      1.1  mrg will dump the copious information about the analyzer's state each time it
    470      1.1  mrg reaches the call in its traversal of the source.
    471      1.1  mrg 
    472      1.1  mrg @smallexample
    473  1.1.1.2  mrg extern void __analyzer_dump_capacity (const void *ptr);
    474  1.1.1.2  mrg @end smallexample
    475  1.1.1.2  mrg 
    476  1.1.1.2  mrg will emit a warning describing the capacity of the base region of
    477  1.1.1.2  mrg the region pointed to by the 1st argument.
    478  1.1.1.2  mrg 
    479  1.1.1.2  mrg @smallexample
    480  1.1.1.2  mrg extern void __analyzer_dump_escaped (void);
    481  1.1.1.2  mrg @end smallexample
    482  1.1.1.2  mrg 
    483  1.1.1.2  mrg will emit a warning giving the number of decls that have escaped on this
    484  1.1.1.2  mrg analysis path, followed by a comma-separated list of their names,
    485  1.1.1.2  mrg in alphabetical order.
    486  1.1.1.2  mrg 
    487  1.1.1.2  mrg @smallexample
    488      1.1  mrg __analyzer_dump_path ();
    489      1.1  mrg @end smallexample
    490      1.1  mrg 
    491      1.1  mrg will emit a placeholder ``note'' diagnostic with a path to that call site,
    492      1.1  mrg if the analyzer finds a feasible path to it.
    493      1.1  mrg 
    494      1.1  mrg The builtin @code{__analyzer_dump_exploded_nodes} will emit a warning
    495      1.1  mrg after analysis containing information on all of the exploded nodes at that
    496      1.1  mrg program point:
    497      1.1  mrg 
    498      1.1  mrg @smallexample
    499      1.1  mrg   __analyzer_dump_exploded_nodes (0);
    500      1.1  mrg @end smallexample
    501      1.1  mrg 
    502      1.1  mrg will output the number of ``processed'' nodes, and the IDs of
    503      1.1  mrg both ``processed'' and ``merger'' nodes, such as:
    504      1.1  mrg 
    505      1.1  mrg @smallexample
    506      1.1  mrg warning: 2 processed enodes: [EN: 56, EN: 58] merger(s): [EN: 54-55, EN: 57, EN: 59]
    507      1.1  mrg @end smallexample
    508      1.1  mrg 
    509      1.1  mrg With a non-zero argument
    510      1.1  mrg 
    511      1.1  mrg @smallexample
    512      1.1  mrg   __analyzer_dump_exploded_nodes (1);
    513      1.1  mrg @end smallexample
    514      1.1  mrg 
    515      1.1  mrg it will also dump all of the states within the ``processed'' nodes.
    516      1.1  mrg 
    517      1.1  mrg @smallexample
    518      1.1  mrg    __analyzer_dump_region_model ();
    519      1.1  mrg @end smallexample
    520      1.1  mrg will dump the region_model's state to stderr.
    521      1.1  mrg 
    522      1.1  mrg @smallexample
    523  1.1.1.2  mrg __analyzer_dump_state ("malloc", ptr);
    524  1.1.1.2  mrg @end smallexample
    525  1.1.1.2  mrg 
    526  1.1.1.2  mrg will emit a warning describing the state of the 2nd argument
    527  1.1.1.2  mrg (which can be of any type) with respect to the state machine with
    528  1.1.1.2  mrg a name matching the 1st argument (which must be a string literal).
    529  1.1.1.2  mrg This is for use when debugging, and may be of use in DejaGnu tests.
    530  1.1.1.2  mrg 
    531  1.1.1.2  mrg @smallexample
    532      1.1  mrg __analyzer_eval (expr);
    533      1.1  mrg @end smallexample
    534      1.1  mrg will emit a warning with text "TRUE", FALSE" or "UNKNOWN" based on the
    535      1.1  mrg truthfulness of the argument.  This is useful for writing DejaGnu tests.
    536      1.1  mrg 
    537      1.1  mrg 
    538      1.1  mrg @subsection Other Debugging Techniques
    539      1.1  mrg 
    540  1.1.1.2  mrg The option @option{-fdump-analyzer-json} will dump both the supergraph
    541  1.1.1.2  mrg and the exploded graph in compressed JSON form.
    542  1.1.1.2  mrg 
    543      1.1  mrg One approach when tracking down where a particular bogus state is
    544      1.1  mrg introduced into the @code{exploded_graph} is to add custom code to
    545  1.1.1.2  mrg @code{program_state::validate}.
    546      1.1  mrg 
    547  1.1.1.2  mrg The debug function @code{region::is_named_decl_p} can be used when debugging,
    548  1.1.1.2  mrg such as for assertions and conditional breakpoints.  For example, when
    549  1.1.1.2  mrg tracking down a bug in handling a decl called @code{yy_buffer_stack}, I
    550  1.1.1.2  mrg temporarily added a:
    551      1.1  mrg @smallexample
    552  1.1.1.2  mrg   gcc_assert (!m_base_region->is_named_decl_p ("yy_buffer_stack"));
    553      1.1  mrg @end smallexample
    554  1.1.1.2  mrg to @code{binding_cluster::mark_as_escaped} to trap a point where
    555  1.1.1.2  mrg @code{yy_buffer_stack} was mistakenly being treated as having escaped.
    556