1 1.1.1.2 mrg @c Copyright (C) 2019-2022 Free Software Foundation, Inc. 2 1.1 mrg @c This is part of the GCC manual. 3 1.1 mrg @c For copying conditions, see the file gcc.texi. 4 1.1 mrg @c Contributed by David Malcolm <dmalcolm (a] redhat.com>. 5 1.1 mrg 6 1.1 mrg @node Static Analyzer 7 1.1 mrg @chapter Static Analyzer 8 1.1 mrg @cindex analyzer 9 1.1 mrg @cindex static analysis 10 1.1 mrg @cindex static analyzer 11 1.1 mrg 12 1.1 mrg @menu 13 1.1 mrg * Analyzer Internals:: Analyzer Internals 14 1.1 mrg * Debugging the Analyzer:: Useful debugging tips 15 1.1 mrg @end menu 16 1.1 mrg 17 1.1 mrg @node Analyzer Internals 18 1.1 mrg @section Analyzer Internals 19 1.1 mrg @cindex analyzer, internals 20 1.1 mrg @cindex static analyzer, internals 21 1.1 mrg 22 1.1 mrg @subsection Overview 23 1.1 mrg 24 1.1 mrg The analyzer implementation works on the gimple-SSA representation. 25 1.1 mrg (I chose this in the hopes of making it easy to work with LTO to 26 1.1 mrg do whole-program analysis). 27 1.1 mrg 28 1.1 mrg The implementation is read-only: it doesn't attempt to change anything, 29 1.1 mrg just emit warnings. 30 1.1 mrg 31 1.1 mrg The gimple representation can be seen using @option{-fdump-ipa-analyzer}. 32 1.1.1.2 mrg @quotation Tip 33 1.1.1.2 mrg If the analyzer ICEs before this is written out, one workaround is to use 34 1.1.1.2 mrg @option{--param=analyzer-bb-explosion-factor=0} to force the analyzer 35 1.1.1.2 mrg to bail out after analyzing the first basic block. 36 1.1.1.2 mrg @end quotation 37 1.1 mrg 38 1.1 mrg First, we build a @code{supergraph} which combines the callgraph and all 39 1.1 mrg of the CFGs into a single directed graph, with both interprocedural and 40 1.1 mrg intraprocedural edges. The nodes and edges in the supergraph are called 41 1.1 mrg ``supernodes'' and ``superedges'', and often referred to in code as 42 1.1 mrg @code{snodes} and @code{sedges}. Basic blocks in the CFGs are split at 43 1.1 mrg interprocedural calls, so there can be more than one supernode per 44 1.1 mrg basic block. Most statements will be in just one supernode, but a call 45 1.1 mrg statement can appear in two supernodes: at the end of one for the call, 46 1.1 mrg and again at the start of another for the return. 47 1.1 mrg 48 1.1 mrg The supergraph can be seen using @option{-fdump-analyzer-supergraph}. 49 1.1 mrg 50 1.1 mrg We then build an @code{analysis_plan} which walks the callgraph to 51 1.1 mrg determine which calls might be suitable for being summarized (rather 52 1.1 mrg than fully explored) and thus in what order to explore the functions. 53 1.1 mrg 54 1.1 mrg Next is the heart of the analyzer: we use a worklist to explore state 55 1.1 mrg within the supergraph, building an "exploded graph". 56 1.1 mrg Nodes in the exploded graph correspond to <point,@w{ }state> pairs, as in 57 1.1 mrg "Precise Interprocedural Dataflow Analysis via Graph Reachability" 58 1.1 mrg (Thomas Reps, Susan Horwitz and Mooly Sagiv). 59 1.1 mrg 60 1.1 mrg We reuse nodes for <point, state> pairs we've already seen, and avoid 61 1.1 mrg tracking state too closely, so that (hopefully) we rapidly converge 62 1.1 mrg on a final exploded graph, and terminate the analysis. We also bail 63 1.1 mrg out if the number of exploded <end-of-basic-block, state> nodes gets 64 1.1 mrg larger than a particular multiple of the total number of basic blocks 65 1.1 mrg (to ensure termination in the face of pathological state-explosion 66 1.1 mrg cases, or bugs). We also stop exploring a point once we hit a limit 67 1.1 mrg of states for that point. 68 1.1 mrg 69 1.1 mrg We can identify problems directly when processing a <point,@w{ }state> 70 1.1 mrg instance. For example, if we're finding the successors of 71 1.1 mrg 72 1.1 mrg @smallexample 73 1.1 mrg <point: before-stmt: "free (ptr);", 74 1.1 mrg state: @{"ptr": freed@}> 75 1.1 mrg @end smallexample 76 1.1 mrg 77 1.1 mrg then we can detect a double-free of "ptr". We can then emit a path 78 1.1 mrg to reach the problem by finding the simplest route through the graph. 79 1.1 mrg 80 1.1 mrg Program points in the analysis are much more fine-grained than in the 81 1.1 mrg CFG and supergraph, with points (and thus potentially exploded nodes) 82 1.1 mrg for various events, including before individual statements. 83 1.1 mrg By default the exploded graph merges multiple consecutive statements 84 1.1 mrg in a supernode into one exploded edge to minimize the size of the 85 1.1 mrg exploded graph. This can be suppressed via 86 1.1 mrg @option{-fanalyzer-fine-grained}. 87 1.1 mrg The fine-grained approach seems to make things simpler and more debuggable 88 1.1 mrg that other approaches I tried, in that each point is responsible for one 89 1.1 mrg thing. 90 1.1 mrg 91 1.1 mrg Program points in the analysis also have a "call string" identifying the 92 1.1 mrg stack of callsites below them, so that paths in the exploded graph 93 1.1 mrg correspond to interprocedurally valid paths: we always return to the 94 1.1 mrg correct call site, propagating state information accordingly. 95 1.1 mrg We avoid infinite recursion by stopping the analysis if a callsite 96 1.1 mrg appears more than @code{analyzer-max-recursion-depth} in a callstring 97 1.1 mrg (defaulting to 2). 98 1.1 mrg 99 1.1 mrg @subsection Graphs 100 1.1 mrg 101 1.1 mrg Nodes and edges in the exploded graph are called ``exploded nodes'' and 102 1.1 mrg ``exploded edges'' and often referred to in the code as 103 1.1 mrg @code{enodes} and @code{eedges} (especially when distinguishing them 104 1.1 mrg from the @code{snodes} and @code{sedges} in the supergraph). 105 1.1 mrg 106 1.1 mrg Each graph numbers its nodes, giving unique identifiers - supernodes 107 1.1 mrg are referred to throughout dumps in the form @samp{SN': @var{index}} and 108 1.1 mrg exploded nodes in the form @samp{EN: @var{index}} (e.g. @samp{SN: 2} and 109 1.1 mrg @samp{EN:29}). 110 1.1 mrg 111 1.1 mrg The supergraph can be seen using @option{-fdump-analyzer-supergraph-graph}. 112 1.1 mrg 113 1.1 mrg The exploded graph can be seen using @option{-fdump-analyzer-exploded-graph} 114 1.1 mrg and other dump options. Exploded nodes are color-coded in the .dot output 115 1.1 mrg based on state-machine states to make it easier to see state changes at 116 1.1 mrg a glance. 117 1.1 mrg 118 1.1 mrg @subsection State Tracking 119 1.1 mrg 120 1.1 mrg There's a tension between: 121 1.1 mrg @itemize @bullet 122 1.1 mrg @item 123 1.1 mrg precision of analysis in the straight-line case, vs 124 1.1 mrg @item 125 1.1 mrg exponential blow-up in the face of control flow. 126 1.1 mrg @end itemize 127 1.1 mrg 128 1.1 mrg For example, in general, given this CFG: 129 1.1 mrg 130 1.1 mrg @smallexample 131 1.1 mrg A 132 1.1 mrg / \ 133 1.1 mrg B C 134 1.1 mrg \ / 135 1.1 mrg D 136 1.1 mrg / \ 137 1.1 mrg E F 138 1.1 mrg \ / 139 1.1 mrg G 140 1.1 mrg @end smallexample 141 1.1 mrg 142 1.1 mrg we want to avoid differences in state-tracking in B and C from 143 1.1 mrg leading to blow-up. If we don't prevent state blowup, we end up 144 1.1 mrg with exponential growth of the exploded graph like this: 145 1.1 mrg 146 1.1 mrg @smallexample 147 1.1 mrg 148 1.1 mrg 1:A 149 1.1 mrg / \ 150 1.1 mrg / \ 151 1.1 mrg / \ 152 1.1 mrg 2:B 3:C 153 1.1 mrg | | 154 1.1 mrg 4:D 5:D (2 exploded nodes for D) 155 1.1 mrg / \ / \ 156 1.1 mrg 6:E 7:F 8:E 9:F 157 1.1 mrg | | | | 158 1.1 mrg 10:G 11:G 12:G 13:G (4 exploded nodes for G) 159 1.1 mrg 160 1.1 mrg @end smallexample 161 1.1 mrg 162 1.1 mrg Similar issues arise with loops. 163 1.1 mrg 164 1.1 mrg To prevent this, we follow various approaches: 165 1.1 mrg 166 1.1 mrg @enumerate a 167 1.1 mrg @item 168 1.1 mrg state pruning: which tries to discard state that won't be relevant 169 1.1 mrg later on withing the function. 170 1.1 mrg This can be disabled via @option{-fno-analyzer-state-purge}. 171 1.1 mrg 172 1.1 mrg @item 173 1.1 mrg state merging. We can try to find the commonality between two 174 1.1 mrg program_state instances to make a third, simpler program_state. 175 1.1 mrg We have two strategies here: 176 1.1 mrg 177 1.1 mrg @enumerate 178 1.1 mrg @item 179 1.1 mrg the worklist keeps new nodes for the same program_point together, 180 1.1 mrg and tries to merge them before processing, and thus before they have 181 1.1 mrg successors. Hence, in the above, the two nodes for D (4 and 5) reach 182 1.1 mrg the front of the worklist together, and we create a node for D with 183 1.1 mrg the merger of the incoming states. 184 1.1 mrg 185 1.1 mrg @item 186 1.1 mrg try merging with the state of existing enodes for the program_point 187 1.1 mrg (which may have already been explored). There will be duplication, 188 1.1 mrg but only one set of duplication; subsequent duplicates are more likely 189 1.1 mrg to hit the cache. In particular, (hopefully) all merger chains are 190 1.1 mrg finite, and so we guarantee termination. 191 1.1 mrg This is intended to help with loops: we ought to explore the first 192 1.1 mrg iteration, and then have a "subsequent iterations" exploration, 193 1.1 mrg which uses a state merged from that of the first, to be more abstract. 194 1.1 mrg @end enumerate 195 1.1 mrg 196 1.1 mrg We avoid merging pairs of states that have state-machine differences, 197 1.1 mrg as these are the kinds of differences that are likely to be most 198 1.1 mrg interesting. So, for example, given: 199 1.1 mrg 200 1.1 mrg @smallexample 201 1.1 mrg if (condition) 202 1.1 mrg ptr = malloc (size); 203 1.1 mrg else 204 1.1 mrg ptr = local_buf; 205 1.1 mrg 206 1.1 mrg .... do things with 'ptr' 207 1.1 mrg 208 1.1 mrg if (condition) 209 1.1 mrg free (ptr); 210 1.1 mrg 211 1.1 mrg ...etc 212 1.1 mrg @end smallexample 213 1.1 mrg 214 1.1 mrg then we end up with an exploded graph that looks like this: 215 1.1 mrg 216 1.1 mrg @smallexample 217 1.1 mrg 218 1.1 mrg if (condition) 219 1.1 mrg / T \ F 220 1.1 mrg --------- ---------- 221 1.1 mrg / \ 222 1.1 mrg ptr = malloc (size) ptr = local_buf 223 1.1 mrg | | 224 1.1 mrg copy of copy of 225 1.1 mrg "do things with 'ptr'" "do things with 'ptr'" 226 1.1 mrg with ptr: heap-allocated with ptr: stack-allocated 227 1.1 mrg | | 228 1.1 mrg if (condition) if (condition) 229 1.1 mrg | known to be T | known to be F 230 1.1 mrg free (ptr); | 231 1.1 mrg \ / 232 1.1 mrg ----------------------------- 233 1.1 mrg | ('ptr' is pruned, so states can be merged) 234 1.1 mrg etc 235 1.1 mrg 236 1.1 mrg @end smallexample 237 1.1 mrg 238 1.1 mrg where some duplication has occurred, but only for the places where the 239 1.1 mrg the different paths are worth exploringly separately. 240 1.1 mrg 241 1.1 mrg Merging can be disabled via @option{-fno-analyzer-state-merge}. 242 1.1 mrg @end enumerate 243 1.1 mrg 244 1.1 mrg @subsection Region Model 245 1.1 mrg 246 1.1 mrg Part of the state stored at a @code{exploded_node} is a @code{region_model}. 247 1.1 mrg This is an implementation of the region-based ternary model described in 248 1.1.1.2 mrg @url{https://www.researchgate.net/publication/221430855_A_Memory_Model_for_Static_Analysis_of_C_Programs, 249 1.1 mrg "A Memory Model for Static Analysis of C Programs"} 250 1.1 mrg (Zhongxing Xu, Ted Kremenek, and Jian Zhang). 251 1.1 mrg 252 1.1 mrg A @code{region_model} encapsulates a representation of the state of 253 1.1.1.2 mrg memory, with a @code{store} recording a binding between @code{region} 254 1.1.1.2 mrg instances, to @code{svalue} instances. The bindings are organized into 255 1.1.1.2 mrg clusters, where regions accessible via well-defined pointer arithmetic 256 1.1.1.2 mrg are in the same cluster. The representation is graph-like because values 257 1.1.1.2 mrg can be pointers to regions. It also stores a constraint_manager, 258 1.1.1.2 mrg capturing relationships between the values. 259 1.1 mrg 260 1.1 mrg Because each node in the @code{exploded_graph} has a @code{region_model}, 261 1.1 mrg and each of the latter is graph-like, the @code{exploded_graph} is in some 262 1.1 mrg ways a graph of graphs. 263 1.1 mrg 264 1.1.1.2 mrg Here's an example of printing a @code{program_state}, showing the 265 1.1.1.2 mrg @code{region_model} within it, along with state for the @code{malloc} 266 1.1.1.2 mrg state machine. 267 1.1 mrg 268 1.1 mrg @smallexample 269 1.1 mrg (gdb) call debug (*this) 270 1.1.1.2 mrg rmodel: 271 1.1.1.2 mrg stack depth: 1 272 1.1.1.2 mrg frame (index 0): frame: test@@1 273 1.1.1.2 mrg clusters within frame: test@@1 274 1.1.1.2 mrg cluster for: ptr_3: &HEAP_ALLOCATED_REGION(12) 275 1.1.1.2 mrg m_called_unknown_fn: FALSE 276 1.1.1.2 mrg constraint_manager: 277 1.1 mrg equiv classes: 278 1.1 mrg constraints: 279 1.1.1.2 mrg malloc: 280 1.1.1.2 mrg 0x2e89590: &HEAP_ALLOCATED_REGION(12): unchecked ('ptr_3') 281 1.1 mrg @end smallexample 282 1.1 mrg 283 1.1 mrg This is the state at the point of returning from @code{calls_malloc} back 284 1.1 mrg to @code{test} in the following: 285 1.1 mrg 286 1.1 mrg @smallexample 287 1.1 mrg void * 288 1.1 mrg calls_malloc (void) 289 1.1 mrg @{ 290 1.1 mrg void *result = malloc (1024); 291 1.1 mrg return result; 292 1.1 mrg @} 293 1.1 mrg 294 1.1 mrg void test (void) 295 1.1 mrg @{ 296 1.1 mrg void *ptr = calls_malloc (); 297 1.1 mrg /* etc. */ 298 1.1 mrg @} 299 1.1 mrg @end smallexample 300 1.1 mrg 301 1.1.1.2 mrg Within the store, there is the cluster for @code{ptr_3} within the frame 302 1.1.1.2 mrg for @code{test}, where the whole cluster is bound to a pointer value, 303 1.1.1.2 mrg pointing at @code{HEAP_ALLOCATED_REGION(12)}. Additionally, this pointer 304 1.1.1.2 mrg has the @code{unchecked} state for the @code{malloc} state machine 305 1.1.1.2 mrg indicating it hasn't yet been checked against NULL since the allocation 306 1.1.1.2 mrg call. 307 1.1 mrg 308 1.1 mrg @subsection Analyzer Paths 309 1.1 mrg 310 1.1 mrg We need to explain to the user what the problem is, and to persuade them 311 1.1 mrg that there really is a problem. Hence having a @code{diagnostic_path} 312 1.1 mrg isn't just an incidental detail of the analyzer; it's required. 313 1.1 mrg 314 1.1 mrg Paths ought to be: 315 1.1 mrg @itemize @bullet 316 1.1 mrg @item 317 1.1 mrg interprocedurally-valid 318 1.1 mrg @item 319 1.1 mrg feasible 320 1.1 mrg @end itemize 321 1.1 mrg 322 1.1 mrg Without state-merging, all paths in the exploded graph are feasible 323 1.1.1.2 mrg (in terms of constraints being satisfied). 324 1.1 mrg With state-merging, paths in the exploded graph can be infeasible. 325 1.1 mrg 326 1.1 mrg We collate warnings and only emit them for the simplest path 327 1.1 mrg e.g. for a bug in a utility function, with lots of routes to calling it, 328 1.1 mrg we only emit the simplest path (which could be intraprocedural, if 329 1.1.1.2 mrg it can be reproduced without a caller). 330 1.1.1.2 mrg 331 1.1.1.2 mrg We thus want to find the shortest feasible path through the exploded 332 1.1.1.2 mrg graph from the origin to the exploded node at which the diagnostic was 333 1.1.1.2 mrg saved. Unfortunately, if we simply find the shortest such path and 334 1.1.1.2 mrg check if it's feasible we might falsely reject the diagnostic, as there 335 1.1.1.2 mrg might be a longer path that is feasible. Examples include the cases 336 1.1.1.2 mrg where the diagnostic requires us to go at least once around a loop for a 337 1.1.1.2 mrg later condition to be satisfied, or where for a later condition to be 338 1.1.1.2 mrg satisfied we need to enter a suite of code that the simpler path skips. 339 1.1.1.2 mrg 340 1.1.1.2 mrg We attempt to find the shortest feasible path to each diagnostic by 341 1.1.1.2 mrg first constructing a ``trimmed graph'' from the exploded graph, 342 1.1.1.2 mrg containing only those nodes and edges from which there are paths to 343 1.1.1.2 mrg the target node, and using Dijkstra's algorithm to order the trimmed 344 1.1.1.2 mrg nodes by minimal distance to the target. 345 1.1.1.2 mrg 346 1.1.1.2 mrg We then use a worklist to iteratively build a ``feasible graph'' 347 1.1.1.2 mrg (actually a tree), capturing the pertinent state along each path, in 348 1.1.1.2 mrg which every path to a ``feasible node'' is feasible by construction, 349 1.1.1.2 mrg restricting ourselves to the trimmed graph to ensure we stay on target, 350 1.1.1.2 mrg and ordering the worklist so that the first feasible path we find to the 351 1.1.1.2 mrg target node is the shortest possible path. Hence we start by trying the 352 1.1.1.2 mrg shortest possible path, but if that fails, we explore progressively 353 1.1.1.2 mrg longer paths, eventually trying iterations through loops. The 354 1.1.1.2 mrg exploration is captured in the feasible_graph, which can be dumped as a 355 1.1.1.2 mrg .dot file via @option{-fdump-analyzer-feasibility} to visualize the 356 1.1.1.2 mrg exploration. The indices of the feasible nodes show the order in which 357 1.1.1.2 mrg they were created. We effectively explore the tree of feasible paths in 358 1.1.1.2 mrg order of shortest path until we either find a feasible path to the 359 1.1.1.2 mrg target node, or hit a limit and give up. 360 1.1.1.2 mrg 361 1.1.1.2 mrg This is something of a brute-force approach, but the trimmed graph 362 1.1.1.2 mrg hopefully keeps the complexity manageable. 363 1.1.1.2 mrg 364 1.1.1.2 mrg This algorithm can be disabled (for debugging purposes) via 365 1.1.1.2 mrg @option{-fno-analyzer-feasibility}, which simply uses the shortest path, 366 1.1.1.2 mrg and notes if it is infeasible. 367 1.1.1.2 mrg 368 1.1.1.2 mrg The above gives us a shortest feasible @code{exploded_path} through the 369 1.1.1.2 mrg @code{exploded_graph} (a list of @code{exploded_edge *}). We use this 370 1.1.1.2 mrg @code{exploded_path} to build a @code{diagnostic_path} (a list of 371 1.1.1.2 mrg @strong{events} for the diagnostic subsystem) - specifically a 372 1.1.1.2 mrg @code{checker_path}. 373 1.1 mrg 374 1.1 mrg Having built the @code{checker_path}, we prune it to try to eliminate 375 1.1 mrg events that aren't relevant, to minimize how much the user has to read. 376 1.1 mrg 377 1.1 mrg After pruning, we notify each event in the path of its ID and record the 378 1.1 mrg IDs of interesting events, allowing for events to refer to other events 379 1.1 mrg in their descriptions. The @code{pending_diagnostic} class has various 380 1.1 mrg vfuncs to support emitting more precise descriptions, so that e.g. 381 1.1 mrg 382 1.1 mrg @itemize @bullet 383 1.1 mrg @item 384 1.1 mrg a deref-of-unchecked-malloc diagnostic might use: 385 1.1 mrg @smallexample 386 1.1 mrg returning possibly-NULL pointer to 'make_obj' from 'allocator' 387 1.1 mrg @end smallexample 388 1.1 mrg for a @code{return_event} to make it clearer how the unchecked value moves 389 1.1 mrg from callee back to caller 390 1.1 mrg @item 391 1.1 mrg a double-free diagnostic might use: 392 1.1 mrg @smallexample 393 1.1 mrg second 'free' here; first 'free' was at (3) 394 1.1 mrg @end smallexample 395 1.1 mrg and a use-after-free might use 396 1.1 mrg @smallexample 397 1.1 mrg use after 'free' here; memory was freed at (2) 398 1.1 mrg @end smallexample 399 1.1 mrg @end itemize 400 1.1 mrg 401 1.1 mrg At this point we can emit the diagnostic. 402 1.1 mrg 403 1.1 mrg @subsection Limitations 404 1.1 mrg 405 1.1 mrg @itemize @bullet 406 1.1 mrg @item 407 1.1 mrg Only for C so far 408 1.1 mrg @item 409 1.1 mrg The implementation of call summaries is currently very simplistic. 410 1.1 mrg @item 411 1.1 mrg Lack of function pointer analysis 412 1.1 mrg @item 413 1.1 mrg The constraint-handling code assumes reflexivity in some places 414 1.1 mrg (that values are equal to themselves), which is not the case for NaN. 415 1.1 mrg As a simple workaround, constraints on floating-point values are 416 1.1 mrg currently ignored. 417 1.1 mrg @item 418 1.1 mrg There are various other limitations in the region model (grep for TODO/xfail 419 1.1 mrg in the testsuite). 420 1.1 mrg @item 421 1.1 mrg The constraint_manager's implementation of transitivity is currently too 422 1.1 mrg expensive to enable by default and so must be manually enabled via 423 1.1 mrg @option{-fanalyzer-transitivity}). 424 1.1 mrg @item 425 1.1 mrg The checkers are currently hardcoded and don't allow for user extensibility 426 1.1 mrg (e.g. adding allocate/release pairs). 427 1.1 mrg @item 428 1.1 mrg Although the analyzer's test suite has a proof-of-concept test case for 429 1.1 mrg LTO, LTO support hasn't had extensive testing. There are various 430 1.1 mrg lang-specific things in the analyzer that assume C rather than LTO. 431 1.1 mrg For example, SSA names are printed to the user in ``raw'' form, rather 432 1.1 mrg than printing the underlying variable name. 433 1.1 mrg @end itemize 434 1.1 mrg 435 1.1 mrg @node Debugging the Analyzer 436 1.1 mrg @section Debugging the Analyzer 437 1.1 mrg @cindex analyzer, debugging 438 1.1 mrg @cindex static analyzer, debugging 439 1.1 mrg 440 1.1 mrg @subsection Special Functions for Debugging the Analyzer 441 1.1 mrg 442 1.1 mrg The analyzer recognizes various special functions by name, for use 443 1.1 mrg in debugging the analyzer. Declarations can be seen in the testsuite 444 1.1 mrg in @file{analyzer-decls.h}. None of these functions are actually 445 1.1 mrg implemented. 446 1.1 mrg 447 1.1 mrg Add: 448 1.1 mrg @smallexample 449 1.1 mrg __analyzer_break (); 450 1.1 mrg @end smallexample 451 1.1 mrg to the source being analyzed to trigger a breakpoint in the analyzer when 452 1.1 mrg that source is reached. By putting a series of these in the source, it's 453 1.1 mrg much easier to effectively step through the program state as it's analyzed. 454 1.1 mrg 455 1.1.1.2 mrg The analyzer handles: 456 1.1.1.2 mrg 457 1.1.1.2 mrg @smallexample 458 1.1.1.2 mrg __analyzer_describe (0, expr); 459 1.1.1.2 mrg @end smallexample 460 1.1.1.2 mrg 461 1.1.1.2 mrg by emitting a warning describing the 2nd argument (which can be of any 462 1.1.1.2 mrg type), at a verbosity level given by the 1st argument. This is for use when 463 1.1.1.2 mrg debugging, and may be of use in DejaGnu tests. 464 1.1.1.2 mrg 465 1.1 mrg @smallexample 466 1.1 mrg __analyzer_dump (); 467 1.1 mrg @end smallexample 468 1.1 mrg 469 1.1 mrg will dump the copious information about the analyzer's state each time it 470 1.1 mrg reaches the call in its traversal of the source. 471 1.1 mrg 472 1.1 mrg @smallexample 473 1.1.1.2 mrg extern void __analyzer_dump_capacity (const void *ptr); 474 1.1.1.2 mrg @end smallexample 475 1.1.1.2 mrg 476 1.1.1.2 mrg will emit a warning describing the capacity of the base region of 477 1.1.1.2 mrg the region pointed to by the 1st argument. 478 1.1.1.2 mrg 479 1.1.1.2 mrg @smallexample 480 1.1.1.2 mrg extern void __analyzer_dump_escaped (void); 481 1.1.1.2 mrg @end smallexample 482 1.1.1.2 mrg 483 1.1.1.2 mrg will emit a warning giving the number of decls that have escaped on this 484 1.1.1.2 mrg analysis path, followed by a comma-separated list of their names, 485 1.1.1.2 mrg in alphabetical order. 486 1.1.1.2 mrg 487 1.1.1.2 mrg @smallexample 488 1.1 mrg __analyzer_dump_path (); 489 1.1 mrg @end smallexample 490 1.1 mrg 491 1.1 mrg will emit a placeholder ``note'' diagnostic with a path to that call site, 492 1.1 mrg if the analyzer finds a feasible path to it. 493 1.1 mrg 494 1.1 mrg The builtin @code{__analyzer_dump_exploded_nodes} will emit a warning 495 1.1 mrg after analysis containing information on all of the exploded nodes at that 496 1.1 mrg program point: 497 1.1 mrg 498 1.1 mrg @smallexample 499 1.1 mrg __analyzer_dump_exploded_nodes (0); 500 1.1 mrg @end smallexample 501 1.1 mrg 502 1.1 mrg will output the number of ``processed'' nodes, and the IDs of 503 1.1 mrg both ``processed'' and ``merger'' nodes, such as: 504 1.1 mrg 505 1.1 mrg @smallexample 506 1.1 mrg warning: 2 processed enodes: [EN: 56, EN: 58] merger(s): [EN: 54-55, EN: 57, EN: 59] 507 1.1 mrg @end smallexample 508 1.1 mrg 509 1.1 mrg With a non-zero argument 510 1.1 mrg 511 1.1 mrg @smallexample 512 1.1 mrg __analyzer_dump_exploded_nodes (1); 513 1.1 mrg @end smallexample 514 1.1 mrg 515 1.1 mrg it will also dump all of the states within the ``processed'' nodes. 516 1.1 mrg 517 1.1 mrg @smallexample 518 1.1 mrg __analyzer_dump_region_model (); 519 1.1 mrg @end smallexample 520 1.1 mrg will dump the region_model's state to stderr. 521 1.1 mrg 522 1.1 mrg @smallexample 523 1.1.1.2 mrg __analyzer_dump_state ("malloc", ptr); 524 1.1.1.2 mrg @end smallexample 525 1.1.1.2 mrg 526 1.1.1.2 mrg will emit a warning describing the state of the 2nd argument 527 1.1.1.2 mrg (which can be of any type) with respect to the state machine with 528 1.1.1.2 mrg a name matching the 1st argument (which must be a string literal). 529 1.1.1.2 mrg This is for use when debugging, and may be of use in DejaGnu tests. 530 1.1.1.2 mrg 531 1.1.1.2 mrg @smallexample 532 1.1 mrg __analyzer_eval (expr); 533 1.1 mrg @end smallexample 534 1.1 mrg will emit a warning with text "TRUE", FALSE" or "UNKNOWN" based on the 535 1.1 mrg truthfulness of the argument. This is useful for writing DejaGnu tests. 536 1.1 mrg 537 1.1 mrg 538 1.1 mrg @subsection Other Debugging Techniques 539 1.1 mrg 540 1.1.1.2 mrg The option @option{-fdump-analyzer-json} will dump both the supergraph 541 1.1.1.2 mrg and the exploded graph in compressed JSON form. 542 1.1.1.2 mrg 543 1.1 mrg One approach when tracking down where a particular bogus state is 544 1.1 mrg introduced into the @code{exploded_graph} is to add custom code to 545 1.1.1.2 mrg @code{program_state::validate}. 546 1.1 mrg 547 1.1.1.2 mrg The debug function @code{region::is_named_decl_p} can be used when debugging, 548 1.1.1.2 mrg such as for assertions and conditional breakpoints. For example, when 549 1.1.1.2 mrg tracking down a bug in handling a decl called @code{yy_buffer_stack}, I 550 1.1.1.2 mrg temporarily added a: 551 1.1 mrg @smallexample 552 1.1.1.2 mrg gcc_assert (!m_base_region->is_named_decl_p ("yy_buffer_stack")); 553 1.1 mrg @end smallexample 554 1.1.1.2 mrg to @code{binding_cluster::mark_as_escaped} to trap a point where 555 1.1.1.2 mrg @code{yy_buffer_stack} was mistakenly being treated as having escaped. 556