1 @c Copyright (C) 2006-2022 Free Software Foundation, Inc. 2 @c Free Software Foundation, Inc. 3 @c This is part of the GCC manual. 4 @c For copying conditions, see the file gcc.texi. 5 6 @c --------------------------------------------------------------------- 7 @c Loop Representation 8 @c --------------------------------------------------------------------- 9 10 @node Loop Analysis and Representation 11 @chapter Analysis and Representation of Loops 12 13 GCC provides extensive infrastructure for work with natural loops, i.e., 14 strongly connected components of CFG with only one entry block. This 15 chapter describes representation of loops in GCC, both on GIMPLE and in 16 RTL, as well as the interfaces to loop-related analyses (induction 17 variable analysis and number of iterations analysis). 18 19 @menu 20 * Loop representation:: Representation and analysis of loops. 21 * Loop querying:: Getting information about loops. 22 * Loop manipulation:: Loop manipulation functions. 23 * LCSSA:: Loop-closed SSA form. 24 * Scalar evolutions:: Induction variables on GIMPLE. 25 * loop-iv:: Induction variables on RTL. 26 * Number of iterations:: Number of iterations analysis. 27 * Dependency analysis:: Data dependency analysis. 28 @end menu 29 30 @node Loop representation 31 @section Loop representation 32 @cindex Loop representation 33 @cindex Loop analysis 34 35 This chapter describes the representation of loops in GCC, and functions 36 that can be used to build, modify and analyze this representation. Most 37 of the interfaces and data structures are declared in @file{cfgloop.h}. 38 Loop structures are analyzed and this information disposed or updated 39 at the discretion of individual passes. Still most of the generic 40 CFG manipulation routines are aware of loop structures and try to 41 keep them up-to-date. By this means an increasing part of the 42 compilation pipeline is setup to maintain loop structure across 43 passes to allow attaching meta information to individual loops 44 for consumption by later passes. 45 46 In general, a natural loop has one entry block (header) and possibly 47 several back edges (latches) leading to the header from the inside of 48 the loop. Loops with several latches may appear if several loops share 49 a single header, or if there is a branching in the middle of the loop. 50 The representation of loops in GCC however allows only loops with a 51 single latch. During loop analysis, headers of such loops are split and 52 forwarder blocks are created in order to disambiguate their structures. 53 Heuristic based on profile information and structure of the induction 54 variables in the loops is used to determine whether the latches 55 correspond to sub-loops or to control flow in a single loop. This means 56 that the analysis sometimes changes the CFG, and if you run it in the 57 middle of an optimization pass, you must be able to deal with the new 58 blocks. You may avoid CFG changes by passing 59 @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery, 60 note however that most other loop manipulation functions will not work 61 correctly for loops with multiple latch edges (the functions that only 62 query membership of blocks to loops and subloop relationships, or 63 enumerate and test loop exits, can be expected to work). 64 65 Body of the loop is the set of blocks that are dominated by its header, 66 and reachable from its latch against the direction of edges in CFG@. The 67 loops are organized in a containment hierarchy (tree) such that all the 68 loops immediately contained inside loop L are the children of L in the 69 tree. This tree is represented by the @code{struct loops} structure. 70 The root of this tree is a fake loop that contains all blocks in the 71 function. Each of the loops is represented in a @code{struct loop} 72 structure. Each loop is assigned an index (@code{num} field of the 73 @code{struct loop} structure), and the pointer to the loop is stored in 74 the corresponding field of the @code{larray} vector in the loops 75 structure. The indices do not have to be continuous, there may be 76 empty (@code{NULL}) entries in the @code{larray} created by deleting 77 loops. Also, there is no guarantee on the relative order of a loop 78 and its subloops in the numbering. The index of a loop never changes. 79 80 The entries of the @code{larray} field should not be accessed directly. 81 The function @code{get_loop} returns the loop description for a loop with 82 the given index. @code{number_of_loops} function returns number of loops 83 in the function. To traverse all loops, use a range-based for loop with 84 class @code{loops_list} instance. The @code{flags} argument passed to the 85 constructor function of class @code{loops_list} is used to determine the 86 direction of traversal and the set of loops visited. Each loop is 87 guaranteed to be visited exactly once, regardless of the changes to the 88 loop tree, and the loops may be removed during the traversal. The newly 89 created loops are never traversed, if they need to be visited, this must 90 be done separately after their creation. 91 92 Each basic block contains the reference to the innermost loop it belongs 93 to (@code{loop_father}). For this reason, it is only possible to have 94 one @code{struct loops} structure initialized at the same time for each 95 CFG@. The global variable @code{current_loops} contains the 96 @code{struct loops} structure. Many of the loop manipulation functions 97 assume that dominance information is up-to-date. 98 99 The loops are analyzed through @code{loop_optimizer_init} function. The 100 argument of this function is a set of flags represented in an integer 101 bitmask. These flags specify what other properties of the loop 102 structures should be calculated/enforced and preserved later: 103 104 @itemize 105 @item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no 106 changes to CFG will be performed in the loop analysis, in particular, 107 loops with multiple latch edges will not be disambiguated. If a loop 108 has multiple latches, its latch block is set to NULL@. Most of 109 the loop manipulation functions will not work for loops in this shape. 110 No other flags that require CFG changes can be passed to 111 loop_optimizer_init. 112 @item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such 113 a way that each loop has only one entry edge, and additionally, the 114 source block of this entry edge has only one successor. This creates a 115 natural place where the code can be moved out of the loop, and ensures 116 that the entry edge of the loop leads from its immediate super-loop. 117 @item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to 118 force the latch block of each loop to have only one successor. This 119 ensures that the latch of the loop does not belong to any of its 120 sub-loops, and makes manipulation with the loops significantly easier. 121 Most of the loop manipulation functions assume that the loops are in 122 this shape. Note that with this flag, the ``normal'' loop without any 123 control flow inside and with one exit consists of two basic blocks. 124 @item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and 125 edges in the strongly connected components that are not natural loops 126 (have more than one entry block) are marked with 127 @code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags. The 128 flag is not set for blocks and edges that belong to natural loops that 129 are in such an irreducible region (but it is set for the entry and exit 130 edges of such a loop, if they lead to/from this region). 131 @item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded 132 and updated for each loop. This makes some functions (e.g., 133 @code{get_loop_exit_edges}) more efficient. Some functions (e.g., 134 @code{single_exit}) can be used only if the lists of exits are 135 recorded. 136 @end itemize 137 138 These properties may also be computed/enforced later, using functions 139 @code{create_preheaders}, @code{force_single_succ_latches}, 140 @code{mark_irreducible_loops} and @code{record_loop_exits}. 141 The properties can be queried using @code{loops_state_satisfies_p}. 142 143 The memory occupied by the loops structures should be freed with 144 @code{loop_optimizer_finalize} function. When loop structures are 145 setup to be preserved across passes this function reduces the 146 information to be kept up-to-date to a minimum (only 147 @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} set). 148 149 The CFG manipulation functions in general do not update loop structures. 150 Specialized versions that additionally do so are provided for the most 151 common tasks. On GIMPLE, @code{cleanup_tree_cfg_loop} function can be 152 used to cleanup CFG while updating the loops structures if 153 @code{current_loops} is set. 154 155 At the moment loop structure is preserved from the start of GIMPLE 156 loop optimizations until the end of RTL loop optimizations. During 157 this time a loop can be tracked by its @code{struct loop} and number. 158 159 @node Loop querying 160 @section Loop querying 161 @cindex Loop querying 162 163 The functions to query the information about loops are declared in 164 @file{cfgloop.h}. Some of the information can be taken directly from 165 the structures. @code{loop_father} field of each basic block contains 166 the innermost loop to that the block belongs. The most useful fields of 167 loop structure (that are kept up-to-date at all times) are: 168 169 @itemize 170 @item @code{header}, @code{latch}: Header and latch basic blocks of the 171 loop. 172 @item @code{num_nodes}: Number of basic blocks in the loop (including 173 the basic blocks of the sub-loops). 174 @item @code{outer}, @code{inner}, @code{next}: The super-loop, the first 175 sub-loop, and the sibling of the loop in the loops tree. 176 @end itemize 177 178 There are other fields in the loop structures, many of them used only by 179 some of the passes, or not updated during CFG changes; in general, they 180 should not be accessed directly. 181 182 The most important functions to query loop structures are: 183 184 @itemize 185 @item @code{loop_depth}: The depth of the loop in the loops tree, i.e., the 186 number of super-loops of the loop. 187 @item @code{flow_loops_dump}: Dumps the information about loops to a 188 file. 189 @item @code{verify_loop_structure}: Checks consistency of the loop 190 structures. 191 @item @code{loop_latch_edge}: Returns the latch edge of a loop. 192 @item @code{loop_preheader_edge}: If loops have preheaders, returns 193 the preheader edge of a loop. 194 @item @code{flow_loop_nested_p}: Tests whether loop is a sub-loop of 195 another loop. 196 @item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs 197 to a loop (including its sub-loops). 198 @item @code{find_common_loop}: Finds the common super-loop of two loops. 199 @item @code{superloop_at_depth}: Returns the super-loop of a loop with 200 the given depth. 201 @item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the 202 number of insns in the loop, on GIMPLE and on RTL. 203 @item @code{loop_exit_edge_p}: Tests whether edge is an exit from a 204 loop. 205 @item @code{mark_loop_exit_edges}: Marks all exit edges of all loops 206 with @code{EDGE_LOOP_EXIT} flag. 207 @item @code{get_loop_body}, @code{get_loop_body_in_dom_order}, 208 @code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the 209 loop in depth-first search order in reversed CFG, ordered by dominance 210 relation, and breath-first search order, respectively. 211 @item @code{single_exit}: Returns the single exit edge of the loop, or 212 @code{NULL} if the loop has more than one exit. You can only use this 213 function if LOOPS_HAVE_MARKED_SINGLE_EXITS property is used. 214 @item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop. 215 @item @code{just_once_each_iteration_p}: Returns true if the basic block 216 is executed exactly once during each iteration of a loop (that is, it 217 does not belong to a sub-loop, and it dominates the latch of the loop). 218 @end itemize 219 220 @node Loop manipulation 221 @section Loop manipulation 222 @cindex Loop manipulation 223 224 The loops tree can be manipulated using the following functions: 225 226 @itemize 227 @item @code{flow_loop_tree_node_add}: Adds a node to the tree. 228 @item @code{flow_loop_tree_node_remove}: Removes a node from the tree. 229 @item @code{add_bb_to_loop}: Adds a basic block to a loop. 230 @item @code{remove_bb_from_loops}: Removes a basic block from loops. 231 @end itemize 232 233 Most low-level CFG functions update loops automatically. The following 234 functions handle some more complicated cases of CFG manipulations: 235 236 @itemize 237 @item @code{remove_path}: Removes an edge and all blocks it dominates. 238 @item @code{split_loop_exit_edge}: Splits exit edge of the loop, 239 ensuring that PHI node arguments remain in the loop (this ensures that 240 loop-closed SSA form is preserved). Only useful on GIMPLE. 241 @end itemize 242 243 Finally, there are some higher-level loop transformations implemented. 244 While some of them are written so that they should work on non-innermost 245 loops, they are mostly untested in that case, and at the moment, they 246 are only reliable for the innermost loops: 247 248 @itemize 249 @item @code{create_iv}: Creates a new induction variable. Only works on 250 GIMPLE@. @code{standard_iv_increment_position} can be used to find a 251 suitable place for the iv increment. 252 @item @code{duplicate_loop_body_to_header_edge}, 253 @code{tree_duplicate_loop_body_to_header_edge}: These functions (on RTL and 254 on GIMPLE) duplicate the body of the loop prescribed number of times on 255 one of the edges entering loop header, thus performing either loop 256 unrolling or loop peeling. @code{can_duplicate_loop_p} 257 (@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated 258 loop. 259 @item @code{loop_version}: This function creates a copy of a loop, and 260 a branch before them that selects one of them depending on the 261 prescribed condition. This is useful for optimizations that need to 262 verify some assumptions in runtime (one of the copies of the loop is 263 usually left unchanged, while the other one is transformed in some way). 264 @item @code{tree_unroll_loop}: Unrolls the loop, including peeling the 265 extra iterations to make the number of iterations divisible by unroll 266 factor, updating the exit condition, and removing the exits that now 267 cannot be taken. Works only on GIMPLE. 268 @end itemize 269 270 @node LCSSA 271 @section Loop-closed SSA form 272 @cindex LCSSA 273 @cindex Loop-closed SSA form 274 275 Throughout the loop optimizations on tree level, one extra condition is 276 enforced on the SSA form: No SSA name is used outside of the loop in 277 that it is defined. The SSA form satisfying this condition is called 278 ``loop-closed SSA form'' -- LCSSA@. To enforce LCSSA, PHI nodes must be 279 created at the exits of the loops for the SSA names that are used 280 outside of them. Only the real operands (not virtual SSA names) are 281 held in LCSSA, in order to save memory. 282 283 There are various benefits of LCSSA: 284 285 @itemize 286 @item Many optimizations (value range analysis, final value 287 replacement) are interested in the values that are defined in the loop 288 and used outside of it, i.e., exactly those for that we create new PHI 289 nodes. 290 @item In induction variable analysis, it is not necessary to specify the 291 loop in that the analysis should be performed -- the scalar evolution 292 analysis always returns the results with respect to the loop in that the 293 SSA name is defined. 294 @item It makes updating of SSA form during loop transformations simpler. 295 Without LCSSA, operations like loop unrolling may force creation of PHI 296 nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be 297 updated locally. However, since we only keep real operands in LCSSA, we 298 cannot use this advantage (we could have local updating of real 299 operands, but it is not much more efficient than to use generic SSA form 300 updating for it as well; the amount of changes to SSA is the same). 301 @end itemize 302 303 However, it also means LCSSA must be updated. This is usually 304 straightforward, unless you create a new value in loop and use it 305 outside, or unless you manipulate loop exit edges (functions are 306 provided to make these manipulations simple). 307 @code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to 308 LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of 309 LCSSA is preserved. 310 311 @node Scalar evolutions 312 @section Scalar evolutions 313 @cindex Scalar evolutions 314 @cindex IV analysis on GIMPLE 315 316 Scalar evolutions (SCEV) are used to represent results of induction 317 variable analysis on GIMPLE@. They enable us to represent variables with 318 complicated behavior in a simple and consistent way (we only use it to 319 express values of polynomial induction variables, but it is possible to 320 extend it). The interfaces to SCEV analysis are declared in 321 @file{tree-scalar-evolution.h}. To use scalar evolutions analysis, 322 @code{scev_initialize} must be used. To stop using SCEV, 323 @code{scev_finalize} should be used. SCEV analysis caches results in 324 order to save time and memory. This cache however is made invalid by 325 most of the loop transformations, including removal of code. If such a 326 transformation is performed, @code{scev_reset} must be called to clean 327 the caches. 328 329 Given an SSA name, its behavior in loops can be analyzed using the 330 @code{analyze_scalar_evolution} function. The returned SCEV however 331 does not have to be fully analyzed and it may contain references to 332 other SSA names defined in the loop. To resolve these (potentially 333 recursive) references, @code{instantiate_parameters} or 334 @code{resolve_mixers} functions must be used. 335 @code{instantiate_parameters} is useful when you use the results of SCEV 336 only for some analysis, and when you work with whole nest of loops at 337 once. It will try replacing all SSA names by their SCEV in all loops, 338 including the super-loops of the current loop, thus providing a complete 339 information about the behavior of the variable in the loop nest. 340 @code{resolve_mixers} is useful if you work with only one loop at a 341 time, and if you possibly need to create code based on the value of the 342 induction variable. It will only resolve the SSA names defined in the 343 current loop, leaving the SSA names defined outside unchanged, even if 344 their evolution in the outer loops is known. 345 346 The SCEV is a normal tree expression, except for the fact that it may 347 contain several special tree nodes. One of them is 348 @code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be 349 expressed. The other one is @code{POLYNOMIAL_CHREC}. Polynomial chrec 350 has three arguments -- base, step and loop (both base and step may 351 contain further polynomial chrecs). Type of the expression and of base 352 and step must be the same. A variable has evolution 353 @code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified 354 loop) equivalent to @code{x_1} in the following example 355 356 @smallexample 357 while (@dots{}) 358 @{ 359 x_1 = phi (base, x_2); 360 x_2 = x_1 + step; 361 @} 362 @end smallexample 363 364 Note that this includes the language restrictions on the operations. 365 For example, if we compile C code and @code{x} has signed type, then the 366 overflow in addition would cause undefined behavior, and we may assume 367 that this does not happen. Hence, the value with this SCEV cannot 368 overflow (which restricts the number of iterations of such a loop). 369 370 In many cases, one wants to restrict the attention just to affine 371 induction variables. In this case, the extra expressive power of SCEV 372 is not useful, and may complicate the optimizations. In this case, 373 @code{simple_iv} function may be used to analyze a value -- the result 374 is a loop-invariant base and step. 375 376 @node loop-iv 377 @section IV analysis on RTL 378 @cindex IV analysis on RTL 379 380 The induction variable on RTL is simple and only allows analysis of 381 affine induction variables, and only in one loop at once. The interface 382 is declared in @file{cfgloop.h}. Before analyzing induction variables 383 in a loop L, @code{iv_analysis_loop_init} function must be called on L. 384 After the analysis (possibly calling @code{iv_analysis_loop_init} for 385 several loops) is finished, @code{iv_analysis_done} should be called. 386 The following functions can be used to access the results of the 387 analysis: 388 389 @itemize 390 @item @code{iv_analyze}: Analyzes a single register used in the given 391 insn. If no use of the register in this insn is found, the following 392 insns are scanned, so that this function can be called on the insn 393 returned by get_condition. 394 @item @code{iv_analyze_result}: Analyzes result of the assignment in the 395 given insn. 396 @item @code{iv_analyze_expr}: Analyzes a more complicated expression. 397 All its operands are analyzed by @code{iv_analyze}, and hence they must 398 be used in the specified insn or one of the following insns. 399 @end itemize 400 401 The description of the induction variable is provided in @code{struct 402 rtx_iv}. In order to handle subregs, the representation is a bit 403 complicated; if the value of the @code{extend} field is not 404 @code{UNKNOWN}, the value of the induction variable in the i-th 405 iteration is 406 407 @smallexample 408 delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)), 409 @end smallexample 410 411 with the following exception: if @code{first_special} is true, then the 412 value in the first iteration (when @code{i} is zero) is @code{delta + 413 mult * base}. However, if @code{extend} is equal to @code{UNKNOWN}, 414 then @code{first_special} must be false, @code{delta} 0, @code{mult} 1 415 and the value in the i-th iteration is 416 417 @smallexample 418 subreg_@{mode@} (base + i * step) 419 @end smallexample 420 421 The function @code{get_iv_value} can be used to perform these 422 calculations. 423 424 @node Number of iterations 425 @section Number of iterations analysis 426 @cindex Number of iterations analysis 427 428 Both on GIMPLE and on RTL, there are functions available to determine 429 the number of iterations of a loop, with a similar interface. The 430 number of iterations of a loop in GCC is defined as the number of 431 executions of the loop latch. In many cases, it is not possible to 432 determine the number of iterations unconditionally -- the determined 433 number is correct only if some assumptions are satisfied. The analysis 434 tries to verify these conditions using the information contained in the 435 program; if it fails, the conditions are returned together with the 436 result. The following information and conditions are provided by the 437 analysis: 438 439 @itemize 440 @item @code{assumptions}: If this condition is false, the rest of 441 the information is invalid. 442 @item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If 443 this condition is true, the loop exits in the first iteration. 444 @item @code{infinite}: If this condition is true, the loop is infinite. 445 This condition is only available on RTL@. On GIMPLE, conditions for 446 finiteness of the loop are included in @code{assumptions}. 447 @item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression 448 that gives number of iterations. The number of iterations is defined as 449 the number of executions of the loop latch. 450 @end itemize 451 452 Both on GIMPLE and on RTL, it necessary for the induction variable 453 analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL). 454 On GIMPLE, the results are stored to @code{struct tree_niter_desc} 455 structure. Number of iterations before the loop is exited through a 456 given exit can be determined using @code{number_of_iterations_exit} 457 function. On RTL, the results are returned in @code{struct niter_desc} 458 structure. The corresponding function is named 459 @code{check_simple_exit}. There are also functions that pass through 460 all the exits of a loop and try to find one with easy to determine 461 number of iterations -- @code{find_loop_niter} on GIMPLE and 462 @code{find_simple_exit} on RTL@. Finally, there are functions that 463 provide the same information, but additionally cache it, so that 464 repeated calls to number of iterations are not so costly -- 465 @code{number_of_latch_executions} on GIMPLE and @code{get_simple_loop_desc} 466 on RTL. 467 468 Note that some of these functions may behave slightly differently than 469 others -- some of them return only the expression for the number of 470 iterations, and fail if there are some assumptions. The function 471 @code{number_of_latch_executions} works only for single-exit loops. 472 The function @code{number_of_cond_exit_executions} can be used to 473 determine number of executions of the exit condition of a single-exit 474 loop (i.e., the @code{number_of_latch_executions} increased by one). 475 476 On GIMPLE, below constraint flags affect semantics of some APIs of number 477 of iterations analyzer: 478 479 @itemize 480 @item @code{LOOP_C_INFINITE}: If this constraint flag is set, the loop 481 is known to be infinite. APIs like @code{number_of_iterations_exit} can 482 return false directly without doing any analysis. 483 @item @code{LOOP_C_FINITE}: If this constraint flag is set, the loop is 484 known to be finite, in other words, loop's number of iterations can be 485 computed with @code{assumptions} be true. 486 @end itemize 487 488 Generally, the constraint flags are set/cleared by consumers which are 489 loop optimizers. It's also the consumers' responsibility to set/clear 490 constraints correctly. Failing to do that might result in hard to track 491 down bugs in scev/niter consumers. One typical use case is vectorizer: 492 it drives number of iterations analyzer by setting @code{LOOP_C_FINITE} 493 and vectorizes possibly infinite loop by versioning loop with analysis 494 result. In return, constraints set by consumers can also help number of 495 iterations analyzer in following optimizers. For example, @code{niter} 496 of a loop versioned under @code{assumptions} is valid unconditionally. 497 498 Other constraints may be added in the future, for example, a constraint 499 indicating that loops' latch must roll thus @code{may_be_zero} would be 500 false unconditionally. 501 502 @node Dependency analysis 503 @section Data Dependency Analysis 504 @cindex Data Dependency Analysis 505 506 The code for the data dependence analysis can be found in 507 @file{tree-data-ref.cc} and its interface and data structures are 508 described in @file{tree-data-ref.h}. The function that computes the 509 data dependences for all the array and pointer references for a given 510 loop is @code{compute_data_dependences_for_loop}. This function is 511 currently used by the linear loop transform and the vectorization 512 passes. Before calling this function, one has to allocate two vectors: 513 a first vector will contain the set of data references that are 514 contained in the analyzed loop body, and the second vector will contain 515 the dependence relations between the data references. Thus if the 516 vector of data references is of size @code{n}, the vector containing the 517 dependence relations will contain @code{n*n} elements. However if the 518 analyzed loop contains side effects, such as calls that potentially can 519 interfere with the data references in the current analyzed loop, the 520 analysis stops while scanning the loop body for data references, and 521 inserts a single @code{chrec_dont_know} in the dependence relation 522 array. 523 524 The data references are discovered in a particular order during the 525 scanning of the loop body: the loop body is analyzed in execution order, 526 and the data references of each statement are pushed at the end of the 527 data reference array. Two data references syntactically occur in the 528 program in the same order as in the array of data references. This 529 syntactic order is important in some classical data dependence tests, 530 and mapping this order to the elements of this array avoids costly 531 queries to the loop body representation. 532 533 Three types of data references are currently handled: ARRAY_REF, 534 INDIRECT_REF and COMPONENT_REF@. The data structure for the data reference 535 is @code{data_reference}, where @code{data_reference_p} is a name of a 536 pointer to the data reference structure. The structure contains the 537 following elements: 538 539 @itemize 540 @item @code{base_object_info}: Provides information about the base object 541 of the data reference and its access functions. These access functions 542 represent the evolution of the data reference in the loop relative to 543 its base, in keeping with the classical meaning of the data reference 544 access function for the support of arrays. For example, for a reference 545 @code{a.b[i][j]}, the base object is @code{a.b} and the access functions, 546 one for each array subscript, are: 547 @code{@{i_init, + i_step@}_1, @{j_init, +, j_step@}_2}. 548 549 @item @code{first_location_in_loop}: Provides information about the first 550 location accessed by the data reference in the loop and about the access 551 function used to represent evolution relative to this location. This data 552 is used to support pointers, and is not used for arrays (for which we 553 have base objects). Pointer accesses are represented as a one-dimensional 554 access that starts from the first location accessed in the loop. For 555 example: 556 557 @smallexample 558 for1 i 559 for2 j 560 *((int *)p + i + j) = a[i][j]; 561 @end smallexample 562 563 The access function of the pointer access is @code{@{0, + 4B@}_for2} 564 relative to @code{p + i}. The access functions of the array are 565 @code{@{i_init, + i_step@}_for1} and @code{@{j_init, +, j_step@}_for2} 566 relative to @code{a}. 567 568 Usually, the object the pointer refers to is either unknown, or we cannot 569 prove that the access is confined to the boundaries of a certain object. 570 571 Two data references can be compared only if at least one of these two 572 representations has all its fields filled for both data references. 573 574 The current strategy for data dependence tests is as follows: 575 If both @code{a} and @code{b} are represented as arrays, compare 576 @code{a.base_object} and @code{b.base_object}; 577 if they are equal, apply dependence tests (use access functions based on 578 base_objects). 579 Else if both @code{a} and @code{b} are represented as pointers, compare 580 @code{a.first_location} and @code{b.first_location}; 581 if they are equal, apply dependence tests (use access functions based on 582 first location). 583 However, if @code{a} and @code{b} are represented differently, only try 584 to prove that the bases are definitely different. 585 586 @item Aliasing information. 587 @item Alignment information. 588 @end itemize 589 590 The structure describing the relation between two data references is 591 @code{data_dependence_relation} and the shorter name for a pointer to 592 such a structure is @code{ddr_p}. This structure contains: 593 594 @itemize 595 @item a pointer to each data reference, 596 @item a tree node @code{are_dependent} that is set to @code{chrec_known} 597 if the analysis has proved that there is no dependence between these two 598 data references, @code{chrec_dont_know} if the analysis was not able to 599 determine any useful result and potentially there could exist a 600 dependence between these data references, and @code{are_dependent} is 601 set to @code{NULL_TREE} if there exist a dependence relation between the 602 data references, and the description of this dependence relation is 603 given in the @code{subscripts}, @code{dir_vects}, and @code{dist_vects} 604 arrays, 605 @item a boolean that determines whether the dependence relation can be 606 represented by a classical distance vector, 607 @item an array @code{subscripts} that contains a description of each 608 subscript of the data references. Given two array accesses a 609 subscript is the tuple composed of the access functions for a given 610 dimension. For example, given @code{A[f1][f2][f3]} and 611 @code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2, 612 g2), (f3, g3)}. 613 @item two arrays @code{dir_vects} and @code{dist_vects} that contain 614 classical representations of the data dependences under the form of 615 direction and distance dependence vectors, 616 @item an array of loops @code{loop_nest} that contains the loops to 617 which the distance and direction vectors refer to. 618 @end itemize 619 620 Several functions for pretty printing the information extracted by the 621 data dependence analysis are available: @code{dump_ddrs} prints with a 622 maximum verbosity the details of a data dependence relations array, 623 @code{dump_dist_dir_vectors} prints only the classical distance and 624 direction vectors for a data dependence relations array, and 625 @code{dump_data_references} prints the details of the data references 626 contained in a data reference array. 627