/* Predictive commoning.
   Copyright (C) 2005-2022 Free Software Foundation, Inc.

   This file is part of GCC.

   GCC is free software; you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by the
   Free Software Foundation; either version 3, or (at your option) any
   later version.

   GCC is distributed in the hope that it will be useful, but WITHOUT
   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
   for more details.

   You should have received a copy of the GNU General Public License
   along with GCC; see the file COPYING3.  If not see
   <http://www.gnu.org/licenses/>.  */

/* This file implements the predictive commoning optimization.  Predictive
   commoning can be viewed as CSE around a loop, and with some improvements,
   as generalized strength reduction -- i.e., reusing values computed in
   earlier iterations of a loop in the later ones.  So far, the pass only
   handles the most useful case, that is, reusing values of memory references.
   If you think this is all just a special case of PRE, you are sort of right;
   however, concentrating on loops is simpler, and makes it possible to
   incorporate data dependence analysis to detect the opportunities, perform
   loop unrolling to avoid copies together with renaming immediately,
   and if needed, we could also take register pressure into account.
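   The core idea -- reusing a memory value loaded in an earlier iteration --
   can be mimicked by hand at source level.  A minimal illustration (not
   output of this pass; the function names are invented) for a simple
   pairwise sum:

```c
#include <assert.h>

// Naive form: each iteration loads both a[i] and a[i + 1] from memory.
static int
sum_pairs_naive (const int *a, int n)
{
  int s = 0;
  for (int i = 0; i + 1 < n; i++)
    s += a[i] + a[i + 1];
  return s;
}

// Predictively commoned form: the a[i + 1] of iteration i is the a[i]
// of iteration i + 1, so the value is kept in a scalar and each array
// element is loaded only once.
static int
sum_pairs_pcom (const int *a, int n)
{
  if (n < 2)
    return 0;
  int s = 0;
  int prev = a[0];              // value carried across iterations
  for (int i = 0; i + 1 < n; i++)
    {
      int cur = a[i + 1];
      s += prev + cur;
      prev = cur;               // the latch copy that unrolling removes
    }
  return s;
}
```

   Both functions compute the same result; the second trades a memory load
   per iteration for a scalar copy on the loop latch, which the unrolling
   described below then eliminates.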

   Let us demonstrate what is done on an example:

   for (i = 0; i < 100; i++)
     {
       a[i+2] = a[i] + a[i+1];
       b[10] = b[10] + i;
       c[i] = c[99 - i];
       d[i] = d[i + 1];
     }

   1) We find data references in the loop, and split them to mutually
      independent groups (i.e., we find components of a data dependence
      graph).  We ignore read-read dependences whose distance is not constant.
      (TODO -- we could also ignore antidependences).  In this example, we
      find the following groups:

      a[i]{read}, a[i+1]{read}, a[i+2]{write}
      b[10]{read}, b[10]{write}
      c[99 - i]{read}, c[i]{write}
      d[i + 1]{read}, d[i]{write}

   2) Inside each of the groups, we verify several conditions:
      a) all the references must differ in indices only, and the indices
	 must all have the same step
      b) the references must dominate loop latch (and thus, they must be
	 ordered by dominance relation).
      c) the distance of the indices must be a small multiple of the step
      We are then able to compute the difference of the references (# of
      iterations before they point to the same place as the first of them).
      Also, in case there are writes in the loop, we split the groups into
      chains whose head is the write whose values are used by the reads in
      the same chain.  The chains are then processed independently,
      making the further transformations simpler.  Also, the shorter chains
      need the same number of registers, but may require lower unrolling
      factor in order to get rid of the copies on the loop latch.

      In our example, we get the following chains (the chain for c is invalid).

      a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
      b[10]{read,+0}, b[10]{write,+0}
      d[i + 1]{read,+0}, d[i]{write,+1}

   3) For each read, we determine the read or write whose value it reuses,
      together with the distance of this reuse.  I.e. we take the last
      reference before it with distance 0, or the last of the references
      with the smallest positive distance to the read.  Then, we remove
      the references that are not used in any of these chains, discard the
      empty groups, and propagate all the links so that they point to the
      single root reference of the chain (adjusting their distance
      appropriately).  Some extra care needs to be taken for references with
      step 0.  In our example (the numbers indicate the distance of the
      reuse),

      a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
      b[10] --> (*) 1, b[10] (*)

   4) The chains are combined together if possible.  If the corresponding
      elements of two chains are always combined together with the same
      operator, we remember just the result of this combination, instead
      of remembering the values separately.  We may need to perform
      reassociation to enable combining, for example

      e[i] + f[i+1] + e[i+1] + f[i]

      can be reassociated as

      (e[i] + f[i]) + (e[i+1] + f[i+1])

      and we can combine the chains for e and f into one chain.

   5) For each root reference (end of the chain) R, let N be maximum distance
      of a reference reusing its value.  Variables R0 up to RN are created,
      together with phi nodes that transfer values from R1 .. RN to
      R0 .. R(N-1).
      Initial values are loaded to R0..R(N-1) (in case not all references
      must necessarily be accessed and they may trap, we may fail here;
      TODO sometimes, the loads could be guarded by a check for the number
      of iterations).  Values loaded/stored in roots are also copied to
      RN.  Other reads are replaced with the appropriate variable Ri.
      Everything is put to SSA form.

      As a small improvement, if R0 is dead after the root (i.e., all uses of
      the value with the maximum distance dominate the root), we can avoid
      creating RN and use R0 instead of it.

      In our example, we get (only the parts concerning a and b are shown):
      for (i = 0; i < 100; i++)
	{
	  f = phi (a[0], s);
	  s = phi (a[1], f);
	  x = phi (b[10], x);

	  f = f + s;
	  a[i+2] = f;
	  x = x + i;
	  b[10] = x;
	}

   6) Factor F for unrolling is determined as the smallest common multiple of
      (N + 1) for each root reference (N for references for that we avoided
      creating RN).  If F and the loop is small enough, loop is unrolled F
      times.  The stores to RN (R0) in the copies of the loop body are
      periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
      be coalesced and the copies can be eliminated.

      TODO -- copy propagation and other optimizations may change the live
      ranges of the temporary registers and prevent them from being coalesced;
      this may increase the register pressure.
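      The rotation of step 5 and its removal by the unrolling of step 6 can
      be written out by hand.  A minimal sketch (invented helper name,
      assuming n >= 2 and a[0], a[1] already initialized) for the recurrence
      from the example above:

```c
#include <assert.h>

// Fill a[2 .. n-1] with a[i + 2] = a[i] + a[i + 1], keeping the last two
// computed values in the scalars f and s.  Unrolling by two lets f and s
// swap roles instead of being copied on the loop latch.
static void
fill_recurrence (int *a, int n)
{
  int f = a[0], s = a[1];
  int i;
  for (i = 0; i + 3 < n; i += 2)
    {
      f = f + s;                // f now holds a[i + 2]
      a[i + 2] = f;

      s = s + f;                // s now holds a[i + 3]
      a[i + 3] = s;
    }
  for (; i + 2 < n; i++)        // epilogue for odd trip counts
    {
      int t = f + s;
      a[i + 2] = t;
      f = s;
      s = t;
    }
}
```

      Because the unroll factor (2) equals N + 1 for this chain, the two
      copies of the body use f and s alternately and no latch copies remain.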

   In our case, F = 2 and the (main loop of the) result is

   for (i = 0; i < ...; i += 2)
     {
       f = phi (a[0], f);
       s = phi (a[1], s);
       x = phi (b[10], x);

       f = f + s;
       a[i+2] = f;
       x = x + i;
       b[10] = x;

       s = s + f;
       a[i+3] = s;
       x = x + i;
       b[10] = x;
     }

   Apart from predictive commoning on Load-Load and Store-Load chains, we
   also support Store-Store chains -- stores killed by other stores can be
   eliminated.  Consider the example below:

     for (i = 0; i < n; i++)
       {
	 a[i] = 1;
	 a[i+2] = 2;
       }

   It can be replaced with:

     t0 = a[0];
     t1 = a[1];
     for (i = 0; i < n; i++)
       {
	 a[i] = 1;
	 t2 = 2;
	 t0 = t1;
	 t1 = t2;
       }
     a[n] = t0;
     a[n+1] = t1;

   If the loop runs more than one iteration, it can be further simplified into:

     for (i = 0; i < n; i++)
       {
	 a[i] = 1;
       }
     a[n] = 2;
     a[n+1] = 2;

   The interesting part is this can be viewed either as general store motion
   or general dead store elimination in either intra/inter-iteration way.

   With trivial effort, we also support loads inside Store-Store chains if the
   load is dominated by a store statement in the same iteration of the loop.
   You can see this as a restricted Store-Mixed-Load-Store chain.

   TODO: For now, we don't support store-store chains in multi-exit loops.
   We
   force the loop not to be unrolled when a store-store chain is present,
   even if other chains might ask for unrolling.

   Predictive commoning can be generalized for arbitrary computations (not
   just memory loads), and also nontrivial transfer functions (e.g., replacing
   i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "rtl.h"
#include "tree.h"
#include "gimple.h"
#include "predict.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-pretty-print.h"
#include "alias.h"
#include "fold-const.h"
#include "cfgloop.h"
#include "tree-eh.h"
#include "gimplify.h"
#include "gimple-iterator.h"
#include "gimplify-me.h"
#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop-manip.h"
#include "tree-ssa-loop-niter.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-dfa.h"
#include "tree-ssa.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "tree-affine.h"
#include "builtins.h"
#include "opts.h"

/* The maximum number of iterations between the considered memory
   references.  */

#define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)

/* Data references (or phi nodes that carry data reference values across
   loop iterations).  */

typedef class dref_d
{
public:
  /* The reference itself.
 */
  struct data_reference *ref;

  /* The statement in that the reference appears.  */
  gimple *stmt;

  /* In case that STMT is a phi node, this field is set to the SSA name
     defined by it in replace_phis_by_defined_names (in order to avoid
     pointing to phi node that got reallocated in the meantime).  */
  tree name_defined_by_phi;

  /* Distance of the reference from the root of the chain (in number of
     iterations of the loop).  */
  unsigned distance;

  /* Number of iterations offset from the first reference in the component.  */
  widest_int offset;

  /* Number of the reference in a component, in dominance ordering.  */
  unsigned pos;

  /* True if the memory reference is always accessed when the loop is
     entered.  */
  unsigned always_accessed : 1;
} *dref;


/* Type of the chain of the references.  */

enum chain_type
{
  /* The addresses of the references in the chain are constant.  */
  CT_INVARIANT,

  /* There are only loads in the chain.  */
  CT_LOAD,

  /* Root of the chain is store, the rest are loads.  */
  CT_STORE_LOAD,

  /* There are only stores in the chain.  */
  CT_STORE_STORE,

  /* A combination of two chains.  */
  CT_COMBINATION
};

/* Chains of data references.
 */

typedef struct chain
{
  chain (chain_type t) : type (t), op (ERROR_MARK), rslt_type (NULL_TREE),
    ch1 (NULL), ch2 (NULL), length (0), init_seq (NULL), fini_seq (NULL),
    has_max_use_after (false), all_always_accessed (false), combined (false),
    inv_store_elimination (false) {}

  /* Type of the chain.  */
  enum chain_type type;

  /* For combination chains, the operator and the two chains that are
     combined, and the type of the result.  */
  enum tree_code op;
  tree rslt_type;
  struct chain *ch1, *ch2;

  /* The references in the chain.  */
  auto_vec<dref> refs;

  /* The maximum distance of the reference in the chain from the root.  */
  unsigned length;

  /* The variables used to copy the value throughout iterations.  */
  auto_vec<tree> vars;

  /* Initializers for the variables.  */
  auto_vec<tree> inits;

  /* Finalizers for the eliminated stores.  */
  auto_vec<tree> finis;

  /* gimple stmts initializing the initial variables of the chain.  */
  gimple_seq init_seq;

  /* gimple stmts finalizing the eliminated stores of the chain.  */
  gimple_seq fini_seq;

  /* True if there is a use of a variable with the maximal distance
     that comes after the root in the loop.  */
  unsigned has_max_use_after : 1;

  /* True if all the memory references in the chain are always accessed.  */
  unsigned all_always_accessed : 1;

  /* True if this chain was combined together with some other chain.
 */
  unsigned combined : 1;

  /* True if this is store elimination chain and eliminated stores store
     loop invariant value into memory.  */
  unsigned inv_store_elimination : 1;
} *chain_p;


/* Describes the knowledge about the step of the memory references in
   the component.  */

enum ref_step_type
{
  /* The step is zero.  */
  RS_INVARIANT,

  /* The step is nonzero.  */
  RS_NONZERO,

  /* The step may or may not be nonzero.  */
  RS_ANY
};

/* Components of the data dependence graph.  */

struct component
{
  component (bool es) : comp_step (RS_ANY), eliminate_store_p (es),
    next (NULL) {}

  /* The references in the component.  */
  auto_vec<dref> refs;

  /* What we know about the step of the references in the component.  */
  enum ref_step_type comp_step;

  /* True if all references in component are stores and we try to do
     intra/inter loop iteration dead store elimination.  */
  bool eliminate_store_p;

  /* Next component in the list.  */
  struct component *next;
};

/* A class to encapsulate the global states used for predictive
   commoning work on top of one given LOOP.
 */

class pcom_worker
{
public:
  pcom_worker (loop_p l) : m_loop (l), m_cache (NULL) {}

  ~pcom_worker ()
  {
    free_data_refs (m_datarefs);
    free_dependence_relations (m_dependences);
    free_affine_expand_cache (&m_cache);
    release_chains ();
  }

  pcom_worker (const pcom_worker &) = delete;
  pcom_worker &operator= (const pcom_worker &) = delete;

  /* Performs predictive commoning.  */
  unsigned tree_predictive_commoning_loop (bool allow_unroll_p);

  /* Perform the predictive commoning optimization for chains, make this
     public for being called in callback execute_pred_commoning_cbck.  */
  void execute_pred_commoning (bitmap tmp_vars);

private:
  /* The pointer to the given loop.  */
  loop_p m_loop;

  /* All data references.  */
  auto_vec<data_reference_p, 10> m_datarefs;

  /* All data dependences.  */
  auto_vec<ddr_p, 10> m_dependences;

  /* All chains.  */
  auto_vec<chain_p> m_chains;

  /* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */
  auto_bitmap m_looparound_phis;

  typedef hash_map<tree, name_expansion *> tree_expand_map_t;
  /* Cache used by tree_to_aff_combination_expand.  */
  tree_expand_map_t *m_cache;

  /* Splits dependence graph to components.  */
  struct component *split_data_refs_to_components ();

  /* Check the conditions on references inside each of components COMPS,
     and remove the unsuitable components from the list.
 */
  struct component *filter_suitable_components (struct component *comps);

  /* Find roots of the values and determine distances in components COMPS,
     and separates the references to chains.  */
  void determine_roots (struct component *comps);

  /* Prepare initializers for chains, and free chains that cannot
     be used because the initializers might trap.  */
  void prepare_initializers ();

  /* Generates finalizer memory reference for chains.  Returns true if
     finalizer code generation for chains breaks loop closed ssa form.  */
  bool prepare_finalizers ();

  /* Try to combine the chains.  */
  void try_combine_chains ();

  /* Frees CHAINS.  */
  void release_chains ();

  /* Frees a chain CHAIN.  */
  void release_chain (chain_p chain);

  /* Prepare initializers for CHAIN.  Returns false if this is impossible
     because one of these initializers may trap, true otherwise.  */
  bool prepare_initializers_chain (chain_p chain);

  /* Generates finalizer memory references for CHAIN.  Returns true
     if finalizer code for CHAIN can be generated, otherwise false.  */
  bool prepare_finalizers_chain (chain_p chain);

  /* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */
  void aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset);

  /* Determines number of iterations of the innermost enclosing loop before
     B refers to exactly the same location as A and stores it to OFF.
 */
  bool determine_offset (struct data_reference *a, struct data_reference *b,
			 poly_widest_int *off);

  /* Returns true if the component COMP satisfies the conditions
     described in 2) at the beginning of this file.  */
  bool suitable_component_p (struct component *comp);

  /* Returns true if REF is a valid initializer for ROOT with given
     DISTANCE (in iterations of the innermost enclosing loop).  */
  bool valid_initializer_p (struct data_reference *ref, unsigned distance,
			    struct data_reference *root);

  /* Finds looparound phi node of loop that copies the value of REF.  */
  gphi *find_looparound_phi (dref ref, dref root);

  /* Find roots of the values and determine distances in the component
     COMP.  The references are redistributed into chains.  */
  void determine_roots_comp (struct component *comp);

  /* For references in CHAIN that are copied around the loop, add the
     results of such copies to the chain.  */
  void add_looparound_copies (chain_p chain);

  /* Returns the single statement in that NAME is used, excepting
     the looparound phi nodes contained in one of the chains.  */
  gimple *single_nonlooparound_use (tree name);

  /* Remove statement STMT, as well as the chain of assignments in that
     it is used.  */
  void remove_stmt (gimple *stmt);

  /* Perform the predictive commoning optimization for a chain CHAIN.  */
  void execute_pred_commoning_chain (chain_p chain, bitmap tmp_vars);

  /* Returns the modify statement that uses NAME.
 */
  gimple *find_use_stmt (tree *name);

  /* If the operation used in STMT is associative and commutative, go
     through the tree of the same operations and returns its root.  */
  gimple *find_associative_operation_root (gimple *stmt, unsigned *distance);

  /* Returns the common statement in that NAME1 and NAME2 have a use.  */
  gimple *find_common_use_stmt (tree *name1, tree *name2);

  /* Checks whether R1 and R2 are combined together using CODE, with the
     result in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order
     R2 CODE R1 if it is true.  */
  bool combinable_refs_p (dref r1, dref r2, enum tree_code *code, bool *swap,
			  tree *rslt_type);

  /* Reassociates the expression in that NAME1 and NAME2 are used so that
     they are combined in a single statement, and returns this statement.  */
  gimple *reassociate_to_the_same_stmt (tree name1, tree name2);

  /* Returns the statement that combines references R1 and R2.  */
  gimple *stmt_combining_refs (dref r1, dref r2);

  /* Tries to combine chains CH1 and CH2 together.  */
  chain_p combine_chains (chain_p ch1, chain_p ch2);
};

/* Dumps data reference REF to FILE.  */

extern void dump_dref (FILE *, dref);
void
dump_dref (FILE *file, dref ref)
{
  if (ref->ref)
    {
      fprintf (file, "    ");
      print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
      fprintf (file, " (id %u%s)\n", ref->pos,
	       DR_IS_READ (ref->ref) ?
	       "" : ", write");

      fprintf (file, "      offset ");
      print_decs (ref->offset, file);
      fprintf (file, "\n");

      fprintf (file, "      distance %u\n", ref->distance);
    }
  else
    {
      if (gimple_code (ref->stmt) == GIMPLE_PHI)
	fprintf (file, "    looparound ref\n");
      else
	fprintf (file, "    combination ref\n");
      fprintf (file, "      in statement ");
      print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
      fprintf (file, "\n");
      fprintf (file, "      distance %u\n", ref->distance);
    }

}

/* Dumps CHAIN to FILE.  */

extern void dump_chain (FILE *, chain_p);
void
dump_chain (FILE *file, chain_p chain)
{
  dref a;
  const char *chain_type;
  unsigned i;
  tree var;

  switch (chain->type)
    {
    case CT_INVARIANT:
      chain_type = "Load motion";
      break;

    case CT_LOAD:
      chain_type = "Loads-only";
      break;

    case CT_STORE_LOAD:
      chain_type = "Store-loads";
      break;

    case CT_STORE_STORE:
      chain_type = "Store-stores";
      break;

    case CT_COMBINATION:
      chain_type = "Combination";
      break;

    default:
      gcc_unreachable ();
    }

  fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
	   chain->combined ? " (combined)" : "");
  if (chain->type != CT_INVARIANT)
    fprintf (file, "  max distance %u%s\n", chain->length,
	     chain->has_max_use_after ?
	     "" : ", may reuse first");

  if (chain->type == CT_COMBINATION)
    {
      fprintf (file, "  equal to %p %s %p in type ",
	       (void *) chain->ch1, op_symbol_code (chain->op),
	       (void *) chain->ch2);
      print_generic_expr (file, chain->rslt_type, TDF_SLIM);
      fprintf (file, "\n");
    }

  if (chain->vars.exists ())
    {
      fprintf (file, "  vars");
      FOR_EACH_VEC_ELT (chain->vars, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  if (chain->inits.exists ())
    {
      fprintf (file, "  inits");
      FOR_EACH_VEC_ELT (chain->inits, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  fprintf (file, "  references:\n");
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    dump_dref (file, a);

  fprintf (file, "\n");
}

/* Dumps CHAINS to FILE.  */

void
dump_chains (FILE *file, const vec<chain_p> &chains)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chains, i, chain)
    dump_chain (file, chain);
}

/* Dumps COMP to FILE.  */

extern void dump_component (FILE *, struct component *);
void
dump_component (FILE *file, struct component *comp)
{
  dref a;
  unsigned i;

  fprintf (file, "Component%s:\n",
	   comp->comp_step == RS_INVARIANT ?
	   " (invariant)" : "");
  FOR_EACH_VEC_ELT (comp->refs, i, a)
    dump_dref (file, a);
  fprintf (file, "\n");
}

/* Dumps COMPS to FILE.  */

extern void dump_components (FILE *, struct component *);
void
dump_components (FILE *file, struct component *comps)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    dump_component (file, comp);
}

/* Frees a chain CHAIN.  */

void
pcom_worker::release_chain (chain_p chain)
{
  dref ref;
  unsigned i;

  if (chain == NULL)
    return;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    free (ref);

  if (chain->init_seq)
    gimple_seq_discard (chain->init_seq);

  if (chain->fini_seq)
    gimple_seq_discard (chain->fini_seq);

  delete chain;
}

/* Frees CHAINS.  */

void
pcom_worker::release_chains ()
{
  unsigned i;
  chain_p chain;

  FOR_EACH_VEC_ELT (m_chains, i, chain)
    release_chain (chain);
}

/* Frees list of components COMPS.  */

static void
release_components (struct component *comps)
{
  struct component *act, *next;

  for (act = comps; act; act = next)
    {
      next = act->next;
      delete act;
    }
}

/* Finds a root of tree given by FATHERS containing A, and performs path
   shortening.
 */

static unsigned
component_of (vec<unsigned> &fathers, unsigned a)
{
  unsigned root, n;

  for (root = a; root != fathers[root]; root = fathers[root])
    continue;

  for (; a != root; a = n)
    {
      n = fathers[a];
      fathers[a] = root;
    }

  return root;
}

/* Join operation for DFU.  FATHERS gives the tree, SIZES are sizes of the
   components, A and B are components to merge.  */

static void
merge_comps (vec<unsigned> &fathers, vec<unsigned> &sizes,
	     unsigned a, unsigned b)
{
  unsigned ca = component_of (fathers, a);
  unsigned cb = component_of (fathers, b);

  if (ca == cb)
    return;

  if (sizes[ca] < sizes[cb])
    {
      sizes[cb] += sizes[ca];
      fathers[ca] = cb;
    }
  else
    {
      sizes[ca] += sizes[cb];
      fathers[cb] = ca;
    }
}

/* Returns true if A is a reference that is suitable for predictive commoning
   in the innermost loop that contains it.  REF_STEP is set according to the
   step of the reference A.
 */

static bool
suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
{
  tree ref = DR_REF (a), step = DR_STEP (a);

  if (!step
      || TREE_THIS_VOLATILE (ref)
      || !is_gimple_reg_type (TREE_TYPE (ref))
      || tree_could_throw_p (ref))
    return false;

  if (integer_zerop (step))
    *ref_step = RS_INVARIANT;
  else if (integer_nonzerop (step))
    *ref_step = RS_NONZERO;
  else
    *ref_step = RS_ANY;

  return true;
}

/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */

void
pcom_worker::aff_combination_dr_offset (struct data_reference *dr,
					aff_tree *offset)
{
  tree type = TREE_TYPE (DR_OFFSET (dr));
  aff_tree delta;

  tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset, &m_cache);
  aff_combination_const (&delta, type, wi::to_poly_widest (DR_INIT (dr)));
  aff_combination_add (offset, &delta);
}

/* Determines number of iterations of the innermost enclosing loop before B
   refers to exactly the same location as A and stores it to OFF.  If A and
   B do not have the same step, they never meet, or anything else fails,
   returns false, otherwise returns true.  Both A and B are assumed to
   satisfy suitable_reference_p.  */

bool
pcom_worker::determine_offset (struct data_reference *a,
			       struct data_reference *b, poly_widest_int *off)
{
  aff_tree diff, baseb, step;
  tree typea, typeb;

  /* Check that both the references access the location in the same type.
   */
  typea = TREE_TYPE (DR_REF (a));
  typeb = TREE_TYPE (DR_REF (b));
  if (!useless_type_conversion_p (typeb, typea))
    return false;

  /* Check whether the base address and the step of both references are the
     same.  */
  if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
      || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
    return false;

  if (integer_zerop (DR_STEP (a)))
    {
      /* If the references have a loop-invariant address, check that they
	 access exactly the same location.  */
      *off = 0;
      return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
	      && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
    }

  /* Compare the offsets of the addresses, and check whether the difference
     is a multiple of step.  */
  aff_combination_dr_offset (a, &diff);
  aff_combination_dr_offset (b, &baseb);
  aff_combination_scale (&baseb, -1);
  aff_combination_add (&diff, &baseb);

  tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
				  &step, &m_cache);
  return aff_combination_constant_multiple_p (&diff, &step, off);
}
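Stripped of the affine-combination machinery, determine_offset above reduces to: same base, same step, and the difference of the start offsets must be an exact multiple of that step.  A scalar sketch of the arithmetic — plain integers instead of aff_tree, so symbolic offsets and overflow are out of scope, and the function name is made up for illustration:

```cpp
#include <cstdint>

// Sets *OFF to the number of iterations after which the reference
// starting at byte offset OFF_B lands on OFF_A, i.e. the solution of
// off_b + off * step == off_a; returns false when no integer solution
// exists.  This mirrors only the constant case of determine_offset;
// the real pass uses affine combinations to cover symbolic offsets.
static bool
iteration_distance (int64_t off_a, int64_t off_b, int64_t step,
		    int64_t *off)
{
  if (step == 0)
    {
      /* Loop-invariant address: the references must match exactly.  */
      *off = 0;
      return off_a == off_b;
    }
  int64_t diff = off_a - off_b;
  if (diff % step != 0)
    return false;	/* The two references never meet.  */
  *off = diff / step;
  return true;
}
```

For the a[i]/a[i+2] pair from the header comment (4-byte elements, so offsets 0 and 8, step 4), the distance of a[i] behind a[i+2] comes out as 2: a[i] touches the location a[i+2] accessed initially two iterations later.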
/* Returns the last basic block in LOOP that is certainly executed whenever
   the loop is entered.  */

static basic_block
last_always_executed_block (class loop *loop)
{
  unsigned i;
  auto_vec<edge> exits = get_loop_exit_edges (loop);
  edge ex;
  basic_block last = loop->latch;

  FOR_EACH_VEC_ELT (exits, i, ex)
    last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);

  return last;
}

/* Splits dependence graph on DATAREFS described by DEPENDENCES to
   components.  */

struct component *
pcom_worker::split_data_refs_to_components ()
{
  unsigned i, n = m_datarefs.length ();
  unsigned ca, ia, ib, bad;
  struct data_reference *dr, *dra, *drb;
  struct data_dependence_relation *ddr;
  struct component *comp_list = NULL, *comp;
  dref dataref;
  /* Don't do store elimination if the loop has multiple exit edges.  */
  bool eliminate_store_p = single_exit (m_loop) != NULL;
  basic_block last_always_executed = last_always_executed_block (m_loop);
  auto_bitmap no_store_store_comps;
  auto_vec<unsigned> comp_father (n + 1);
  auto_vec<unsigned> comp_size (n + 1);
  comp_father.quick_grow (n + 1);
  comp_size.quick_grow (n + 1);

  FOR_EACH_VEC_ELT (m_datarefs, i, dr)
    {
      if (!DR_REF (dr))
	/* A fake reference for call or asm_expr that may clobber memory;
	   just fail.  */
	return NULL;
      /* The predcom pass isn't prepared to handle calls with data references.
	 */
      if (is_gimple_call (DR_STMT (dr)))
	return NULL;
      dr->aux = (void *) (size_t) i;
      comp_father[i] = i;
      comp_size[i] = 1;
    }

  /* A component reserved for the "bad" data references.  */
  comp_father[n] = n;
  comp_size[n] = 1;

  FOR_EACH_VEC_ELT (m_datarefs, i, dr)
    {
      enum ref_step_type dummy;

      if (!suitable_reference_p (dr, &dummy))
	{
	  ia = (unsigned) (size_t) dr->aux;
	  merge_comps (comp_father, comp_size, n, ia);
	}
    }

  FOR_EACH_VEC_ELT (m_dependences, i, ddr)
    {
      poly_widest_int dummy_off;

      if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
	continue;

      dra = DDR_A (ddr);
      drb = DDR_B (ddr);

      /* Don't do store elimination if there is any unknown dependence for
	 any store data reference.  */
      if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
	  && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
	      || DDR_NUM_DIST_VECTS (ddr) == 0))
	eliminate_store_p = false;

      ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
      ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
      if (ia == ib)
	continue;

      bad = component_of (comp_father, n);

      /* If both A and B are reads, we may ignore unsuitable dependences.
	 */
      if (DR_IS_READ (dra) && DR_IS_READ (drb))
	{
	  if (ia == bad || ib == bad
	      || !determine_offset (dra, drb, &dummy_off))
	    continue;
	}
      /* If A is a read and B a write or vice versa and there is an unsuitable
	 dependence, instead of merging both components into a component
	 that will certainly not pass suitable_component_p, just put the
	 read into the bad component; perhaps at least the write together with
	 all the other data refs in its component will be optimizable.  */
      else if (DR_IS_READ (dra) && ib != bad)
	{
	  if (ia == bad)
	    {
	      bitmap_set_bit (no_store_store_comps, ib);
	      continue;
	    }
	  else if (!determine_offset (dra, drb, &dummy_off))
	    {
	      bitmap_set_bit (no_store_store_comps, ib);
	      merge_comps (comp_father, comp_size, bad, ia);
	      continue;
	    }
	}
      else if (DR_IS_READ (drb) && ia != bad)
	{
	  if (ib == bad)
	    {
	      bitmap_set_bit (no_store_store_comps, ia);
	      continue;
	    }
	  else if (!determine_offset (dra, drb, &dummy_off))
	    {
	      bitmap_set_bit (no_store_store_comps, ia);
	      merge_comps (comp_father, comp_size, bad, ib);
	      continue;
	    }
	}
      else if (DR_IS_WRITE (dra) && DR_IS_WRITE (drb)
	       && ia != bad && ib != bad
	       && !determine_offset (dra, drb, &dummy_off))
	{
	  merge_comps (comp_father, comp_size, bad, ia);
	  merge_comps (comp_father, comp_size, bad, ib);
	  continue;
	}

      merge_comps (comp_father, comp_size, ia, ib);
    }

  if (eliminate_store_p)
    {
      tree niters =
	number_of_latch_executions (m_loop);

      /* Don't do store elimination if the niters info is unknown, because
	 stores in the last iteration can't be eliminated and we would need
	 to recover them after the loop.  */
      eliminate_store_p = (niters != NULL_TREE && niters != chrec_dont_know);
    }

  auto_vec<struct component *> comps;
  comps.safe_grow_cleared (n, true);
  bad = component_of (comp_father, n);
  FOR_EACH_VEC_ELT (m_datarefs, i, dr)
    {
      ia = (unsigned) (size_t) dr->aux;
      ca = component_of (comp_father, ia);
      if (ca == bad)
	continue;

      comp = comps[ca];
      if (!comp)
	{
	  comp = new component (eliminate_store_p);
	  comp->refs.reserve_exact (comp_size[ca]);
	  comps[ca] = comp;
	}

      dataref = XCNEW (class dref_d);
      dataref->ref = dr;
      dataref->stmt = DR_STMT (dr);
      dataref->offset = 0;
      dataref->distance = 0;

      dataref->always_accessed
	= dominated_by_p (CDI_DOMINATORS, last_always_executed,
			  gimple_bb (dataref->stmt));
      dataref->pos = comp->refs.length ();
      comp->refs.quick_push (dataref);
    }

  if (eliminate_store_p)
    {
      bitmap_iterator bi;
      EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi)
	{
	  ca = component_of (comp_father, ia);
	  if (ca != bad)
	    comps[ca]->eliminate_store_p = false;
	}
    }

  for (i = 0; i < n; i++)
    {
      comp = comps[i];
      if (comp)
	{
	  comp->next = comp_list;
	  comp_list = comp;
	}
    }

  return comp_list;
}

/* Returns true if the component COMP satisfies the conditions
   described in 2) at the beginning of this file.  */

bool
pcom_worker::suitable_component_p (struct component *comp)
{
  unsigned i;
  dref a, first;
  basic_block ba, bp = m_loop->header;
  bool ok, has_write = false;

  FOR_EACH_VEC_ELT (comp->refs, i, a)
    {
      ba = gimple_bb (a->stmt);

      if (!just_once_each_iteration_p (m_loop, ba))
	return false;

      gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
      bp = ba;

      if (DR_IS_WRITE (a->ref))
	has_write = true;
    }

  first = comp->refs[0];
  ok = suitable_reference_p (first->ref, &comp->comp_step);
  gcc_assert (ok);
  first->offset = 0;

  for (i = 1; comp->refs.iterate (i, &a); i++)
    {
      /* Polynomial offsets are no use, since we need to know the
	 gap between iteration numbers at compile time.  */
      poly_widest_int offset;
      if (!determine_offset (first->ref, a->ref, &offset)
	  || !offset.is_constant (&a->offset))
	return false;

      enum ref_step_type a_step;
      gcc_checking_assert (suitable_reference_p (a->ref, &a_step)
			   && a_step == comp->comp_step);
    }

  /* If there is a write inside the component, we must know whether the
     step is nonzero or not -- we would not otherwise be able to recognize
     whether the value accessed by reads comes from the OFFSET-th iteration
     or the previous one.
     */
  if (has_write && comp->comp_step == RS_ANY)
    return false;

  return true;
}

/* Check the conditions on references inside each of components COMPS,
   and remove the unsuitable components from the list.  The new list
   of components is returned.  The conditions are described in 2) at
   the beginning of this file.  */

struct component *
pcom_worker::filter_suitable_components (struct component *comps)
{
  struct component **comp, *act;

  for (comp = &comps; *comp; )
    {
      act = *comp;
      if (suitable_component_p (act))
	comp = &act->next;
      else
	{
	  dref ref;
	  unsigned i;

	  *comp = act->next;
	  FOR_EACH_VEC_ELT (act->refs, i, ref)
	    free (ref);
	  delete act;
	}
    }

  return comps;
}

/* Compares two drefs A and B by their offset and position.  Callback for
   qsort.  */

static int
order_drefs (const void *a, const void *b)
{
  const dref *const da = (const dref *) a;
  const dref *const db = (const dref *) b;
  int offcmp = wi::cmps ((*da)->offset, (*db)->offset);

  if (offcmp != 0)
    return offcmp;

  return (*da)->pos - (*db)->pos;
}
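filter_suitable_components above walks the component list with a pointer to the link rather than to the node, so removing an element — head included — is a single store with no special case.  The same idiom in miniature, on a toy node type (all names illustrative):

```cpp
// Toy node type standing in for struct component.
struct node
{
  int val;
  node *next;
};

// Delete every node for which KEEP returns false.  COMP always names
// the link that points at the current node (initially &HEAD), so
// unlinking is the single store *comp = act->next, and the head of
// the list needs no separate code path.
static node *
filter_list (node *head, bool (*keep) (node *))
{
  for (node **comp = &head; *comp; )
    {
      node *act = *comp;
      if (keep (act))
	comp = &act->next;
      else
	{
	  *comp = act->next;
	  delete act;
	}
    }
  return head;
}
```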
/* Compares two drefs A and B by their position.  Callback for qsort.  */

static int
order_drefs_by_pos (const void *a, const void *b)
{
  const dref *const da = (const dref *) a;
  const dref *const db = (const dref *) b;

  return (*da)->pos - (*db)->pos;
}

/* Returns root of the CHAIN.  */

static inline dref
get_chain_root (chain_p chain)
{
  return chain->refs[0];
}

/* Given CHAIN, returns the last write ref at DISTANCE, or NULL if it doesn't
   exist.  */

static inline dref
get_chain_last_write_at (chain_p chain, unsigned distance)
{
  for (unsigned i = chain->refs.length (); i > 0; i--)
    if (DR_IS_WRITE (chain->refs[i - 1]->ref)
	&& distance == chain->refs[i - 1]->distance)
      return chain->refs[i - 1];

  return NULL;
}

/* Given CHAIN, returns the last write ref with the same distance before load
   at index LOAD_IDX, or NULL if it doesn't exist.  */

static inline dref
get_chain_last_write_before_load (chain_p chain, unsigned load_idx)
{
  gcc_assert (load_idx < chain->refs.length ());

  unsigned distance = chain->refs[load_idx]->distance;

  for (unsigned i = load_idx; i > 0; i--)
    if (DR_IS_WRITE (chain->refs[i - 1]->ref)
	&& distance == chain->refs[i - 1]->distance)
      return chain->refs[i - 1];

  return NULL;
}

/* Adds REF to the chain CHAIN.
   */

static void
add_ref_to_chain (chain_p chain, dref ref)
{
  dref root = get_chain_root (chain);

  gcc_assert (wi::les_p (root->offset, ref->offset));
  widest_int dist = ref->offset - root->offset;
  gcc_assert (wi::fits_uhwi_p (dist));

  chain->refs.safe_push (ref);

  ref->distance = dist.to_uhwi ();

  if (ref->distance >= chain->length)
    {
      chain->length = ref->distance;
      chain->has_max_use_after = false;
    }

  /* Promote this chain to CT_STORE_STORE if it has multiple stores.  */
  if (DR_IS_WRITE (ref->ref))
    chain->type = CT_STORE_STORE;

  /* Don't set the flag for a store-store chain, since it has no uses.  */
  if (chain->type != CT_STORE_STORE
      && ref->distance == chain->length
      && ref->pos > root->pos)
    chain->has_max_use_after = true;

  chain->all_always_accessed &= ref->always_accessed;
}

/* Returns the chain for invariant component COMP.  */

static chain_p
make_invariant_chain (struct component *comp)
{
  chain_p chain = new struct chain (CT_INVARIANT);
  unsigned i;
  dref ref;

  chain->all_always_accessed = true;

  FOR_EACH_VEC_ELT (comp->refs, i, ref)
    {
      chain->refs.safe_push (ref);
      chain->all_always_accessed &= ref->always_accessed;
    }

  chain->inits = vNULL;
  chain->finis = vNULL;

  return chain;
}

/* Make a new chain of type TYPE rooted at REF.
   */

static chain_p
make_rooted_chain (dref ref, enum chain_type type)
{
  chain_p chain = new struct chain (type);

  chain->refs.safe_push (ref);
  chain->all_always_accessed = ref->always_accessed;
  ref->distance = 0;

  chain->inits = vNULL;
  chain->finis = vNULL;

  return chain;
}

/* Returns true if CHAIN is not trivial.  */

static bool
nontrivial_chain_p (chain_p chain)
{
  return chain != NULL && chain->refs.length () > 1;
}

/* Returns the ssa name that contains the value of REF, or NULL_TREE if there
   is no such name.  */

static tree
name_for_ref (dref ref)
{
  tree name;

  if (is_gimple_assign (ref->stmt))
    {
      if (!ref->ref || DR_IS_READ (ref->ref))
	name = gimple_assign_lhs (ref->stmt);
      else
	name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);

  return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
}

/* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
   iterations of the innermost enclosing loop).  */

bool
pcom_worker::valid_initializer_p (struct data_reference *ref, unsigned distance,
				  struct data_reference *root)
{
  aff_tree diff, base, step;
  poly_widest_int off;

  /* Both REF and ROOT must be accessing the same object.
     */
  if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
    return false;

  /* The initializer is defined outside of the loop, hence its address must be
     invariant inside the loop.  */
  gcc_assert (integer_zerop (DR_STEP (ref)));

  /* If the address of the reference is invariant, the initializer must access
     exactly the same location.  */
  if (integer_zerop (DR_STEP (root)))
    return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
	    && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));

  /* Verify that this index of REF is equal to the root's index at
     -DISTANCE-th iteration.  */
  aff_combination_dr_offset (root, &diff);
  aff_combination_dr_offset (ref, &base);
  aff_combination_scale (&base, -1);
  aff_combination_add (&diff, &base);

  tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
				  &step, &m_cache);
  if (!aff_combination_constant_multiple_p (&diff, &step, &off))
    return false;

  if (maybe_ne (off, distance))
    return false;

  return true;
}

/* Finds the looparound phi node of the loop that copies the value of REF,
   and if its initial value is correct (equal to the initial value of REF
   shifted by one iteration), returns the phi node.  Otherwise, NULL is
   returned.  ROOT is the root of the current chain.
   */

gphi *
pcom_worker::find_looparound_phi (dref ref, dref root)
{
  tree name, init, init_ref;
  gimple *init_stmt;
  edge latch = loop_latch_edge (m_loop);
  struct data_reference init_dr;
  gphi_iterator psi;

  if (is_gimple_assign (ref->stmt))
    {
      if (DR_IS_READ (ref->ref))
	name = gimple_assign_lhs (ref->stmt);
      else
	name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);
  if (!name)
    return NULL;

  tree entry_vuse = NULL_TREE;
  gphi *phi = NULL;
  for (psi = gsi_start_phis (m_loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      gphi *p = psi.phi ();
      if (PHI_ARG_DEF_FROM_EDGE (p, latch) == name)
	phi = p;
      else if (virtual_operand_p (gimple_phi_result (p)))
	entry_vuse = PHI_ARG_DEF_FROM_EDGE (p, loop_preheader_edge (m_loop));
      if (phi && entry_vuse)
	break;
    }
  if (!phi || !entry_vuse)
    return NULL;

  init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (m_loop));
  if (TREE_CODE (init) != SSA_NAME)
    return NULL;
  init_stmt = SSA_NAME_DEF_STMT (init);
  if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
    return NULL;
  gcc_assert (gimple_assign_lhs (init_stmt) == init);

  init_ref = gimple_assign_rhs1 (init_stmt);
  if (!REFERENCE_CLASS_P (init_ref)
      && !DECL_P (init_ref))
    return NULL;

  /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
     loop enclosing PHI).
     */
  memset (&init_dr, 0, sizeof (struct data_reference));
  DR_REF (&init_dr) = init_ref;
  DR_STMT (&init_dr) = phi;
  if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, m_loop,
			     init_stmt))
    return NULL;

  if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
    return NULL;

  /* Make sure nothing clobbers the location we re-use the initial value
     from.  */
  if (entry_vuse != gimple_vuse (init_stmt))
    {
      ao_ref ref;
      ao_ref_init (&ref, init_ref);
      unsigned limit = param_sccvn_max_alias_queries_per_access;
      tree vdef = entry_vuse;
      do
	{
	  gimple *def = SSA_NAME_DEF_STMT (vdef);
	  if (limit-- == 0 || gimple_code (def) == GIMPLE_PHI)
	    return NULL;
	  if (stmt_may_clobber_ref_p_1 (def, &ref))
	    /* When the stmt is an assign to init_ref we could in theory
	       use its RHS for the initial value of the looparound PHI
	       we replace in prepare_initializers_chain, but we have
	       no convenient place to store this info at the moment.  */
	    return NULL;
	  vdef = gimple_vuse (def);
	}
      while (vdef != gimple_vuse (init_stmt));
    }

  return phi;
}
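insert_looparound_copy below keeps chain->refs sorted by distance using a linear scan for the first element whose distance is >= that of the new entry.  The same sorted insert on a plain vector — a sketch with illustrative names; the pass's chains are short, which is why a linear scan rather than a binary search is reasonable:

```cpp
#include <vector>

// Insert VAL into the sorted vector V in front of the first element
// that is >= VAL, keeping V sorted.  New entries with an existing key
// therefore land before their equals, matching the >= test used by
// insert_looparound_copy on the distance field.
static void
sorted_insert (std::vector<unsigned> &v, unsigned val)
{
  unsigned i = 0;
  while (i < v.size () && v[i] < val)
    i++;
  v.insert (v.begin () + i, val);
}
```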
/* Adds a reference for the looparound copy of REF in PHI to CHAIN.  */

static void
insert_looparound_copy (chain_p chain, dref ref, gphi *phi)
{
  dref nw = XCNEW (class dref_d), aref;
  unsigned i;

  nw->stmt = phi;
  nw->distance = ref->distance + 1;
  nw->always_accessed = 1;

  FOR_EACH_VEC_ELT (chain->refs, i, aref)
    if (aref->distance >= nw->distance)
      break;
  chain->refs.safe_insert (i, nw);

  if (nw->distance > chain->length)
    {
      chain->length = nw->distance;
      chain->has_max_use_after = false;
    }
}

/* For references in CHAIN that are copied around the loop (created previously
   by PRE, or by the user), add the results of such copies to the chain.
   This enables us to remove the copies by unrolling, and may need fewer
   registers (also, it may allow us to combine chains together).  */

void
pcom_worker::add_looparound_copies (chain_p chain)
{
  unsigned i;
  dref ref, root = get_chain_root (chain);
  gphi *phi;

  if (chain->type == CT_STORE_STORE)
    return;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    {
      phi = find_looparound_phi (ref, root);
      if (!phi)
	continue;

      bitmap_set_bit (m_looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
      insert_looparound_copy (chain, ref, phi);
    }
}
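At its core, the chain-forming loop of determine_roots_comp below makes one pass over the offset-sorted references and cuts a new chain whenever the gap from the current chain's root reaches MAX_DISTANCE.  A sketch of just that grouping rule on plain integers — writes forcing a new root are omitted, and all names here are illustrative:

```cpp
#include <vector>

// Split sorted OFFSETS into groups so that within each group every
// offset lies less than MAX_DIST from the group's first element,
// mirroring the wi::leu_p (MAX_DISTANCE, a->offset - last_ofs) cut
// in determine_roots_comp.
static std::vector<std::vector<long>>
split_chains (const std::vector<long> &offsets, long max_dist)
{
  std::vector<std::vector<long>> chains;
  for (long ofs : offsets)
    {
      if (chains.empty () || ofs - chains.back ().front () >= max_dist)
	chains.push_back (std::vector<long> ());
      chains.back ().push_back (ofs);
    }
  return chains;
}
```

Bounding the in-chain distance this way bounds both the number of temporaries a chain needs and the unroll factor required to eliminate the latch copies.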
/* Find roots of the values and determine distances in the component COMP.
   The references are redistributed into chains.  */

void
pcom_worker::determine_roots_comp (struct component *comp)
{
  unsigned i;
  dref a;
  chain_p chain = NULL;
  widest_int last_ofs = 0;
  enum chain_type type;

  /* Invariants are handled specially.  */
  if (comp->comp_step == RS_INVARIANT)
    {
      chain = make_invariant_chain (comp);
      m_chains.safe_push (chain);
      return;
    }

  /* Trivial component.  */
  if (comp->refs.length () <= 1)
    {
      if (comp->refs.length () == 1)
	{
	  free (comp->refs[0]);
	  comp->refs.truncate (0);
	}
      return;
    }

  comp->refs.qsort (order_drefs);

  /* For a store-store chain, we only support a load if it is dominated by a
     store statement in the same iteration of the loop.  */
  if (comp->eliminate_store_p)
    for (a = NULL, i = 0; i < comp->refs.length (); i++)
      {
	if (DR_IS_WRITE (comp->refs[i]->ref))
	  a = comp->refs[i];
	else if (a == NULL || a->offset != comp->refs[i]->offset)
	  {
	    /* If there is a load that is not dominated by a store in the
	       same iteration of the loop, clear the flag so no store-store
	       chain is generated for this component.  */
	    comp->eliminate_store_p = false;
	    break;
	  }
      }

  /* Determine roots and create chains for components.
     */
  FOR_EACH_VEC_ELT (comp->refs, i, a)
    {
      if (!chain
	  || (chain->type == CT_LOAD && DR_IS_WRITE (a->ref))
	  || (!comp->eliminate_store_p && DR_IS_WRITE (a->ref))
	  || wi::leu_p (MAX_DISTANCE, a->offset - last_ofs))
	{
	  if (nontrivial_chain_p (chain))
	    {
	      add_looparound_copies (chain);
	      m_chains.safe_push (chain);
	    }
	  else
	    release_chain (chain);

	  /* Determine the type of the chain.  If the root reference is a
	     load, this can only be a CT_LOAD chain; other chains are
	     initialized to CT_STORE_LOAD and might be promoted to
	     CT_STORE_STORE when a new reference is added.  */
	  type = DR_IS_READ (a->ref) ? CT_LOAD : CT_STORE_LOAD;
	  chain = make_rooted_chain (a, type);
	  last_ofs = a->offset;
	  continue;
	}

      add_ref_to_chain (chain, a);
    }

  if (nontrivial_chain_p (chain))
    {
      add_looparound_copies (chain);
      m_chains.safe_push (chain);
    }
  else
    release_chain (chain);
}

/* Find roots of the values and determine distances in components COMPS, and
   separate the references into chains.  */

void
pcom_worker::determine_roots (struct component *comps)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    determine_roots_comp (comp);
}

/* Replace the reference in statement STMT with temporary variable
   NEW_TREE.  If SET is true, NEW_TREE is instead initialized to the value of
   the reference in the statement.
   IN_LHS is true if the reference
   is in the lhs of STMT, false if it is in the rhs.  */

static void
replace_ref_with (gimple *stmt, tree new_tree, bool set, bool in_lhs)
{
  tree val;
  gassign *new_stmt;
  gimple_stmt_iterator bsi, psi;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      gcc_assert (!in_lhs && !set);

      val = PHI_RESULT (stmt);
      bsi = gsi_after_labels (gimple_bb (stmt));
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, false);

      /* Turn the phi node into GIMPLE_ASSIGN.  */
      new_stmt = gimple_build_assign (val, new_tree);
      gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
      return;
    }

  /* Since the reference is of gimple_reg type, it should only
     appear as the lhs or rhs of a modify statement.  */
  gcc_assert (is_gimple_assign (stmt));

  bsi = gsi_for_stmt (stmt);

  /* If we do not need to initialize NEW_TREE, just replace the use of OLD.  */
  if (!set)
    {
      gcc_assert (!in_lhs);
      gimple_assign_set_rhs_from_tree (&bsi, new_tree);
      stmt = gsi_stmt (bsi);
      update_stmt (stmt);
      return;
    }

  if (in_lhs)
    {
      /* We have statement

	 OLD = VAL

	 If OLD is a memory reference, then VAL is gimple_val, and we transform
	 this to

	 OLD = VAL
	 NEW = VAL

	 Otherwise, we are replacing a combination chain,
	 VAL is the expression that performs the combination, and OLD is an
	 SSA name.
	 In this case, we transform the assignment to

	 OLD = VAL
	 NEW = OLD  */

      val = gimple_assign_lhs (stmt);
      if (TREE_CODE (val) != SSA_NAME)
	{
	  val = gimple_assign_rhs1 (stmt);
	  gcc_assert (gimple_assign_single_p (stmt));
	  if (TREE_CLOBBER_P (val))
	    val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (new_tree));
	  else
	    gcc_assert (gimple_assign_copy_p (stmt));
	}
    }
  else
    {
      /* VAL = OLD

	 is transformed to

	 VAL = OLD
	 NEW = VAL  */

      val = gimple_assign_lhs (stmt);
    }

  new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
  gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
}

/* Returns a memory reference to DR in the (NITERS + ITER)-th iteration
   of the loop it was analyzed in.  Append init stmts to STMTS.
*/ 1707 1.1 mrg 1708 1.1 mrg static tree 1709 1.1 mrg ref_at_iteration (data_reference_p dr, int iter, 1710 1.1 mrg gimple_seq *stmts, tree niters = NULL_TREE) 1711 1.1 mrg { 1712 1.1 mrg tree off = DR_OFFSET (dr); 1713 1.1 mrg tree coff = DR_INIT (dr); 1714 1.1 mrg tree ref = DR_REF (dr); 1715 1.1 mrg enum tree_code ref_code = ERROR_MARK; 1716 1.1 mrg tree ref_type = NULL_TREE; 1717 1.1 mrg tree ref_op1 = NULL_TREE; 1718 1.1 mrg tree ref_op2 = NULL_TREE; 1719 1.1 mrg tree new_offset; 1720 1.1 mrg 1721 1.1 mrg if (iter != 0) 1722 1.1 mrg { 1723 1.1 mrg new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)); 1724 1.1 mrg if (TREE_CODE (new_offset) == INTEGER_CST) 1725 1.1 mrg coff = size_binop (PLUS_EXPR, coff, new_offset); 1726 1.1 mrg else 1727 1.1 mrg off = size_binop (PLUS_EXPR, off, new_offset); 1728 1.1 mrg } 1729 1.1 mrg 1730 1.1 mrg if (niters != NULL_TREE) 1731 1.1 mrg { 1732 1.1 mrg niters = fold_convert (ssizetype, niters); 1733 1.1 mrg new_offset = size_binop (MULT_EXPR, DR_STEP (dr), niters); 1734 1.1 mrg if (TREE_CODE (niters) == INTEGER_CST) 1735 1.1 mrg coff = size_binop (PLUS_EXPR, coff, new_offset); 1736 1.1 mrg else 1737 1.1 mrg off = size_binop (PLUS_EXPR, off, new_offset); 1738 1.1 mrg } 1739 1.1 mrg 1740 1.1 mrg /* While data-ref analysis punts on bit offsets it still handles 1741 1.1 mrg bitfield accesses at byte boundaries. Cope with that. Note that 1742 1.1 mrg if the bitfield object also starts at a byte-boundary we can simply 1743 1.1 mrg replicate the COMPONENT_REF, but we have to subtract the component's 1744 1.1 mrg byte-offset from the MEM_REF address first. 1745 1.1 mrg Otherwise we simply build a BIT_FIELD_REF knowing that the bits 1746 1.1 mrg start at offset zero. 
*/ 1747 1.1 mrg if (TREE_CODE (ref) == COMPONENT_REF 1748 1.1 mrg && DECL_BIT_FIELD (TREE_OPERAND (ref, 1))) 1749 1.1 mrg { 1750 1.1 mrg unsigned HOST_WIDE_INT boff; 1751 1.1 mrg tree field = TREE_OPERAND (ref, 1); 1752 1.1 mrg tree offset = component_ref_field_offset (ref); 1753 1.1 mrg ref_type = TREE_TYPE (ref); 1754 1.1 mrg boff = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)); 1755 1.1 mrg /* This can occur in Ada. See the comment in get_bit_range. */ 1756 1.1 mrg if (boff % BITS_PER_UNIT != 0 1757 1.1 mrg || !tree_fits_uhwi_p (offset)) 1758 1.1 mrg { 1759 1.1 mrg ref_code = BIT_FIELD_REF; 1760 1.1 mrg ref_op1 = DECL_SIZE (field); 1761 1.1 mrg ref_op2 = bitsize_zero_node; 1762 1.1 mrg } 1763 1.1 mrg else 1764 1.1 mrg { 1765 1.1 mrg boff >>= LOG2_BITS_PER_UNIT; 1766 1.1 mrg boff += tree_to_uhwi (offset); 1767 1.1 mrg coff = size_binop (MINUS_EXPR, coff, ssize_int (boff)); 1768 1.1 mrg ref_code = COMPONENT_REF; 1769 1.1 mrg ref_op1 = field; 1770 1.1 mrg ref_op2 = TREE_OPERAND (ref, 2); 1771 1.1 mrg ref = TREE_OPERAND (ref, 0); 1772 1.1 mrg } 1773 1.1 mrg } 1774 1.1 mrg /* We may not associate the constant offset across the pointer plus 1775 1.1 mrg expression because that might form a pointer to before the object 1776 1.1 mrg then. 
But for some cases we can retain that to allow tree_could_trap_p 1777 1.1 mrg to return false - see gcc.dg/tree-ssa/predcom-1.c */ 1778 1.1 mrg tree addr, alias_ptr; 1779 1.1 mrg if (integer_zerop (off)) 1780 1.1 mrg { 1781 1.1 mrg alias_ptr = fold_convert (reference_alias_ptr_type (ref), coff); 1782 1.1 mrg addr = DR_BASE_ADDRESS (dr); 1783 1.1 mrg } 1784 1.1 mrg else 1785 1.1 mrg { 1786 1.1 mrg alias_ptr = build_zero_cst (reference_alias_ptr_type (ref)); 1787 1.1 mrg off = size_binop (PLUS_EXPR, off, coff); 1788 1.1 mrg addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off); 1789 1.1 mrg } 1790 1.1 mrg addr = force_gimple_operand_1 (unshare_expr (addr), stmts, 1791 1.1 mrg is_gimple_mem_ref_addr, NULL_TREE); 1792 1.1 mrg tree type = build_aligned_type (TREE_TYPE (ref), 1793 1.1 mrg get_object_alignment (ref)); 1794 1.1 mrg ref = build2 (MEM_REF, type, addr, alias_ptr); 1795 1.1 mrg if (ref_type) 1796 1.1 mrg ref = build3 (ref_code, ref_type, ref, ref_op1, ref_op2); 1797 1.1 mrg return ref; 1798 1.1 mrg } 1799 1.1 mrg 1800 1.1 mrg /* Get the initialization expression for the INDEX-th temporary variable 1801 1.1 mrg of CHAIN. */ 1802 1.1 mrg 1803 1.1 mrg static tree 1804 1.1 mrg get_init_expr (chain_p chain, unsigned index) 1805 1.1 mrg { 1806 1.1 mrg if (chain->type == CT_COMBINATION) 1807 1.1 mrg { 1808 1.1 mrg tree e1 = get_init_expr (chain->ch1, index); 1809 1.1 mrg tree e2 = get_init_expr (chain->ch2, index); 1810 1.1 mrg 1811 1.1 mrg return fold_build2 (chain->op, chain->rslt_type, e1, e2); 1812 1.1 mrg } 1813 1.1 mrg else 1814 1.1 mrg return chain->inits[index]; 1815 1.1 mrg } 1816 1.1 mrg 1817 1.1 mrg /* Returns a new temporary variable used for the I-th variable carrying 1818 1.1 mrg value of REF. The variable's uid is marked in TMP_VARS. 
*/ 1819 1.1 mrg 1820 1.1 mrg static tree 1821 1.1 mrg predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars) 1822 1.1 mrg { 1823 1.1 mrg tree type = TREE_TYPE (ref); 1824 1.1 mrg /* We never access the components of the temporary variable in predictive 1825 1.1 mrg commoning. */ 1826 1.1 mrg tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i)); 1827 1.1 mrg bitmap_set_bit (tmp_vars, DECL_UID (var)); 1828 1.1 mrg return var; 1829 1.1 mrg } 1830 1.1 mrg 1831 1.1 mrg /* Creates the variables for CHAIN, as well as phi nodes for them and 1832 1.1 mrg initialization on entry to LOOP. Uids of the newly created 1833 1.1 mrg temporary variables are marked in TMP_VARS. */ 1834 1.1 mrg 1835 1.1 mrg static void 1836 1.1 mrg initialize_root_vars (class loop *loop, chain_p chain, bitmap tmp_vars) 1837 1.1 mrg { 1838 1.1 mrg unsigned i; 1839 1.1 mrg unsigned n = chain->length; 1840 1.1 mrg dref root = get_chain_root (chain); 1841 1.1 mrg bool reuse_first = !chain->has_max_use_after; 1842 1.1 mrg tree ref, init, var, next; 1843 1.1 mrg gphi *phi; 1844 1.1 mrg gimple_seq stmts; 1845 1.1 mrg edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop); 1846 1.1 mrg 1847 1.1 mrg /* If N == 0, then all the references are within the single iteration. And 1848 1.1 mrg since this is an nonempty chain, reuse_first cannot be true. */ 1849 1.1 mrg gcc_assert (n > 0 || !reuse_first); 1850 1.1 mrg 1851 1.1 mrg chain->vars.create (n + 1); 1852 1.1 mrg 1853 1.1 mrg if (chain->type == CT_COMBINATION) 1854 1.1 mrg ref = gimple_assign_lhs (root->stmt); 1855 1.1 mrg else 1856 1.1 mrg ref = DR_REF (root->ref); 1857 1.1 mrg 1858 1.1 mrg for (i = 0; i < n + (reuse_first ? 
0 : 1); i++) 1859 1.1 mrg { 1860 1.1 mrg var = predcom_tmp_var (ref, i, tmp_vars); 1861 1.1 mrg chain->vars.quick_push (var); 1862 1.1 mrg } 1863 1.1 mrg if (reuse_first) 1864 1.1 mrg chain->vars.quick_push (chain->vars[0]); 1865 1.1 mrg 1866 1.1 mrg FOR_EACH_VEC_ELT (chain->vars, i, var) 1867 1.1 mrg chain->vars[i] = make_ssa_name (var); 1868 1.1 mrg 1869 1.1 mrg for (i = 0; i < n; i++) 1870 1.1 mrg { 1871 1.1 mrg var = chain->vars[i]; 1872 1.1 mrg next = chain->vars[i + 1]; 1873 1.1 mrg init = get_init_expr (chain, i); 1874 1.1 mrg 1875 1.1 mrg init = force_gimple_operand (init, &stmts, true, NULL_TREE); 1876 1.1 mrg if (stmts) 1877 1.1 mrg gsi_insert_seq_on_edge_immediate (entry, stmts); 1878 1.1 mrg 1879 1.1 mrg phi = create_phi_node (var, loop->header); 1880 1.1 mrg add_phi_arg (phi, init, entry, UNKNOWN_LOCATION); 1881 1.1 mrg add_phi_arg (phi, next, latch, UNKNOWN_LOCATION); 1882 1.1 mrg } 1883 1.1 mrg } 1884 1.1 mrg 1885 1.1 mrg /* For inter-iteration store elimination CHAIN in LOOP, returns true if 1886 1.1 mrg all stores to be eliminated store loop invariant values into memory. 1887 1.1 mrg In this case, we can use these invariant values directly after LOOP. */ 1888 1.1 mrg 1889 1.1 mrg static bool 1890 1.1 mrg is_inv_store_elimination_chain (class loop *loop, chain_p chain) 1891 1.1 mrg { 1892 1.1 mrg if (chain->length == 0 || chain->type != CT_STORE_STORE) 1893 1.1 mrg return false; 1894 1.1 mrg 1895 1.1 mrg gcc_assert (!chain->has_max_use_after); 1896 1.1 mrg 1897 1.1 mrg /* If loop iterates for unknown times or fewer times than chain->length, 1898 1.1 mrg we still need to setup root variable and propagate it with PHI node. */ 1899 1.1 mrg tree niters = number_of_latch_executions (loop); 1900 1.1 mrg if (TREE_CODE (niters) != INTEGER_CST 1901 1.1 mrg || wi::leu_p (wi::to_wide (niters), chain->length)) 1902 1.1 mrg return false; 1903 1.1 mrg 1904 1.1 mrg /* Check stores in chain for elimination if they only store loop invariant 1905 1.1 mrg values. 
*/ 1906 1.1 mrg for (unsigned i = 0; i < chain->length; i++) 1907 1.1 mrg { 1908 1.1 mrg dref a = get_chain_last_write_at (chain, i); 1909 1.1 mrg if (a == NULL) 1910 1.1 mrg continue; 1911 1.1 mrg 1912 1.1 mrg gimple *def_stmt, *stmt = a->stmt; 1913 1.1 mrg if (!gimple_assign_single_p (stmt)) 1914 1.1 mrg return false; 1915 1.1 mrg 1916 1.1 mrg tree val = gimple_assign_rhs1 (stmt); 1917 1.1 mrg if (TREE_CLOBBER_P (val)) 1918 1.1 mrg return false; 1919 1.1 mrg 1920 1.1 mrg if (CONSTANT_CLASS_P (val)) 1921 1.1 mrg continue; 1922 1.1 mrg 1923 1.1 mrg if (TREE_CODE (val) != SSA_NAME) 1924 1.1 mrg return false; 1925 1.1 mrg 1926 1.1 mrg def_stmt = SSA_NAME_DEF_STMT (val); 1927 1.1 mrg if (gimple_nop_p (def_stmt)) 1928 1.1 mrg continue; 1929 1.1 mrg 1930 1.1 mrg if (flow_bb_inside_loop_p (loop, gimple_bb (def_stmt))) 1931 1.1 mrg return false; 1932 1.1 mrg } 1933 1.1 mrg return true; 1934 1.1 mrg } 1935 1.1 mrg 1936 1.1 mrg /* Creates root variables for store elimination CHAIN in which stores for 1937 1.1 mrg elimination only store loop invariant values. In this case, we neither 1938 1.1 mrg need to load root variables before loop nor propagate it with PHI nodes. */ 1939 1.1 mrg 1940 1.1 mrg static void 1941 1.1 mrg initialize_root_vars_store_elim_1 (chain_p chain) 1942 1.1 mrg { 1943 1.1 mrg tree var; 1944 1.1 mrg unsigned i, n = chain->length; 1945 1.1 mrg 1946 1.1 mrg chain->vars.create (n); 1947 1.1 mrg chain->vars.safe_grow_cleared (n, true); 1948 1.1 mrg 1949 1.1 mrg /* Initialize root value for eliminated stores at each distance. */ 1950 1.1 mrg for (i = 0; i < n; i++) 1951 1.1 mrg { 1952 1.1 mrg dref a = get_chain_last_write_at (chain, i); 1953 1.1 mrg if (a == NULL) 1954 1.1 mrg continue; 1955 1.1 mrg 1956 1.1 mrg var = gimple_assign_rhs1 (a->stmt); 1957 1.1 mrg chain->vars[a->distance] = var; 1958 1.1 mrg } 1959 1.1 mrg 1960 1.1 mrg /* We don't propagate values with PHI nodes, so manually propagate value 1961 1.1 mrg to bubble positions. 
*/ 1962 1.1 mrg var = chain->vars[0]; 1963 1.1 mrg for (i = 1; i < n; i++) 1964 1.1 mrg { 1965 1.1 mrg if (chain->vars[i] != NULL_TREE) 1966 1.1 mrg { 1967 1.1 mrg var = chain->vars[i]; 1968 1.1 mrg continue; 1969 1.1 mrg } 1970 1.1 mrg chain->vars[i] = var; 1971 1.1 mrg } 1972 1.1 mrg 1973 1.1 mrg /* Revert the vector. */ 1974 1.1 mrg for (i = 0; i < n / 2; i++) 1975 1.1 mrg std::swap (chain->vars[i], chain->vars[n - i - 1]); 1976 1.1 mrg } 1977 1.1 mrg 1978 1.1 mrg /* Creates root variables for store elimination CHAIN in which stores for 1979 1.1 mrg elimination store loop variant values. In this case, we may need to 1980 1.1 mrg load root variables before LOOP and propagate it with PHI nodes. Uids 1981 1.1 mrg of the newly created root variables are marked in TMP_VARS. */ 1982 1.1 mrg 1983 1.1 mrg static void 1984 1.1 mrg initialize_root_vars_store_elim_2 (class loop *loop, 1985 1.1 mrg chain_p chain, bitmap tmp_vars) 1986 1.1 mrg { 1987 1.1 mrg unsigned i, n = chain->length; 1988 1.1 mrg tree ref, init, var, next, val, phi_result; 1989 1.1 mrg gimple *stmt; 1990 1.1 mrg gimple_seq stmts; 1991 1.1 mrg 1992 1.1 mrg chain->vars.create (n); 1993 1.1 mrg 1994 1.1 mrg ref = DR_REF (get_chain_root (chain)->ref); 1995 1.1 mrg for (i = 0; i < n; i++) 1996 1.1 mrg { 1997 1.1 mrg var = predcom_tmp_var (ref, i, tmp_vars); 1998 1.1 mrg chain->vars.quick_push (var); 1999 1.1 mrg } 2000 1.1 mrg 2001 1.1 mrg FOR_EACH_VEC_ELT (chain->vars, i, var) 2002 1.1 mrg chain->vars[i] = make_ssa_name (var); 2003 1.1 mrg 2004 1.1 mrg /* Root values are either rhs operand of stores to be eliminated, or 2005 1.1 mrg loaded from memory before loop. 
*/ 2006 1.1 mrg auto_vec<tree> vtemps; 2007 1.1 mrg vtemps.safe_grow_cleared (n, true); 2008 1.1 mrg for (i = 0; i < n; i++) 2009 1.1 mrg { 2010 1.1 mrg init = get_init_expr (chain, i); 2011 1.1 mrg if (init == NULL_TREE) 2012 1.1 mrg { 2013 1.1 mrg /* Root value is rhs operand of the store to be eliminated if 2014 1.1 mrg it isn't loaded from memory before loop. */ 2015 1.1 mrg dref a = get_chain_last_write_at (chain, i); 2016 1.1 mrg val = gimple_assign_rhs1 (a->stmt); 2017 1.1 mrg if (TREE_CLOBBER_P (val)) 2018 1.1 mrg { 2019 1.1 mrg val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (var)); 2020 1.1 mrg gimple_assign_set_rhs1 (a->stmt, val); 2021 1.1 mrg } 2022 1.1 mrg 2023 1.1 mrg vtemps[n - i - 1] = val; 2024 1.1 mrg } 2025 1.1 mrg else 2026 1.1 mrg { 2027 1.1 mrg edge latch = loop_latch_edge (loop); 2028 1.1 mrg edge entry = loop_preheader_edge (loop); 2029 1.1 mrg 2030 1.1 mrg /* Root value is loaded from memory before loop, we also need 2031 1.1 mrg to add PHI nodes to propagate the value across iterations. */ 2032 1.1 mrg init = force_gimple_operand (init, &stmts, true, NULL_TREE); 2033 1.1 mrg if (stmts) 2034 1.1 mrg gsi_insert_seq_on_edge_immediate (entry, stmts); 2035 1.1 mrg 2036 1.1 mrg next = chain->vars[n - i]; 2037 1.1 mrg phi_result = copy_ssa_name (next); 2038 1.1 mrg gphi *phi = create_phi_node (phi_result, loop->header); 2039 1.1 mrg add_phi_arg (phi, init, entry, UNKNOWN_LOCATION); 2040 1.1 mrg add_phi_arg (phi, next, latch, UNKNOWN_LOCATION); 2041 1.1 mrg vtemps[n - i - 1] = phi_result; 2042 1.1 mrg } 2043 1.1 mrg } 2044 1.1 mrg 2045 1.1 mrg /* Find the insertion position. 
*/ 2046 1.1 mrg dref last = get_chain_root (chain); 2047 1.1 mrg for (i = 0; i < chain->refs.length (); i++) 2048 1.1 mrg { 2049 1.1 mrg if (chain->refs[i]->pos > last->pos) 2050 1.1 mrg last = chain->refs[i]; 2051 1.1 mrg } 2052 1.1 mrg 2053 1.1 mrg gimple_stmt_iterator gsi = gsi_for_stmt (last->stmt); 2054 1.1 mrg 2055 1.1 mrg /* Insert statements copying root value to root variable. */ 2056 1.1 mrg for (i = 0; i < n; i++) 2057 1.1 mrg { 2058 1.1 mrg var = chain->vars[i]; 2059 1.1 mrg val = vtemps[i]; 2060 1.1 mrg stmt = gimple_build_assign (var, val); 2061 1.1 mrg gsi_insert_after (&gsi, stmt, GSI_NEW_STMT); 2062 1.1 mrg } 2063 1.1 mrg } 2064 1.1 mrg 2065 1.1 mrg /* Generates stores for CHAIN's eliminated stores in LOOP's last 2066 1.1 mrg (CHAIN->length - 1) iterations. */ 2067 1.1 mrg 2068 1.1 mrg static void 2069 1.1 mrg finalize_eliminated_stores (class loop *loop, chain_p chain) 2070 1.1 mrg { 2071 1.1 mrg unsigned i, n = chain->length; 2072 1.1 mrg 2073 1.1 mrg for (i = 0; i < n; i++) 2074 1.1 mrg { 2075 1.1 mrg tree var = chain->vars[i]; 2076 1.1 mrg tree fini = chain->finis[n - i - 1]; 2077 1.1 mrg gimple *stmt = gimple_build_assign (fini, var); 2078 1.1 mrg 2079 1.1 mrg gimple_seq_add_stmt_without_update (&chain->fini_seq, stmt); 2080 1.1 mrg } 2081 1.1 mrg 2082 1.1 mrg if (chain->fini_seq) 2083 1.1 mrg { 2084 1.1 mrg gsi_insert_seq_on_edge_immediate (single_exit (loop), chain->fini_seq); 2085 1.1 mrg chain->fini_seq = NULL; 2086 1.1 mrg } 2087 1.1 mrg } 2088 1.1 mrg 2089 1.1 mrg /* Initializes a variable for load motion for ROOT and prepares phi nodes and 2090 1.1 mrg initialization on entry to LOOP if necessary. The ssa name for the variable 2091 1.1 mrg is stored in VARS. If WRITTEN is true, also a phi node to copy its value 2092 1.1 mrg around the loop is created. Uid of the newly created temporary variable 2093 1.1 mrg is marked in TMP_VARS. INITS is the list containing the (single) 2094 1.1 mrg initializer. 
*/ 2095 1.1 mrg 2096 1.1 mrg static void 2097 1.1 mrg initialize_root_vars_lm (class loop *loop, dref root, bool written, 2098 1.1 mrg vec<tree> *vars, const vec<tree> &inits, 2099 1.1 mrg bitmap tmp_vars) 2100 1.1 mrg { 2101 1.1 mrg unsigned i; 2102 1.1 mrg tree ref = DR_REF (root->ref), init, var, next; 2103 1.1 mrg gimple_seq stmts; 2104 1.1 mrg gphi *phi; 2105 1.1 mrg edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop); 2106 1.1 mrg 2107 1.1 mrg /* Find the initializer for the variable, and check that it cannot 2108 1.1 mrg trap. */ 2109 1.1 mrg init = inits[0]; 2110 1.1 mrg 2111 1.1 mrg vars->create (written ? 2 : 1); 2112 1.1 mrg var = predcom_tmp_var (ref, 0, tmp_vars); 2113 1.1 mrg vars->quick_push (var); 2114 1.1 mrg if (written) 2115 1.1 mrg vars->quick_push ((*vars)[0]); 2116 1.1 mrg 2117 1.1 mrg FOR_EACH_VEC_ELT (*vars, i, var) 2118 1.1 mrg (*vars)[i] = make_ssa_name (var); 2119 1.1 mrg 2120 1.1 mrg var = (*vars)[0]; 2121 1.1 mrg 2122 1.1 mrg init = force_gimple_operand (init, &stmts, written, NULL_TREE); 2123 1.1 mrg if (stmts) 2124 1.1 mrg gsi_insert_seq_on_edge_immediate (entry, stmts); 2125 1.1 mrg 2126 1.1 mrg if (written) 2127 1.1 mrg { 2128 1.1 mrg next = (*vars)[1]; 2129 1.1 mrg phi = create_phi_node (var, loop->header); 2130 1.1 mrg add_phi_arg (phi, init, entry, UNKNOWN_LOCATION); 2131 1.1 mrg add_phi_arg (phi, next, latch, UNKNOWN_LOCATION); 2132 1.1 mrg } 2133 1.1 mrg else 2134 1.1 mrg { 2135 1.1 mrg gassign *init_stmt = gimple_build_assign (var, init); 2136 1.1 mrg gsi_insert_on_edge_immediate (entry, init_stmt); 2137 1.1 mrg } 2138 1.1 mrg } 2139 1.1 mrg 2140 1.1 mrg 2141 1.1 mrg /* Execute load motion for references in chain CHAIN. Uids of the newly 2142 1.1 mrg created temporary variables are marked in TMP_VARS. 
*/ 2143 1.1 mrg 2144 1.1 mrg static void 2145 1.1 mrg execute_load_motion (class loop *loop, chain_p chain, bitmap tmp_vars) 2146 1.1 mrg { 2147 1.1 mrg auto_vec<tree> vars; 2148 1.1 mrg dref a; 2149 1.1 mrg unsigned n_writes = 0, ridx, i; 2150 1.1 mrg tree var; 2151 1.1 mrg 2152 1.1 mrg gcc_assert (chain->type == CT_INVARIANT); 2153 1.1 mrg gcc_assert (!chain->combined); 2154 1.1 mrg FOR_EACH_VEC_ELT (chain->refs, i, a) 2155 1.1 mrg if (DR_IS_WRITE (a->ref)) 2156 1.1 mrg n_writes++; 2157 1.1 mrg 2158 1.1 mrg /* If there are no reads in the loop, there is nothing to do. */ 2159 1.1 mrg if (n_writes == chain->refs.length ()) 2160 1.1 mrg return; 2161 1.1 mrg 2162 1.1 mrg initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0, 2163 1.1 mrg &vars, chain->inits, tmp_vars); 2164 1.1 mrg 2165 1.1 mrg ridx = 0; 2166 1.1 mrg FOR_EACH_VEC_ELT (chain->refs, i, a) 2167 1.1 mrg { 2168 1.1 mrg bool is_read = DR_IS_READ (a->ref); 2169 1.1 mrg 2170 1.1 mrg if (DR_IS_WRITE (a->ref)) 2171 1.1 mrg { 2172 1.1 mrg n_writes--; 2173 1.1 mrg if (n_writes) 2174 1.1 mrg { 2175 1.1 mrg var = vars[0]; 2176 1.1 mrg var = make_ssa_name (SSA_NAME_VAR (var)); 2177 1.1 mrg vars[0] = var; 2178 1.1 mrg } 2179 1.1 mrg else 2180 1.1 mrg ridx = 1; 2181 1.1 mrg } 2182 1.1 mrg 2183 1.1 mrg replace_ref_with (a->stmt, vars[ridx], 2184 1.1 mrg !is_read, !is_read); 2185 1.1 mrg } 2186 1.1 mrg } 2187 1.1 mrg 2188 1.1 mrg /* Returns the single statement in that NAME is used, excepting 2189 1.1 mrg the looparound phi nodes contained in one of the chains. If there is no 2190 1.1 mrg such statement, or more statements, NULL is returned. 
*/ 2191 1.1 mrg 2192 1.1 mrg gimple * 2193 1.1 mrg pcom_worker::single_nonlooparound_use (tree name) 2194 1.1 mrg { 2195 1.1 mrg use_operand_p use; 2196 1.1 mrg imm_use_iterator it; 2197 1.1 mrg gimple *stmt, *ret = NULL; 2198 1.1 mrg 2199 1.1 mrg FOR_EACH_IMM_USE_FAST (use, it, name) 2200 1.1 mrg { 2201 1.1 mrg stmt = USE_STMT (use); 2202 1.1 mrg 2203 1.1 mrg if (gimple_code (stmt) == GIMPLE_PHI) 2204 1.1 mrg { 2205 1.1 mrg /* Ignore uses in looparound phi nodes. Uses in other phi nodes 2206 1.1 mrg could not be processed anyway, so just fail for them. */ 2207 1.1 mrg if (bitmap_bit_p (m_looparound_phis, 2208 1.1 mrg SSA_NAME_VERSION (PHI_RESULT (stmt)))) 2209 1.1 mrg continue; 2210 1.1 mrg 2211 1.1 mrg return NULL; 2212 1.1 mrg } 2213 1.1 mrg else if (is_gimple_debug (stmt)) 2214 1.1 mrg continue; 2215 1.1 mrg else if (ret != NULL) 2216 1.1 mrg return NULL; 2217 1.1 mrg else 2218 1.1 mrg ret = stmt; 2219 1.1 mrg } 2220 1.1 mrg 2221 1.1 mrg return ret; 2222 1.1 mrg } 2223 1.1 mrg 2224 1.1 mrg /* Remove statement STMT, as well as the chain of assignments in that it is 2225 1.1 mrg used. 
*/ 2226 1.1 mrg 2227 1.1 mrg void 2228 1.1 mrg pcom_worker::remove_stmt (gimple *stmt) 2229 1.1 mrg { 2230 1.1 mrg tree name; 2231 1.1 mrg gimple *next; 2232 1.1 mrg gimple_stmt_iterator psi; 2233 1.1 mrg 2234 1.1 mrg if (gimple_code (stmt) == GIMPLE_PHI) 2235 1.1 mrg { 2236 1.1 mrg name = PHI_RESULT (stmt); 2237 1.1 mrg next = single_nonlooparound_use (name); 2238 1.1 mrg reset_debug_uses (stmt); 2239 1.1 mrg psi = gsi_for_stmt (stmt); 2240 1.1 mrg remove_phi_node (&psi, true); 2241 1.1 mrg 2242 1.1 mrg if (!next 2243 1.1 mrg || !gimple_assign_ssa_name_copy_p (next) 2244 1.1 mrg || gimple_assign_rhs1 (next) != name) 2245 1.1 mrg return; 2246 1.1 mrg 2247 1.1 mrg stmt = next; 2248 1.1 mrg } 2249 1.1 mrg 2250 1.1 mrg while (1) 2251 1.1 mrg { 2252 1.1 mrg gimple_stmt_iterator bsi; 2253 1.1 mrg 2254 1.1 mrg bsi = gsi_for_stmt (stmt); 2255 1.1 mrg 2256 1.1 mrg name = gimple_assign_lhs (stmt); 2257 1.1 mrg if (TREE_CODE (name) == SSA_NAME) 2258 1.1 mrg { 2259 1.1 mrg next = single_nonlooparound_use (name); 2260 1.1 mrg reset_debug_uses (stmt); 2261 1.1 mrg } 2262 1.1 mrg else 2263 1.1 mrg { 2264 1.1 mrg /* This is a store to be eliminated. */ 2265 1.1 mrg gcc_assert (gimple_vdef (stmt) != NULL); 2266 1.1 mrg next = NULL; 2267 1.1 mrg } 2268 1.1 mrg 2269 1.1 mrg unlink_stmt_vdef (stmt); 2270 1.1 mrg gsi_remove (&bsi, true); 2271 1.1 mrg release_defs (stmt); 2272 1.1 mrg 2273 1.1 mrg if (!next 2274 1.1 mrg || !gimple_assign_ssa_name_copy_p (next) 2275 1.1 mrg || gimple_assign_rhs1 (next) != name) 2276 1.1 mrg return; 2277 1.1 mrg 2278 1.1 mrg stmt = next; 2279 1.1 mrg } 2280 1.1 mrg } 2281 1.1 mrg 2282 1.1 mrg /* Perform the predictive commoning optimization for a chain CHAIN. 
2283 1.1 mrg Uids of the newly created temporary variables are marked in TMP_VARS.*/ 2284 1.1 mrg 2285 1.1 mrg void 2286 1.1 mrg pcom_worker::execute_pred_commoning_chain (chain_p chain, 2287 1.1 mrg bitmap tmp_vars) 2288 1.1 mrg { 2289 1.1 mrg unsigned i; 2290 1.1 mrg dref a; 2291 1.1 mrg tree var; 2292 1.1 mrg bool in_lhs; 2293 1.1 mrg 2294 1.1 mrg if (chain->combined) 2295 1.1 mrg { 2296 1.1 mrg /* For combined chains, just remove the statements that are used to 2297 1.1 mrg compute the values of the expression (except for the root one). 2298 1.1 mrg We delay this until after all chains are processed. */ 2299 1.1 mrg } 2300 1.1 mrg else if (chain->type == CT_STORE_STORE) 2301 1.1 mrg { 2302 1.1 mrg if (chain->length > 0) 2303 1.1 mrg { 2304 1.1 mrg if (chain->inv_store_elimination) 2305 1.1 mrg { 2306 1.1 mrg /* If dead stores in this chain only store loop invariant 2307 1.1 mrg values, we can simply record the invariant value and use 2308 1.1 mrg it directly after loop. */ 2309 1.1 mrg initialize_root_vars_store_elim_1 (chain); 2310 1.1 mrg } 2311 1.1 mrg else 2312 1.1 mrg { 2313 1.1 mrg /* If dead stores in this chain store loop variant values, 2314 1.1 mrg we need to set up the variables by loading from memory 2315 1.1 mrg before loop and propagating it with PHI nodes. */ 2316 1.1 mrg initialize_root_vars_store_elim_2 (m_loop, chain, tmp_vars); 2317 1.1 mrg } 2318 1.1 mrg 2319 1.1 mrg /* For inter-iteration store elimination chain, stores at each 2320 1.1 mrg distance in loop's last (chain->length - 1) iterations can't 2321 1.1 mrg be eliminated, because there is no following killing store. 2322 1.1 mrg We need to generate these stores after loop. */ 2323 1.1 mrg finalize_eliminated_stores (m_loop, chain); 2324 1.1 mrg } 2325 1.1 mrg 2326 1.1 mrg bool last_store_p = true; 2327 1.1 mrg for (i = chain->refs.length (); i > 0; i--) 2328 1.1 mrg { 2329 1.1 mrg a = chain->refs[i - 1]; 2330 1.1 mrg /* Preserve the last store of the chain. 
Eliminate other stores 2331 1.1 mrg which are killed by the last one. */ 2332 1.1 mrg if (DR_IS_WRITE (a->ref)) 2333 1.1 mrg { 2334 1.1 mrg if (last_store_p) 2335 1.1 mrg last_store_p = false; 2336 1.1 mrg else 2337 1.1 mrg remove_stmt (a->stmt); 2338 1.1 mrg 2339 1.1 mrg continue; 2340 1.1 mrg } 2341 1.1 mrg 2342 1.1 mrg /* Any load in Store-Store chain must be dominated by a previous 2343 1.1 mrg store, we replace the load reference with rhs of the store. */ 2344 1.1 mrg dref b = get_chain_last_write_before_load (chain, i - 1); 2345 1.1 mrg gcc_assert (b != NULL); 2346 1.1 mrg var = gimple_assign_rhs1 (b->stmt); 2347 1.1 mrg replace_ref_with (a->stmt, var, false, false); 2348 1.1 mrg } 2349 1.1 mrg } 2350 1.1 mrg else 2351 1.1 mrg { 2352 1.1 mrg /* For non-combined chains, set up the variables that hold its value. */ 2353 1.1 mrg initialize_root_vars (m_loop, chain, tmp_vars); 2354 1.1 mrg a = get_chain_root (chain); 2355 1.1 mrg in_lhs = (chain->type == CT_STORE_LOAD 2356 1.1 mrg || chain->type == CT_COMBINATION); 2357 1.1 mrg replace_ref_with (a->stmt, chain->vars[chain->length], true, in_lhs); 2358 1.1 mrg 2359 1.1 mrg /* Replace the uses of the original references by these variables. */ 2360 1.1 mrg for (i = 1; chain->refs.iterate (i, &a); i++) 2361 1.1 mrg { 2362 1.1 mrg var = chain->vars[chain->length - a->distance]; 2363 1.1 mrg replace_ref_with (a->stmt, var, false, false); 2364 1.1 mrg } 2365 1.1 mrg } 2366 1.1 mrg } 2367 1.1 mrg 2368 1.1 mrg /* Determines the unroll factor necessary to remove as many temporary variable 2369 1.1 mrg copies as possible. CHAINS is the list of chains that will be 2370 1.1 mrg optimized. 
*/ 2371 1.1 mrg 2372 1.1 mrg static unsigned 2373 1.1 mrg determine_unroll_factor (const vec<chain_p> &chains) 2374 1.1 mrg { 2375 1.1 mrg chain_p chain; 2376 1.1 mrg unsigned factor = 1, af, nfactor, i; 2377 1.1 mrg unsigned max = param_max_unroll_times; 2378 1.1 mrg 2379 1.1 mrg FOR_EACH_VEC_ELT (chains, i, chain) 2380 1.1 mrg { 2381 1.1 mrg if (chain->type == CT_INVARIANT) 2382 1.1 mrg continue; 2383 1.1 mrg /* For now we can't handle unrolling when eliminating stores. */ 2384 1.1 mrg else if (chain->type == CT_STORE_STORE) 2385 1.1 mrg return 1; 2386 1.1 mrg 2387 1.1 mrg if (chain->combined) 2388 1.1 mrg { 2389 1.1 mrg /* For combined chains, we can't handle unrolling if we replace 2390 1.1 mrg looparound PHIs. */ 2391 1.1 mrg dref a; 2392 1.1 mrg unsigned j; 2393 1.1 mrg for (j = 1; chain->refs.iterate (j, &a); j++) 2394 1.1 mrg if (gimple_code (a->stmt) == GIMPLE_PHI) 2395 1.1 mrg return 1; 2396 1.1 mrg continue; 2397 1.1 mrg } 2398 1.1 mrg 2399 1.1 mrg /* The best unroll factor for this chain is equal to the number of 2400 1.1 mrg temporary variables that we create for it. */ 2401 1.1 mrg af = chain->length; 2402 1.1 mrg if (chain->has_max_use_after) 2403 1.1 mrg af++; 2404 1.1 mrg 2405 1.1 mrg nfactor = factor * af / gcd (factor, af); 2406 1.1 mrg if (nfactor <= max) 2407 1.1 mrg factor = nfactor; 2408 1.1 mrg } 2409 1.1 mrg 2410 1.1 mrg return factor; 2411 1.1 mrg } 2412 1.1 mrg 2413 1.1 mrg /* Perform the predictive commoning optimization for chains. 2414 1.1 mrg Uids of the newly created temporary variables are marked in TMP_VARS. 
*/ 2415 1.1 mrg 2416 1.1 mrg void 2417 1.1 mrg pcom_worker::execute_pred_commoning (bitmap tmp_vars) 2418 1.1 mrg { 2419 1.1 mrg chain_p chain; 2420 1.1 mrg unsigned i; 2421 1.1 mrg 2422 1.1 mrg FOR_EACH_VEC_ELT (m_chains, i, chain) 2423 1.1 mrg { 2424 1.1 mrg if (chain->type == CT_INVARIANT) 2425 1.1 mrg execute_load_motion (m_loop, chain, tmp_vars); 2426 1.1 mrg else 2427 1.1 mrg execute_pred_commoning_chain (chain, tmp_vars); 2428 1.1 mrg } 2429 1.1 mrg 2430 1.1 mrg FOR_EACH_VEC_ELT (m_chains, i, chain) 2431 1.1 mrg { 2432 1.1 mrg if (chain->type == CT_INVARIANT) 2433 1.1 mrg ; 2434 1.1 mrg else if (chain->combined) 2435 1.1 mrg { 2436 1.1 mrg /* For combined chains, just remove the statements that are used to 2437 1.1 mrg compute the values of the expression (except for the root one). */ 2438 1.1 mrg dref a; 2439 1.1 mrg unsigned j; 2440 1.1 mrg for (j = 1; chain->refs.iterate (j, &a); j++) 2441 1.1 mrg remove_stmt (a->stmt); 2442 1.1 mrg } 2443 1.1 mrg } 2444 1.1 mrg } 2445 1.1 mrg 2446 1.1 mrg /* For each reference in CHAINS, if its defining statement is 2447 1.1 mrg phi node, record the ssa name that is defined by it. */ 2448 1.1 mrg 2449 1.1 mrg static void 2450 1.1 mrg replace_phis_by_defined_names (vec<chain_p> &chains) 2451 1.1 mrg { 2452 1.1 mrg chain_p chain; 2453 1.1 mrg dref a; 2454 1.1 mrg unsigned i, j; 2455 1.1 mrg 2456 1.1 mrg FOR_EACH_VEC_ELT (chains, i, chain) 2457 1.1 mrg FOR_EACH_VEC_ELT (chain->refs, j, a) 2458 1.1 mrg { 2459 1.1 mrg if (gimple_code (a->stmt) == GIMPLE_PHI) 2460 1.1 mrg { 2461 1.1 mrg a->name_defined_by_phi = PHI_RESULT (a->stmt); 2462 1.1 mrg a->stmt = NULL; 2463 1.1 mrg } 2464 1.1 mrg } 2465 1.1 mrg } 2466 1.1 mrg 2467 1.1 mrg /* For each reference in CHAINS, if name_defined_by_phi is not 2468 1.1 mrg NULL, use it to set the stmt field. 
*/ 2469 1.1 mrg 2470 1.1 mrg static void 2471 1.1 mrg replace_names_by_phis (vec<chain_p> chains) 2472 1.1 mrg { 2473 1.1 mrg chain_p chain; 2474 1.1 mrg dref a; 2475 1.1 mrg unsigned i, j; 2476 1.1 mrg 2477 1.1 mrg FOR_EACH_VEC_ELT (chains, i, chain) 2478 1.1 mrg FOR_EACH_VEC_ELT (chain->refs, j, a) 2479 1.1 mrg if (a->stmt == NULL) 2480 1.1 mrg { 2481 1.1 mrg a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi); 2482 1.1 mrg gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI); 2483 1.1 mrg a->name_defined_by_phi = NULL_TREE; 2484 1.1 mrg } 2485 1.1 mrg } 2486 1.1 mrg 2487 1.1 mrg /* Wrapper over execute_pred_commoning, to pass it as a callback 2488 1.1 mrg to tree_transform_and_unroll_loop. */ 2489 1.1 mrg 2490 1.1 mrg struct epcc_data 2491 1.1 mrg { 2492 1.1 mrg vec<chain_p> chains; 2493 1.1 mrg bitmap tmp_vars; 2494 1.1 mrg pcom_worker *worker; 2495 1.1 mrg }; 2496 1.1 mrg 2497 1.1 mrg static void 2498 1.1 mrg execute_pred_commoning_cbck (class loop *loop ATTRIBUTE_UNUSED, void *data) 2499 1.1 mrg { 2500 1.1 mrg struct epcc_data *const dta = (struct epcc_data *) data; 2501 1.1 mrg pcom_worker *worker = dta->worker; 2502 1.1 mrg 2503 1.1 mrg /* Restore phi nodes that were replaced by ssa names before 2504 1.1 mrg tree_transform_and_unroll_loop (see detailed description in 2505 1.1 mrg tree_predictive_commoning_loop). */ 2506 1.1 mrg replace_names_by_phis (dta->chains); 2507 1.1 mrg worker->execute_pred_commoning (dta->tmp_vars); 2508 1.1 mrg } 2509 1.1 mrg 2510 1.1 mrg /* Base NAME and all the names in the chain of phi nodes that use it 2511 1.1 mrg on variable VAR. The phi nodes are recognized by being in the copies of 2512 1.1 mrg the header of the LOOP. 
*/ 2513 1.1 mrg 2514 1.1 mrg static void 2515 1.1 mrg base_names_in_chain_on (class loop *loop, tree name, tree var) 2516 1.1 mrg { 2517 1.1 mrg gimple *stmt, *phi; 2518 1.1 mrg imm_use_iterator iter; 2519 1.1 mrg 2520 1.1 mrg replace_ssa_name_symbol (name, var); 2521 1.1 mrg 2522 1.1 mrg while (1) 2523 1.1 mrg { 2524 1.1 mrg phi = NULL; 2525 1.1 mrg FOR_EACH_IMM_USE_STMT (stmt, iter, name) 2526 1.1 mrg { 2527 1.1 mrg if (gimple_code (stmt) == GIMPLE_PHI 2528 1.1 mrg && flow_bb_inside_loop_p (loop, gimple_bb (stmt))) 2529 1.1 mrg { 2530 1.1 mrg phi = stmt; 2531 1.1 mrg break; 2532 1.1 mrg } 2533 1.1 mrg } 2534 1.1 mrg if (!phi) 2535 1.1 mrg return; 2536 1.1 mrg 2537 1.1 mrg name = PHI_RESULT (phi); 2538 1.1 mrg replace_ssa_name_symbol (name, var); 2539 1.1 mrg } 2540 1.1 mrg } 2541 1.1 mrg 2542 1.1 mrg /* Given an unrolled LOOP after predictive commoning, remove the 2543 1.1 mrg register copies arising from phi nodes by changing the base 2544 1.1 mrg variables of SSA names. TMP_VARS is the set of the temporary variables 2545 1.1 mrg for those we want to perform this. */ 2546 1.1 mrg 2547 1.1 mrg static void 2548 1.1 mrg eliminate_temp_copies (class loop *loop, bitmap tmp_vars) 2549 1.1 mrg { 2550 1.1 mrg edge e; 2551 1.1 mrg gphi *phi; 2552 1.1 mrg gimple *stmt; 2553 1.1 mrg tree name, use, var; 2554 1.1 mrg gphi_iterator psi; 2555 1.1 mrg 2556 1.1 mrg e = loop_latch_edge (loop); 2557 1.1 mrg for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi)) 2558 1.1 mrg { 2559 1.1 mrg phi = psi.phi (); 2560 1.1 mrg name = PHI_RESULT (phi); 2561 1.1 mrg var = SSA_NAME_VAR (name); 2562 1.1 mrg if (!var || !bitmap_bit_p (tmp_vars, DECL_UID (var))) 2563 1.1 mrg continue; 2564 1.1 mrg use = PHI_ARG_DEF_FROM_EDGE (phi, e); 2565 1.1 mrg gcc_assert (TREE_CODE (use) == SSA_NAME); 2566 1.1 mrg 2567 1.1 mrg /* Base all the ssa names in the ud and du chain of NAME on VAR. 
*/
      stmt = SSA_NAME_DEF_STMT (use);
      while (gimple_code (stmt) == GIMPLE_PHI
	     /* In case we could not unroll the loop enough to eliminate
		all copies, we may reach the loop header before the defining
		statement (in that case, some register copies will be present
		in loop latch in the final code, corresponding to the newly
		created looparound phi nodes).  */
	     && gimple_bb (stmt) != loop->header)
	{
	  gcc_assert (single_pred_p (gimple_bb (stmt)));
	  use = PHI_ARG_DEF (stmt, 0);
	  stmt = SSA_NAME_DEF_STMT (use);
	}

      base_names_in_chain_on (loop, use, var);
    }
}

/* Returns true if CHAIN is suitable to be combined.  */

static bool
chain_can_be_combined_p (chain_p chain)
{
  return (!chain->combined
	  && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
}

/* Returns the modify statement that uses NAME.  Skips over assignment
   statements, NAME is replaced with the actual name used in the returned
   statement.  */

gimple *
pcom_worker::find_use_stmt (tree *name)
{
  gimple *stmt;
  tree rhs, lhs;

  /* Skip over assignments.
*/
  while (1)
    {
      stmt = single_nonlooparound_use (*name);
      if (!stmt)
	return NULL;

      if (gimple_code (stmt) != GIMPLE_ASSIGN)
	return NULL;

      lhs = gimple_assign_lhs (stmt);
      if (TREE_CODE (lhs) != SSA_NAME)
	return NULL;

      if (gimple_assign_copy_p (stmt))
	{
	  rhs = gimple_assign_rhs1 (stmt);
	  if (rhs != *name)
	    return NULL;

	  *name = lhs;
	}
      else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
	       == GIMPLE_BINARY_RHS)
	return stmt;
      else
	return NULL;
    }
}

/* Returns true if we may perform reassociation for operation CODE in TYPE.  */

static bool
may_reassociate_p (tree type, enum tree_code code)
{
  if (FLOAT_TYPE_P (type)
      && !flag_unsafe_math_optimizations)
    return false;

  return (commutative_tree_code (code)
	  && associative_tree_code (code));
}

/* If the operation used in STMT is associative and commutative, go through the
   tree of the same operations and return its root.  Distance to the root
   is stored in DISTANCE.
*/

gimple *
pcom_worker::find_associative_operation_root (gimple *stmt, unsigned *distance)
{
  tree lhs;
  gimple *next;
  enum tree_code code = gimple_assign_rhs_code (stmt);
  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
  unsigned dist = 0;

  if (!may_reassociate_p (type, code))
    return NULL;

  while (1)
    {
      lhs = gimple_assign_lhs (stmt);
      gcc_assert (TREE_CODE (lhs) == SSA_NAME);

      next = find_use_stmt (&lhs);
      if (!next
	  || gimple_assign_rhs_code (next) != code)
	break;

      stmt = next;
      dist++;
    }

  if (distance)
    *distance = dist;
  return stmt;
}

/* Returns the common statement in which NAME1 and NAME2 have a use.  If there
   is no such statement, returns NULL_TREE.  In case the operation used on
   NAME1 and NAME2 is associative and commutative, returns the root of the
   tree formed by this operation instead of the statement that uses NAME1 or
   NAME2.
*/

gimple *
pcom_worker::find_common_use_stmt (tree *name1, tree *name2)
{
  gimple *stmt1, *stmt2;

  stmt1 = find_use_stmt (name1);
  if (!stmt1)
    return NULL;

  stmt2 = find_use_stmt (name2);
  if (!stmt2)
    return NULL;

  if (stmt1 == stmt2)
    return stmt1;

  stmt1 = find_associative_operation_root (stmt1, NULL);
  if (!stmt1)
    return NULL;
  stmt2 = find_associative_operation_root (stmt2, NULL);
  if (!stmt2)
    return NULL;

  return (stmt1 == stmt2 ? stmt1 : NULL);
}

/* Checks whether R1 and R2 are combined together using CODE, with the result
   in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
   if it is true.  If CODE is ERROR_MARK, set these values instead.  */

bool
pcom_worker::combinable_refs_p (dref r1, dref r2,
				enum tree_code *code, bool *swap,
				tree *rslt_type)
{
  enum tree_code acode;
  bool aswap;
  tree atype;
  tree name1, name2;
  gimple *stmt;

  name1 = name_for_ref (r1);
  name2 = name_for_ref (r2);
  gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);

  stmt = find_common_use_stmt (&name1, &name2);

  if (!stmt
      /* A simple post-dominance check - make sure the combination
	 is executed under the same condition as the references.
*/
      || (gimple_bb (stmt) != gimple_bb (r1->stmt)
	  && gimple_bb (stmt) != gimple_bb (r2->stmt)))
    return false;

  acode = gimple_assign_rhs_code (stmt);
  aswap = (!commutative_tree_code (acode)
	   && gimple_assign_rhs1 (stmt) != name1);
  atype = TREE_TYPE (gimple_assign_lhs (stmt));

  if (*code == ERROR_MARK)
    {
      *code = acode;
      *swap = aswap;
      *rslt_type = atype;
      return true;
    }

  return (*code == acode
	  && *swap == aswap
	  && *rslt_type == atype);
}

/* Remove OP from the operation on rhs of STMT, and replace STMT with
   an assignment of the remaining operand.  */

static void
remove_name_from_operation (gimple *stmt, tree op)
{
  tree other_op;
  gimple_stmt_iterator si;

  gcc_assert (is_gimple_assign (stmt));

  if (gimple_assign_rhs1 (stmt) == op)
    other_op = gimple_assign_rhs2 (stmt);
  else
    other_op = gimple_assign_rhs1 (stmt);

  si = gsi_for_stmt (stmt);
  gimple_assign_set_rhs_from_tree (&si, other_op);

  /* We should not have reallocated STMT.  */
  gcc_assert (gsi_stmt (si) == stmt);

  update_stmt (stmt);
}

/* Reassociates the expression in which NAME1 and NAME2 are used so that they
   are combined in a single statement, and returns this statement.
*/

gimple *
pcom_worker::reassociate_to_the_same_stmt (tree name1, tree name2)
{
  gimple *stmt1, *stmt2, *root1, *root2, *s1, *s2;
  gassign *new_stmt, *tmp_stmt;
  tree new_name, tmp_name, var, r1, r2;
  unsigned dist1, dist2;
  enum tree_code code;
  tree type = TREE_TYPE (name1);
  gimple_stmt_iterator bsi;

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  root1 = find_associative_operation_root (stmt1, &dist1);
  root2 = find_associative_operation_root (stmt2, &dist2);
  code = gimple_assign_rhs_code (stmt1);

  gcc_assert (root1 && root2 && root1 == root2
	      && code == gimple_assign_rhs_code (stmt2));

  /* Find the root of the nearest expression in which both NAME1 and NAME2
     are used.  */
  r1 = name1;
  s1 = stmt1;
  r2 = name2;
  s2 = stmt2;

  while (dist1 > dist2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      dist1--;
    }
  while (dist2 > dist1)
    {
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
      dist2--;
    }

  while (s1 != s2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
    }

  /* Remove NAME1 and NAME2 from the statements in which they are used
     currently.
*/
  remove_name_from_operation (stmt1, name1);
  remove_name_from_operation (stmt2, name2);

  /* Insert the new statement combining NAME1 and NAME2 before S1, and
     combine it with the rhs of S1.  */
  var = create_tmp_reg (type, "predreastmp");
  new_name = make_ssa_name (var);
  new_stmt = gimple_build_assign (new_name, code, name1, name2);

  var = create_tmp_reg (type, "predreastmp");
  tmp_name = make_ssa_name (var);

  /* Rhs of S1 may now be either a binary expression with operation
     CODE, or gimple_val (in case that stmt1 == s1 or stmt2 == s1,
     so that name1 or name2 was removed from it).  */
  tmp_stmt = gimple_build_assign (tmp_name, gimple_assign_rhs_code (s1),
				  gimple_assign_rhs1 (s1),
				  gimple_assign_rhs2 (s1));

  bsi = gsi_for_stmt (s1);
  gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
  s1 = gsi_stmt (bsi);
  update_stmt (s1);

  gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
  gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);

  return new_stmt;
}

/* Returns the statement that combines references R1 and R2.  In case R1
   and R2 are not used in the same statement, but they are used with an
   associative and commutative operation in the same expression, reassociate
   the expression so that they are used in the same statement.
*/

gimple *
pcom_worker::stmt_combining_refs (dref r1, dref r2)
{
  gimple *stmt1, *stmt2;
  tree name1 = name_for_ref (r1);
  tree name2 = name_for_ref (r2);

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  if (stmt1 == stmt2)
    return stmt1;

  return reassociate_to_the_same_stmt (name1, name2);
}

/* Tries to combine chains CH1 and CH2 together.  If this succeeds, the
   description of the new chain is returned, otherwise we return NULL.  */

chain_p
pcom_worker::combine_chains (chain_p ch1, chain_p ch2)
{
  dref r1, r2, nw;
  enum tree_code op = ERROR_MARK;
  bool swap = false;
  chain_p new_chain;
  unsigned i;
  tree rslt_type = NULL_TREE;

  if (ch1 == ch2)
    return NULL;
  if (ch1->length != ch2->length)
    return NULL;

  if (ch1->refs.length () != ch2->refs.length ())
    return NULL;

  for (i = 0; (ch1->refs.iterate (i, &r1)
	       && ch2->refs.iterate (i, &r2)); i++)
    {
      if (r1->distance != r2->distance)
	return NULL;

      if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
	return NULL;
    }

  if (swap)
    std::swap (ch1, ch2);

  new_chain = new struct chain (CT_COMBINATION);
  new_chain->op = op;
  new_chain->ch1 = ch1;
  new_chain->ch2 = ch2;
  new_chain->rslt_type = rslt_type;
  new_chain->length = ch1->length;

  for (i = 0; (ch1->refs.iterate (i, &r1)
	       &&
ch2->refs.iterate (i, &r2)); i++)
    {
      nw = XCNEW (class dref_d);
      nw->stmt = stmt_combining_refs (r1, r2);
      nw->distance = r1->distance;

      new_chain->refs.safe_push (nw);
    }

  ch1->combined = true;
  ch2->combined = true;
  return new_chain;
}

/* Recursively update position information of all offspring chains to ROOT
   chain's position information.  */

static void
update_pos_for_combined_chains (chain_p root)
{
  chain_p ch1 = root->ch1, ch2 = root->ch2;
  dref ref, ref1, ref2;
  for (unsigned j = 0; (root->refs.iterate (j, &ref)
			&& ch1->refs.iterate (j, &ref1)
			&& ch2->refs.iterate (j, &ref2)); ++j)
    ref1->pos = ref2->pos = ref->pos;

  if (ch1->type == CT_COMBINATION)
    update_pos_for_combined_chains (ch1);
  if (ch2->type == CT_COMBINATION)
    update_pos_for_combined_chains (ch2);
}

/* Returns true if statement S1 dominates statement S2.  */

static bool
pcom_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
{
  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);

  if (!bb1 || s1 == s2)
    return true;

  if (bb1 == bb2)
    return gimple_uid (s1) < gimple_uid (s2);

  return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
}

/* Try to combine the chains.
*/

void
pcom_worker::try_combine_chains ()
{
  unsigned i, j;
  chain_p ch1, ch2, cch;
  auto_vec<chain_p> worklist;
  bool combined_p = false;

  FOR_EACH_VEC_ELT (m_chains, i, ch1)
    if (chain_can_be_combined_p (ch1))
      worklist.safe_push (ch1);

  while (!worklist.is_empty ())
    {
      ch1 = worklist.pop ();
      if (!chain_can_be_combined_p (ch1))
	continue;

      FOR_EACH_VEC_ELT (m_chains, j, ch2)
	{
	  if (!chain_can_be_combined_p (ch2))
	    continue;

	  cch = combine_chains (ch1, ch2);
	  if (cch)
	    {
	      worklist.safe_push (cch);
	      m_chains.safe_push (cch);
	      combined_p = true;
	      break;
	    }
	}
    }
  if (!combined_p)
    return;

  /* Setup UID for all statements in dominance order.  */
  basic_block *bbs = get_loop_body_in_dom_order (m_loop);
  renumber_gimple_stmt_uids_in_blocks (bbs, m_loop->num_nodes);
  free (bbs);

  /* Re-association in combined chains may generate statements in an order
     different from the order of references of the original chain.  We need
     to keep references of a combined chain in dominance order so that all
     uses will be inserted after definitions.  Note:
       A) This is necessary for all combined chains.
       B) This is only necessary for ZERO distance references because other
	  references inherit value from loop carried PHIs.

     We first update position information for all combined chains.
*/
  dref ref;
  for (i = 0; m_chains.iterate (i, &ch1); ++i)
    {
      if (ch1->type != CT_COMBINATION || ch1->combined)
	continue;

      for (j = 0; ch1->refs.iterate (j, &ref); ++j)
	ref->pos = gimple_uid (ref->stmt);

      update_pos_for_combined_chains (ch1);
    }
  /* Then sort references according to newly updated position information.  */
  for (i = 0; m_chains.iterate (i, &ch1); ++i)
    {
      if (ch1->type != CT_COMBINATION && !ch1->combined)
	continue;

      /* Find the first reference with non-ZERO distance.  */
      if (ch1->length == 0)
	j = ch1->refs.length ();
      else
	{
	  for (j = 0; ch1->refs.iterate (j, &ref); ++j)
	    if (ref->distance != 0)
	      break;
	}

      /* Sort all ZERO distance references by position.  */
      qsort (&ch1->refs[0], j, sizeof (ch1->refs[0]), order_drefs_by_pos);

      if (ch1->combined)
	continue;

      /* For a ZERO length chain, has_max_use_after must be true since the
	 root combined stmt must dominate the others.  */
      if (ch1->length == 0)
	{
	  ch1->has_max_use_after = true;
	  continue;
	}
      /* Check if there is a use at max distance after the root for combined
	 chains and set the flag accordingly.
*/
      ch1->has_max_use_after = false;
      gimple *root_stmt = get_chain_root (ch1)->stmt;
      for (j = 1; ch1->refs.iterate (j, &ref); ++j)
	{
	  if (ref->distance == ch1->length
	      && !pcom_stmt_dominates_stmt_p (ref->stmt, root_stmt))
	    {
	      ch1->has_max_use_after = true;
	      break;
	    }
	}
    }
}

/* Prepare initializers for store elimination CHAIN in LOOP.  Returns false
   if this is impossible because one of these initializers may trap, true
   otherwise.  */

static bool
prepare_initializers_chain_store_elim (class loop *loop, chain_p chain)
{
  unsigned i, n = chain->length;

  /* For now we can't eliminate stores if some of them are conditionally
     executed.  */
  if (!chain->all_always_accessed)
    return false;

  /* Nothing to initialize for intra-iteration store elimination.  */
  if (n == 0 && chain->type == CT_STORE_STORE)
    return true;

  /* For a store elimination chain, there is nothing to initialize if the
     stores to be eliminated only store loop invariant values into memory.  */
  if (chain->type == CT_STORE_STORE
      && is_inv_store_elimination_chain (loop, chain))
    {
      chain->inv_store_elimination = true;
      return true;
    }

  chain->inits.create (n);
  chain->inits.safe_grow_cleared (n, true);

  /* For a store elimination chain like below:

     for (i = 0; i < len; i++)
       {
	 a[i] = 1;
	 // a[i + 1] = ...
	 a[i + 2] = 3;
       }

     the store to a[i + 1] is missing from the loop body, so it acts like
     a bubble.  The content of a[i + 1] remains the same if the loop iterates
     fewer times than chain->length.  We need to set up root variables for
     such stores by loading from memory before the loop.  Note we only need
     to load bubble elements because the loop body is guaranteed to be
     executed at least once after the loop's preheader edge.  */
  auto_vec<bool> bubbles;
  bubbles.safe_grow_cleared (n + 1, true);
  for (i = 0; i < chain->refs.length (); i++)
    bubbles[chain->refs[i]->distance] = true;

  struct data_reference *dr = get_chain_root (chain)->ref;
  for (i = 0; i < n; i++)
    {
      if (bubbles[i])
	continue;

      gimple_seq stmts = NULL;

      tree init = ref_at_iteration (dr, (int) 0 - i, &stmts);
      if (stmts)
	gimple_seq_add_seq_without_update (&chain->init_seq, stmts);

      chain->inits[i] = init;
    }

  return true;
}

/* Prepare initializers for CHAIN.  Returns false if this is impossible
   because one of these initializers may trap, true otherwise.  */

bool
pcom_worker::prepare_initializers_chain (chain_p chain)
{
  unsigned i, n = (chain->type == CT_INVARIANT) ?
1 : chain->length;
  struct data_reference *dr = get_chain_root (chain)->ref;
  tree init;
  dref laref;
  edge entry = loop_preheader_edge (m_loop);

  if (chain->type == CT_STORE_STORE)
    return prepare_initializers_chain_store_elim (m_loop, chain);

  /* Find the initializers for the variables, and check that they cannot
     trap.  */
  chain->inits.create (n);
  for (i = 0; i < n; i++)
    chain->inits.quick_push (NULL_TREE);

  /* If we have replaced some looparound phi nodes, use their initializers
     instead of creating our own.  */
  FOR_EACH_VEC_ELT (chain->refs, i, laref)
    {
      if (gimple_code (laref->stmt) != GIMPLE_PHI)
	continue;

      gcc_assert (laref->distance > 0);
      chain->inits[n - laref->distance]
	= PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry);
    }

  for (i = 0; i < n; i++)
    {
      gimple_seq stmts = NULL;

      if (chain->inits[i] != NULL_TREE)
	continue;

      init = ref_at_iteration (dr, (int) i - n, &stmts);
      if (!chain->all_always_accessed && tree_could_trap_p (init))
	{
	  gimple_seq_discard (stmts);
	  return false;
	}

      if (stmts)
	gimple_seq_add_seq_without_update (&chain->init_seq, stmts);

      chain->inits[i] = init;
    }

  return true;
}

/* Prepare initializers for chains, and free chains that cannot
   be used because the initializers might trap.
*/

void
pcom_worker::prepare_initializers ()
{
  chain_p chain;
  unsigned i;

  for (i = 0; i < m_chains.length (); )
    {
      chain = m_chains[i];
      if (prepare_initializers_chain (chain))
	i++;
      else
	{
	  release_chain (chain);
	  m_chains.unordered_remove (i);
	}
    }
}

/* Generates finalizer memory references for CHAIN.  Returns true
   if finalizer code for CHAIN can be generated, otherwise false.  */

bool
pcom_worker::prepare_finalizers_chain (chain_p chain)
{
  unsigned i, n = chain->length;
  struct data_reference *dr = get_chain_root (chain)->ref;
  tree fini, niters = number_of_latch_executions (m_loop);

  /* For now we can't eliminate stores if some of them are conditionally
     executed.  */
  if (!chain->all_always_accessed)
    return false;

  chain->finis.create (n);
  for (i = 0; i < n; i++)
    chain->finis.quick_push (NULL_TREE);

  /* We never use looparound phi nodes for store elimination chains.  */

  /* Find the finalizers for the variables, and check that they cannot
     trap.
*/
  for (i = 0; i < n; i++)
    {
      gimple_seq stmts = NULL;
      gcc_assert (chain->finis[i] == NULL_TREE);

      if (TREE_CODE (niters) != INTEGER_CST && TREE_CODE (niters) != SSA_NAME)
	{
	  niters = unshare_expr (niters);
	  niters = force_gimple_operand (niters, &stmts, true, NULL);
	  if (stmts)
	    {
	      gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);
	      stmts = NULL;
	    }
	}
      fini = ref_at_iteration (dr, (int) 0 - i, &stmts, niters);
      if (stmts)
	gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);

      chain->finis[i] = fini;
    }

  return true;
}

/* Generates finalizer memory references for chains.  Returns true if
   finalizer code generation for chains breaks loop closed ssa form.  */

bool
pcom_worker::prepare_finalizers ()
{
  chain_p chain;
  unsigned i;
  bool loop_closed_ssa = false;

  for (i = 0; i < m_chains.length ();)
    {
      chain = m_chains[i];

      /* Finalizer is only necessary for inter-iteration store elimination
	 chains.  */
      if (chain->length == 0 || chain->type != CT_STORE_STORE)
	{
	  i++;
	  continue;
	}

      if (prepare_finalizers_chain (chain))
	{
	  i++;
	  /* Be conservative, assume loop closed ssa form is corrupted
	     by the store-store chain.  Though it's not always the case if
	     the eliminated stores only store loop invariant values into
	     memory.
*/
	  loop_closed_ssa = true;
	}
      else
	{
	  release_chain (chain);
	  m_chains.unordered_remove (i);
	}
    }
  return loop_closed_ssa;
}

/* Insert all initializing gimple stmts into LOOP's entry edge.  */

static void
insert_init_seqs (class loop *loop, vec<chain_p> &chains)
{
  unsigned i;
  edge entry = loop_preheader_edge (loop);

  for (i = 0; i < chains.length (); ++i)
    if (chains[i]->init_seq)
      {
	gsi_insert_seq_on_edge_immediate (entry, chains[i]->init_seq);
	chains[i]->init_seq = NULL;
      }
}

/* Performs predictive commoning for LOOP.  Sets bit 1<<1 of return value
   if LOOP was unrolled; sets bit 1<<2 of return value if loop closed ssa
   form was corrupted.  Non-zero return value indicates some changes were
   applied to this loop.  */

unsigned
pcom_worker::tree_predictive_commoning_loop (bool allow_unroll_p)
{
  struct component *components;
  unsigned unroll_factor = 0;
  class tree_niter_desc desc;
  bool unroll = false, loop_closed_ssa = false;

  if (dump_file && (dump_flags & TDF_DETAILS))
    fprintf (dump_file, "Processing loop %d\n", m_loop->num);

  /* Nothing for predictive commoning if loop only iterates 1 time.
*/
  if (get_max_loop_iterations_int (m_loop) == 0)
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "Loop iterates only 1 time, nothing to do.\n");

      return 0;
    }

  /* Find the data references and split them into components according to their
     dependence relations.  */
  auto_vec<loop_p, 3> loop_nest;
  if (!compute_data_dependences_for_loop (m_loop, true, &loop_nest, &m_datarefs,
					  &m_dependences))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "Cannot analyze data dependencies\n");
      return 0;
    }

  if (dump_file && (dump_flags & TDF_DETAILS))
    dump_data_dependence_relations (dump_file, m_dependences);

  components = split_data_refs_to_components ();

  loop_nest.release ();
  if (!components)
    return 0;

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Initial state:\n\n");
      dump_components (dump_file, components);
    }

  /* Find the suitable components and split them into chains.
*/
  components = filter_suitable_components (components);

  auto_bitmap tmp_vars;
  determine_roots (components);
  release_components (components);

  if (!m_chains.exists ())
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
		 "Predictive commoning failed: no suitable chains\n");
      return 0;
    }

  prepare_initializers ();
  loop_closed_ssa = prepare_finalizers ();

  /* Try to combine the chains that are always worked with together.  */
  try_combine_chains ();

  insert_init_seqs (m_loop, m_chains);

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Before commoning:\n\n");
      dump_chains (dump_file, m_chains);
    }

  if (allow_unroll_p)
    /* Determine the unroll factor, and if the loop should be unrolled, ensure
       that its number of iterations is divisible by the factor.  */
    unroll_factor = determine_unroll_factor (m_chains);

  if (unroll_factor > 1)
    unroll = can_unroll_loop_p (m_loop, unroll_factor, &desc);

  /* Execute the predictive commoning transformations, and possibly unroll the
     loop.
*/
  if (unroll)
    {
      struct epcc_data dta;

      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);

      dta.tmp_vars = tmp_vars;
      dta.chains = m_chains.to_vec_legacy ();
      dta.worker = this;

      /* Cfg manipulations performed in tree_transform_and_unroll_loop before
	 execute_pred_commoning_cbck is called may cause phi nodes to be
	 reallocated, which is a problem since CHAINS may point to these
	 statements.  To fix this, we store the ssa names defined by the
	 phi nodes here instead of the phi nodes themselves, and restore
	 the phi nodes in execute_pred_commoning_cbck.  A bit hacky.  */
      replace_phis_by_defined_names (m_chains);

      tree_transform_and_unroll_loop (m_loop, unroll_factor, &desc,
				      execute_pred_commoning_cbck, &dta);
      eliminate_temp_copies (m_loop, tmp_vars);
    }
  else
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
		 "Executing predictive commoning without unrolling.\n");
      execute_pred_commoning (tmp_vars);
    }

  return (unroll ? 2 : 1) | (loop_closed_ssa ? 4 : 1);
}

/* Runs predictive commoning.
   */

unsigned
tree_predictive_commoning (bool allow_unroll_p)
{
  unsigned ret = 0, changed = 0;

  initialize_original_copy_tables ();
  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
    if (optimize_loop_for_speed_p (loop))
      {
	pcom_worker w(loop);
	changed |= w.tree_predictive_commoning_loop (allow_unroll_p);
      }
  free_original_copy_tables ();

  if (changed > 0)
    {
      ret = TODO_update_ssa_only_virtuals;

      /* Some loop(s) got unrolled.  */
      if (changed > 1)
	{
	  scev_reset ();

	  /* Need to fix up loop closed SSA.  */
	  if (changed >= 4)
	    rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);

	  ret |= TODO_cleanup_cfg;
	}
    }

  return ret;
}

/* Predictive commoning Pass.
   */

static unsigned
run_tree_predictive_commoning (struct function *fun, bool allow_unroll_p)
{
  if (number_of_loops (fun) <= 1)
    return 0;

  return tree_predictive_commoning (allow_unroll_p);
}

namespace {

const pass_data pass_data_predcom =
{
  GIMPLE_PASS, /* type */
  "pcom", /* name */
  OPTGROUP_LOOP, /* optinfo_flags */
  TV_PREDCOM, /* tv_id */
  PROP_cfg, /* properties_required */
  0, /* properties_provided */
  0, /* properties_destroyed */
  0, /* todo_flags_start */
  0, /* todo_flags_finish */
};

class pass_predcom : public gimple_opt_pass
{
public:
  pass_predcom (gcc::context *ctxt)
    : gimple_opt_pass (pass_data_predcom, ctxt)
  {}

  /* opt_pass methods: */
  virtual bool
  gate (function *)
  {
    if (flag_predictive_commoning != 0)
      return true;
    /* Loop vectorization enables predictive commoning implicitly
       only if predictive commoning isn't set explicitly, and it
       doesn't allow unrolling.
       */
    if (flag_tree_loop_vectorize
	&& !OPTION_SET_P (flag_predictive_commoning))
      return true;

    return false;
  }

  virtual unsigned int
  execute (function *fun)
  {
    bool allow_unroll_p = flag_predictive_commoning != 0;
    return run_tree_predictive_commoning (fun, allow_unroll_p);
  }

}; // class pass_predcom

} // anon namespace

gimple_opt_pass *
make_pass_predcom (gcc::context *ctxt)
{
  return new pass_predcom (ctxt);
}