/* Predictive commoning.
   Copyright (C) 2005-2022 Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.

GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */
/* This file implements the predictive commoning optimization.  Predictive
   commoning can be viewed as CSE around a loop, and with some improvements,
   as generalized strength reduction -- i.e., reusing values computed in
   earlier iterations of a loop in the later ones.  So far, the pass only
   handles the most useful case, that is, reusing values of memory references.
   If you think this is all just a special case of PRE, you are sort of right;
   however, concentrating on loops is simpler, and makes it possible to
   incorporate data dependence analysis to detect the opportunities, perform
   loop unrolling to avoid copies together with renaming immediately,
   and if needed, we could also take register pressure into account.
   Let us demonstrate what is done on an example:

   for (i = 0; i < 100; i++)
     {
       a[i+2] = a[i] + a[i+1];
       b[10] = b[10] + i;
       c[i] = c[99 - i];
       d[i] = d[i + 1];
     }

   1) We find data references in the loop, and split them to mutually
      independent groups (i.e., we find components of a data dependence
      graph).  We ignore read-read dependences whose distance is not constant.
      (TODO -- we could also ignore antidependences).  In this example, we
      find the following groups:

      a[i]{read}, a[i+1]{read}, a[i+2]{write}
      b[10]{read}, b[10]{write}
      c[99 - i]{read}, c[i]{write}
      d[i + 1]{read}, d[i]{write}
   2) Inside each of the groups, we verify several conditions:
      a) all the references must differ in indices only, and the indices
         must all have the same step
      b) the references must dominate the loop latch (and thus, they must be
         ordered by the dominance relation).
      c) the distance of the indices must be a small multiple of the step
      We are then able to compute the difference of the references (# of
      iterations before they point to the same place as the first of them).
      Also, in case there are writes in the loop, we split the groups into
      chains whose head is the write whose values are used by the reads in
      the same chain.  The chains are then processed independently,
      making the further transformations simpler.  Also, the shorter chains
      need the same number of registers, but may require a lower unrolling
      factor in order to get rid of the copies on the loop latch.
   In our example, we get the following chains (the chain for c is invalid).

   a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
   b[10]{read,+0}, b[10]{write,+0}
   d[i + 1]{read,+0}, d[i]{write,+1}

   3) For each read, we determine the read or write whose value it reuses,
      together with the distance of this reuse.  I.e. we take the last
      reference before it with distance 0, or the last of the references
      with the smallest positive distance to the read.  Then, we remove
      the references that are not used in any of these chains, discard the
      empty groups, and propagate all the links so that they point to the
      single root reference of the chain (adjusting their distance
      appropriately).  Some extra care needs to be taken for references with
      step 0.  In our example (the numbers indicate the distance of the
      reuse),

      a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
      b[10] --> (*) 1, b[10] (*)
   4) The chains are combined together if possible.  If the corresponding
      elements of two chains are always combined together with the same
      operator, we remember just the result of this combination, instead
      of remembering the values separately.  We may need to perform
      reassociation to enable combining, for example

      e[i] + f[i+1] + e[i+1] + f[i]

      can be reassociated as

      (e[i] + f[i]) + (e[i+1] + f[i+1])

      and we can combine the chains for e and f into one chain.
   5) For each root reference (end of the chain) R, let N be the maximum
      distance of a reference reusing its value.  Variables R0 up to RN are
      created, together with phi nodes that transfer values from R1 .. RN to
      R0 .. R(N-1).
      Initial values are loaded to R0..R(N-1) (in case not all references
      must necessarily be accessed and they may trap, we may fail here;
      TODO sometimes, the loads could be guarded by a check for the number
      of iterations).  Values loaded/stored in roots are also copied to
      RN.  Other reads are replaced with the appropriate variable Ri.
      Everything is put to SSA form.

      As a small improvement, if R0 is dead after the root (i.e., all uses of
      the value with the maximum distance dominate the root), we can avoid
      creating RN and use R0 instead of it.
      In our example, we get (only the parts concerning a and b are shown):
      for (i = 0; i < 100; i++)
        {
          f = phi (a[0], s);
          s = phi (a[1], f);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;
        }
   6) Factor F for unrolling is determined as the smallest common multiple of
      (N + 1) for each root reference (N for references for which we avoided
      creating RN).  If F and the loop are small enough, the loop is unrolled
      F times.  The stores to RN (R0) in the copies of the loop body are
      periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
      be coalesced and the copies can be eliminated.

      TODO -- copy propagation and other optimizations may change the live
      ranges of the temporary registers and prevent them from being coalesced;
      this may increase the register pressure.

      In our case, F = 2 and the (main loop of the) result is
      for (i = 0; i < ...; i += 2)
        {
          f = phi (a[0], f);
          s = phi (a[1], s);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;

          s = s + f;
          a[i+3] = s;
          x = x + i;
          b[10] = x;
        }

   Apart from predictive commoning on Load-Load and Store-Load chains, we
   also support Store-Store chains -- stores killed by another store can be
   eliminated.  Given the example below:

   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       a[i+2] = 2;
     }

   It can be replaced with:
   t0 = a[0];
   t1 = a[1];
   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       t2 = 2;
       t0 = t1;
       t1 = t2;
     }
   a[n] = t0;
   a[n+1] = t1;

   If the loop runs more than one iteration, it can be further simplified
   into:

   for (i = 0; i < n; i++)
     {
       a[i] = 1;
     }
   a[n] = 2;
   a[n+1] = 2;

   Interestingly, this can be viewed either as general store motion or as
   general dead store elimination, working both intra- and inter-iteration.

   With trivial effort, we also support loads inside Store-Store chains if
   the load is dominated by a store statement in the same iteration of the
   loop.  You can see this as a restricted Store-Mixed-Load-Store chain.

   TODO: For now, we don't support store-store chains in multi-exit loops.
   We also force the loop not to be unrolled when a store-store chain is
   present, even if other chains might ask for unrolling.

   Predictive commoning can be generalized for arbitrary computations (not
   just memory loads), and also nontrivial transfer functions (e.g., replacing
   i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "rtl.h"
#include "tree.h"
#include "gimple.h"
#include "predict.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-pretty-print.h"
#include "alias.h"
#include "fold-const.h"
#include "cfgloop.h"
#include "tree-eh.h"
#include "gimplify.h"
#include "gimple-iterator.h"
#include "gimplify-me.h"
#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop-manip.h"
#include "tree-ssa-loop-niter.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-dfa.h"
#include "tree-ssa.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "tree-affine.h"
#include "builtins.h"
#include "opts.h"

/* The maximum number of iterations between the considered memory
   references.  */

#define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)

/* Data references (or phi nodes that carry data reference values across
   loop iterations).  */

typedef class dref_d
{
public:
  /* The reference itself.  */
  struct data_reference *ref;

  /* The statement in which the reference appears.  */
  gimple *stmt;

  /* In case that STMT is a phi node, this field is set to the SSA name
     defined by it in replace_phis_by_defined_names (in order to avoid
     pointing to a phi node that got reallocated in the meantime).  */
  tree name_defined_by_phi;

  /* Distance of the reference from the root of the chain (in number of
     iterations of the loop).  */
  unsigned distance;

  /* Number of iterations offset from the first reference in the component.  */
  widest_int offset;

  /* Number of the reference in a component, in dominance ordering.  */
  unsigned pos;

  /* True if the memory reference is always accessed when the loop is
     entered.  */
  unsigned always_accessed : 1;
} *dref;

/* Type of the chain of the references.  */

enum chain_type
{
  /* The addresses of the references in the chain are constant.  */
  CT_INVARIANT,

  /* There are only loads in the chain.  */
  CT_LOAD,

  /* Root of the chain is a store, the rest are loads.  */
  CT_STORE_LOAD,

  /* There are only stores in the chain.  */
  CT_STORE_STORE,

  /* A combination of two chains.  */
  CT_COMBINATION
};

/* Chains of data references.  */

typedef struct chain
{
  chain (chain_type t) : type (t), op (ERROR_MARK), rslt_type (NULL_TREE),
    ch1 (NULL), ch2 (NULL), length (0), init_seq (NULL), fini_seq (NULL),
    has_max_use_after (false), all_always_accessed (false), combined (false),
    inv_store_elimination (false) {}

  /* Type of the chain.  */
  enum chain_type type;

  /* For combination chains, the operator and the two chains that are
     combined, and the type of the result.  */
  enum tree_code op;
  tree rslt_type;
  struct chain *ch1, *ch2;

  /* The references in the chain.  */
  auto_vec<dref> refs;

  /* The maximum distance of the reference in the chain from the root.  */
  unsigned length;

  /* The variables used to copy the value throughout iterations.  */
  auto_vec<tree> vars;

  /* Initializers for the variables.  */
  auto_vec<tree> inits;

  /* Finalizers for the eliminated stores.  */
  auto_vec<tree> finis;

  /* gimple stmts initializing the initial variables of the chain.  */
  gimple_seq init_seq;

  /* gimple stmts finalizing the eliminated stores of the chain.  */
  gimple_seq fini_seq;

  /* True if there is a use of a variable with the maximal distance
     that comes after the root in the loop.  */
  unsigned has_max_use_after : 1;

  /* True if all the memory references in the chain are always accessed.  */
  unsigned all_always_accessed : 1;

  /* True if this chain was combined together with some other chain.  */
  unsigned combined : 1;

  /* True if this is a store elimination chain and the eliminated stores
     store a loop invariant value into memory.  */
  unsigned inv_store_elimination : 1;
} *chain_p;

/* Describes the knowledge about the step of the memory references in
   the component.  */

enum ref_step_type
{
  /* The step is zero.  */
  RS_INVARIANT,

  /* The step is nonzero.  */
  RS_NONZERO,

  /* The step may or may not be nonzero.  */
  RS_ANY
};

/* Components of the data dependence graph.  */

struct component
{
  component (bool es) : comp_step (RS_ANY), eliminate_store_p (es),
    next (NULL) {}

  /* The references in the component.  */
  auto_vec<dref> refs;

  /* What we know about the step of the references in the component.  */
  enum ref_step_type comp_step;

  /* True if all references in the component are stores and we try to do
     intra/inter loop iteration dead store elimination.  */
  bool eliminate_store_p;

  /* Next component in the list.  */
  struct component *next;
};

/* A class to encapsulate the global state used for predictive
   commoning work on top of one given LOOP.  */

class pcom_worker
{
public:
  pcom_worker (loop_p l) : m_loop (l), m_cache (NULL) {}

  ~pcom_worker ()
  {
    free_data_refs (m_datarefs);
    free_dependence_relations (m_dependences);
    free_affine_expand_cache (&m_cache);
    release_chains ();
  }

  pcom_worker (const pcom_worker &) = delete;
  pcom_worker &operator= (const pcom_worker &) = delete;

  /* Performs predictive commoning.  */
  unsigned tree_predictive_commoning_loop (bool allow_unroll_p);

  /* Perform the predictive commoning optimization for chains.  Public so
     that it can be called from the callback execute_pred_commoning_cbck.  */
  void execute_pred_commoning (bitmap tmp_vars);

private:
  /* The pointer to the given loop.  */
  loop_p m_loop;

  /* All data references.  */
  auto_vec<data_reference_p, 10> m_datarefs;

  /* All data dependences.  */
  auto_vec<ddr_p, 10> m_dependences;

  /* All chains.  */
  auto_vec<chain_p> m_chains;

  /* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */
  auto_bitmap m_looparound_phis;

  typedef hash_map<tree, name_expansion *> tree_expand_map_t;
  /* Cache used by tree_to_aff_combination_expand.  */
  tree_expand_map_t *m_cache;

  /* Splits the dependence graph into components.  */
  struct component *split_data_refs_to_components ();

  /* Check the conditions on references inside each of the components COMPS,
     and remove the unsuitable components from the list.  */
  struct component *filter_suitable_components (struct component *comps);

  /* Find roots of the values and determine distances in components COMPS,
     and separate the references into chains.  */
  void determine_roots (struct component *comps);

  /* Prepare initializers for chains, and free chains that cannot
     be used because the initializers might trap.  */
  void prepare_initializers ();

  /* Generates finalizer memory references for chains.  Returns true if
     finalizer code generation for chains breaks loop closed ssa form.  */
  bool prepare_finalizers ();

  /* Try to combine the chains.  */
  void try_combine_chains ();

  /* Frees CHAINS.  */
  void release_chains ();

  /* Frees a chain CHAIN.  */
  void release_chain (chain_p chain);

  /* Prepare initializers for CHAIN.  Returns false if this is impossible
     because one of these initializers may trap, true otherwise.  */
  bool prepare_initializers_chain (chain_p chain);

  /* Generates finalizer memory references for CHAIN.  Returns true
     if finalizer code for CHAIN can be generated, otherwise false.  */
  bool prepare_finalizers_chain (chain_p chain);

  /* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */
  void aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset);

  /* Determines the number of iterations of the innermost enclosing loop
     before B refers to exactly the same location as A and stores it to
     OFF.  */
  bool determine_offset (struct data_reference *a, struct data_reference *b,
			 poly_widest_int *off);

  /* Returns true if the component COMP satisfies the conditions
     described in 2) at the beginning of this file.  */
  bool suitable_component_p (struct component *comp);

  /* Returns true if REF is a valid initializer for ROOT with given
     DISTANCE (in iterations of the innermost enclosing loop).  */
  bool valid_initializer_p (struct data_reference *ref, unsigned distance,
			    struct data_reference *root);

  /* Finds the looparound phi node of the loop that copies the value of
     REF.  */
  gphi *find_looparound_phi (dref ref, dref root);

  /* Find roots of the values and determine distances in the component
     COMP.  The references are redistributed into chains.  */
  void determine_roots_comp (struct component *comp);

  /* For references in CHAIN that are copied around the loop, add the
     results of such copies to the chain.  */
  void add_looparound_copies (chain_p chain);

  /* Returns the single statement in which NAME is used, excepting
     the looparound phi nodes contained in one of the chains.  */
  gimple *single_nonlooparound_use (tree name);

  /* Remove statement STMT, as well as the chain of assignments in which
     it is used.  */
  void remove_stmt (gimple *stmt);

  /* Perform the predictive commoning optimization for a chain CHAIN.  */
  void execute_pred_commoning_chain (chain_p chain, bitmap tmp_vars);

  /* Returns the modify statement that uses NAME.  */
  gimple *find_use_stmt (tree *name);

  /* If the operation used in STMT is associative and commutative, go
     through the tree of the same operations and return its root.  */
  gimple *find_associative_operation_root (gimple *stmt, unsigned *distance);

  /* Returns the common statement in which NAME1 and NAME2 have a use.  */
  gimple *find_common_use_stmt (tree *name1, tree *name2);

  /* Checks whether R1 and R2 are combined together using CODE, with the
     result in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order
     R2 CODE R1 if it is true.  */
  bool combinable_refs_p (dref r1, dref r2, enum tree_code *code, bool *swap,
			  tree *rslt_type);

  /* Reassociates the expression in which NAME1 and NAME2 are used so that
     they are combined in a single statement, and returns this statement.  */
  gimple *reassociate_to_the_same_stmt (tree name1, tree name2);

  /* Returns the statement that combines references R1 and R2.  */
  gimple *stmt_combining_refs (dref r1, dref r2);

  /* Tries to combine chains CH1 and CH2 together.  */
  chain_p combine_chains (chain_p ch1, chain_p ch2);
};

/* Dumps data reference REF to FILE.  */

extern void dump_dref (FILE *, dref);
void
dump_dref (FILE *file, dref ref)
{
  if (ref->ref)
    {
      fprintf (file, " ");
      print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
      fprintf (file, " (id %u%s)\n", ref->pos,
	       DR_IS_READ (ref->ref) ? "" : ", write");

      fprintf (file, " offset ");
      print_decs (ref->offset, file);
      fprintf (file, "\n");

      fprintf (file, " distance %u\n", ref->distance);
    }
  else
    {
      if (gimple_code (ref->stmt) == GIMPLE_PHI)
	fprintf (file, " looparound ref\n");
      else
	fprintf (file, " combination ref\n");
      fprintf (file, " in statement ");
      print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
      fprintf (file, "\n");
      fprintf (file, " distance %u\n", ref->distance);
    }
}

/* Dumps CHAIN to FILE.  */

extern void dump_chain (FILE *, chain_p);
void
dump_chain (FILE *file, chain_p chain)
{
  dref a;
  const char *chain_type;
  unsigned i;
  tree var;

  switch (chain->type)
    {
    case CT_INVARIANT:
      chain_type = "Load motion";
      break;

    case CT_LOAD:
      chain_type = "Loads-only";
      break;

    case CT_STORE_LOAD:
      chain_type = "Store-loads";
      break;

    case CT_STORE_STORE:
      chain_type = "Store-stores";
      break;

    case CT_COMBINATION:
      chain_type = "Combination";
      break;

    default:
      gcc_unreachable ();
    }

  fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
	   chain->combined ? " (combined)" : "");
  if (chain->type != CT_INVARIANT)
    fprintf (file, " max distance %u%s\n", chain->length,
	     chain->has_max_use_after ? "" : ", may reuse first");

  if (chain->type == CT_COMBINATION)
    {
      fprintf (file, " equal to %p %s %p in type ",
	       (void *) chain->ch1, op_symbol_code (chain->op),
	       (void *) chain->ch2);
      print_generic_expr (file, chain->rslt_type, TDF_SLIM);
      fprintf (file, "\n");
    }

  if (chain->vars.exists ())
    {
      fprintf (file, " vars");
      FOR_EACH_VEC_ELT (chain->vars, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  if (chain->inits.exists ())
    {
      fprintf (file, " inits");
      FOR_EACH_VEC_ELT (chain->inits, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  fprintf (file, " references:\n");
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    dump_dref (file, a);

  fprintf (file, "\n");
}

/* Dumps CHAINS to FILE.  */

void
dump_chains (FILE *file, const vec<chain_p> &chains)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chains, i, chain)
    dump_chain (file, chain);
}

/* Dumps COMP to FILE.  */

extern void dump_component (FILE *, struct component *);
void
dump_component (FILE *file, struct component *comp)
{
  dref a;
  unsigned i;

  fprintf (file, "Component%s:\n",
	   comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
  FOR_EACH_VEC_ELT (comp->refs, i, a)
    dump_dref (file, a);
  fprintf (file, "\n");
}

/* Dumps COMPS to FILE.  */

extern void dump_components (FILE *, struct component *);
void
dump_components (FILE *file, struct component *comps)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    dump_component (file, comp);
}

/* Frees a chain CHAIN.  */

void
pcom_worker::release_chain (chain_p chain)
{
  dref ref;
  unsigned i;

  if (chain == NULL)
    return;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    free (ref);

  if (chain->init_seq)
    gimple_seq_discard (chain->init_seq);

  if (chain->fini_seq)
    gimple_seq_discard (chain->fini_seq);

  delete chain;
}

/* Frees CHAINS.  */

void
pcom_worker::release_chains ()
{
  unsigned i;
  chain_p chain;

  FOR_EACH_VEC_ELT (m_chains, i, chain)
    release_chain (chain);
}

/* Frees the list of components COMPS.  */

static void
release_components (struct component *comps)
{
  struct component *act, *next;

  for (act = comps; act; act = next)
    {
      next = act->next;
      delete act;
    }
}

/* Finds the root of the tree given by FATHERS containing A, and performs
   path shortening.  */

static unsigned
component_of (vec<unsigned> &fathers, unsigned a)
{
  unsigned root, n;

  for (root = a; root != fathers[root]; root = fathers[root])
    continue;

  for (; a != root; a = n)
    {
      n = fathers[a];
      fathers[a] = root;
    }

  return root;
}

/* Join operation for DFU.  FATHERS gives the tree, SIZES are sizes of the
   components, A and B are components to merge.  */

static void
merge_comps (vec<unsigned> &fathers, vec<unsigned> &sizes,
	     unsigned a, unsigned b)
{
  unsigned ca = component_of (fathers, a);
  unsigned cb = component_of (fathers, b);

  if (ca == cb)
    return;

  if (sizes[ca] < sizes[cb])
    {
      sizes[cb] += sizes[ca];
      fathers[ca] = cb;
    }
  else
    {
      sizes[ca] += sizes[cb];
      fathers[cb] = ca;
    }
}

/* Returns true if A is a reference that is suitable for predictive commoning
   in the innermost loop that contains it.  REF_STEP is set according to the
   step of the reference A.  */

static bool
suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
{
  tree ref = DR_REF (a), step = DR_STEP (a);

  if (!step
      || TREE_THIS_VOLATILE (ref)
      || !is_gimple_reg_type (TREE_TYPE (ref))
      || tree_could_throw_p (ref))
    return false;

  if (integer_zerop (step))
    *ref_step = RS_INVARIANT;
  else if (integer_nonzerop (step))
    *ref_step = RS_NONZERO;
  else
    *ref_step = RS_ANY;

  return true;
}

/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */

void
pcom_worker::aff_combination_dr_offset (struct data_reference *dr,
					aff_tree *offset)
{
  tree type = TREE_TYPE (DR_OFFSET (dr));
  aff_tree delta;

  tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset, &m_cache);
  aff_combination_const (&delta, type, wi::to_poly_widest (DR_INIT (dr)));
  aff_combination_add (offset, &delta);
}
821 1.1 mrg
822 1.1 mrg /* Determines the number of iterations of the innermost enclosing loop before B
823 1.1 mrg refers to exactly the same location as A and stores it to OFF. If A and
824 1.1 mrg B do not have the same step, they never meet, or anything else fails,
825 1.1 mrg returns false; otherwise returns true. Both A and B are assumed to
826 1.1 mrg satisfy suitable_reference_p. */
827 1.1 mrg
828 1.1 mrg bool
829 1.1 mrg pcom_worker::determine_offset (struct data_reference *a,
830 1.1 mrg struct data_reference *b, poly_widest_int *off)
831 1.1 mrg {
832 1.1 mrg aff_tree diff, baseb, step;
833 1.1 mrg tree typea, typeb;
834 1.1 mrg
835 1.1 mrg /* Check that both references access the location in the same type. */
836 1.1 mrg typea = TREE_TYPE (DR_REF (a));
837 1.1 mrg typeb = TREE_TYPE (DR_REF (b));
838 1.1 mrg if (!useless_type_conversion_p (typeb, typea))
839 1.1 mrg return false;
840 1.1 mrg
841 1.1 mrg /* Check whether the base address and the step of both references are the
842 1.1 mrg same. */
843 1.1 mrg if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
844 1.1 mrg || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
845 1.1 mrg return false;
846 1.1 mrg
847 1.1 mrg if (integer_zerop (DR_STEP (a)))
848 1.1 mrg {
849 1.1 mrg /* If the references have loop invariant address, check that they access
850 1.1 mrg exactly the same location. */
851 1.1 mrg *off = 0;
852 1.1 mrg return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
853 1.1 mrg && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
854 1.1 mrg }
855 1.1 mrg
856 1.1 mrg /* Compare the offsets of the addresses, and check whether the difference
857 1.1 mrg is a multiple of step. */
858 1.1 mrg aff_combination_dr_offset (a, &diff);
859 1.1 mrg aff_combination_dr_offset (b, &baseb);
860 1.1 mrg aff_combination_scale (&baseb, -1);
861 1.1 mrg aff_combination_add (&diff, &baseb);
862 1.1 mrg
863 1.1 mrg tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
864 1.1 mrg &step, &m_cache);
865 1.1 mrg return aff_combination_constant_multiple_p (&diff, &step, off);
866 1.1 mrg }
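Once the affine machinery is stripped away, the test at the end of determine_offset reduces to a divisibility check. A toy model in plain C (hypothetical long-based simplification; the pass works on affine combinations, so offsets and steps need not be plain compile-time constants):

```c
#include <stdbool.h>

/* Toy model of determine_offset's final test: with equal base
   addresses and a known step, B refers to the location A touched *OFF
   iterations earlier iff the offset difference a_off - b_off is an
   exact multiple of the step.  A zero step means both references are
   loop invariant and must access exactly the same location.  */

static bool
toy_determine_offset (long a_off, long b_off, long step, long *off)
{
  long diff = a_off - b_off;

  if (step == 0)
    {
      *off = 0;
      return diff == 0;
    }

  if (diff % step != 0)
    return false;

  *off = diff / step;
  return true;
}
```

For a[i + 2] against a[i] with a 4-byte element step this yields an offset of two iterations; a 6-byte difference with the same step is rejected.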
867 1.1 mrg
868 1.1 mrg /* Returns the last basic block in LOOP that we are sure is executed
869 1.1 mrg whenever the loop is entered. */
870 1.1 mrg
871 1.1 mrg static basic_block
872 1.1 mrg last_always_executed_block (class loop *loop)
873 1.1 mrg {
874 1.1 mrg unsigned i;
875 1.1 mrg auto_vec<edge> exits = get_loop_exit_edges (loop);
876 1.1 mrg edge ex;
877 1.1 mrg basic_block last = loop->latch;
878 1.1 mrg
879 1.1 mrg FOR_EACH_VEC_ELT (exits, i, ex)
880 1.1 mrg last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
881 1.1 mrg
882 1.1 mrg return last;
883 1.1 mrg }
884 1.1 mrg
885 1.1 mrg /* Splits the dependence graph on M_DATAREFS described by M_DEPENDENCES
886 1.1 mrg into components. */
887 1.1 mrg
888 1.1 mrg struct component *
889 1.1 mrg pcom_worker::split_data_refs_to_components ()
890 1.1 mrg {
891 1.1 mrg unsigned i, n = m_datarefs.length ();
892 1.1 mrg unsigned ca, ia, ib, bad;
893 1.1 mrg struct data_reference *dr, *dra, *drb;
894 1.1 mrg struct data_dependence_relation *ddr;
895 1.1 mrg struct component *comp_list = NULL, *comp;
896 1.1 mrg dref dataref;
897 1.1 mrg /* Don't do store elimination if loop has multiple exit edges. */
898 1.1 mrg bool eliminate_store_p = single_exit (m_loop) != NULL;
899 1.1 mrg basic_block last_always_executed = last_always_executed_block (m_loop);
900 1.1 mrg auto_bitmap no_store_store_comps;
901 1.1 mrg auto_vec<unsigned> comp_father (n + 1);
902 1.1 mrg auto_vec<unsigned> comp_size (n + 1);
903 1.1 mrg comp_father.quick_grow (n + 1);
904 1.1 mrg comp_size.quick_grow (n + 1);
905 1.1 mrg
906 1.1 mrg FOR_EACH_VEC_ELT (m_datarefs, i, dr)
907 1.1 mrg {
908 1.1 mrg if (!DR_REF (dr))
909 1.1 mrg /* A fake reference for call or asm_expr that may clobber memory;
910 1.1 mrg just fail. */
911 1.1 mrg return NULL;
912 1.1 mrg /* predcom pass isn't prepared to handle calls with data references. */
913 1.1 mrg if (is_gimple_call (DR_STMT (dr)))
914 1.1 mrg return NULL;
915 1.1 mrg dr->aux = (void *) (size_t) i;
916 1.1 mrg comp_father[i] = i;
917 1.1 mrg comp_size[i] = 1;
918 1.1 mrg }
919 1.1 mrg
920 1.1 mrg /* A component reserved for the "bad" data references. */
921 1.1 mrg comp_father[n] = n;
922 1.1 mrg comp_size[n] = 1;
923 1.1 mrg
924 1.1 mrg FOR_EACH_VEC_ELT (m_datarefs, i, dr)
925 1.1 mrg {
926 1.1 mrg enum ref_step_type dummy;
927 1.1 mrg
928 1.1 mrg if (!suitable_reference_p (dr, &dummy))
929 1.1 mrg {
930 1.1 mrg ia = (unsigned) (size_t) dr->aux;
931 1.1 mrg merge_comps (comp_father, comp_size, n, ia);
932 1.1 mrg }
933 1.1 mrg }
934 1.1 mrg
935 1.1 mrg FOR_EACH_VEC_ELT (m_dependences, i, ddr)
936 1.1 mrg {
937 1.1 mrg poly_widest_int dummy_off;
938 1.1 mrg
939 1.1 mrg if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
940 1.1 mrg continue;
941 1.1 mrg
942 1.1 mrg dra = DDR_A (ddr);
943 1.1 mrg drb = DDR_B (ddr);
944 1.1 mrg
945 1.1 mrg /* Don't do store elimination if there is any unknown dependence for
946 1.1 mrg any store data reference. */
947 1.1 mrg if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
948 1.1 mrg && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
949 1.1 mrg || DDR_NUM_DIST_VECTS (ddr) == 0))
950 1.1 mrg eliminate_store_p = false;
951 1.1 mrg
952 1.1 mrg ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
953 1.1 mrg ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
954 1.1 mrg if (ia == ib)
955 1.1 mrg continue;
956 1.1 mrg
957 1.1 mrg bad = component_of (comp_father, n);
958 1.1 mrg
959 1.1 mrg /* If both A and B are reads, we may ignore unsuitable dependences. */
960 1.1 mrg if (DR_IS_READ (dra) && DR_IS_READ (drb))
961 1.1 mrg {
962 1.1 mrg if (ia == bad || ib == bad
963 1.1 mrg || !determine_offset (dra, drb, &dummy_off))
964 1.1 mrg continue;
965 1.1 mrg }
966 1.1 mrg /* If A is a read and B a write or vice versa and there is an unsuitable
967 1.1 mrg dependence, instead of merging both components into a component
968 1.1 mrg that will certainly not pass suitable_component_p, just put the
969 1.1 mrg read into the bad component; perhaps at least the write together with
970 1.1 mrg all the other data refs in its component will be optimizable. */
971 1.1 mrg else if (DR_IS_READ (dra) && ib != bad)
972 1.1 mrg {
973 1.1 mrg if (ia == bad)
974 1.1 mrg {
975 1.1 mrg bitmap_set_bit (no_store_store_comps, ib);
976 1.1 mrg continue;
977 1.1 mrg }
978 1.1 mrg else if (!determine_offset (dra, drb, &dummy_off))
979 1.1 mrg {
980 1.1 mrg bitmap_set_bit (no_store_store_comps, ib);
981 1.1 mrg merge_comps (comp_father, comp_size, bad, ia);
982 1.1 mrg continue;
983 1.1 mrg }
984 1.1 mrg }
985 1.1 mrg else if (DR_IS_READ (drb) && ia != bad)
986 1.1 mrg {
987 1.1 mrg if (ib == bad)
988 1.1 mrg {
989 1.1 mrg bitmap_set_bit (no_store_store_comps, ia);
990 1.1 mrg continue;
991 1.1 mrg }
992 1.1 mrg else if (!determine_offset (dra, drb, &dummy_off))
993 1.1 mrg {
994 1.1 mrg bitmap_set_bit (no_store_store_comps, ia);
995 1.1 mrg merge_comps (comp_father, comp_size, bad, ib);
996 1.1 mrg continue;
997 1.1 mrg }
998 1.1 mrg }
999 1.1 mrg else if (DR_IS_WRITE (dra) && DR_IS_WRITE (drb)
1000 1.1 mrg && ia != bad && ib != bad
1001 1.1 mrg && !determine_offset (dra, drb, &dummy_off))
1002 1.1 mrg {
1003 1.1 mrg merge_comps (comp_father, comp_size, bad, ia);
1004 1.1 mrg merge_comps (comp_father, comp_size, bad, ib);
1005 1.1 mrg continue;
1006 1.1 mrg }
1007 1.1 mrg
1008 1.1 mrg merge_comps (comp_father, comp_size, ia, ib);
1009 1.1 mrg }
1010 1.1 mrg
1011 1.1 mrg if (eliminate_store_p)
1012 1.1 mrg {
1013 1.1 mrg tree niters = number_of_latch_executions (m_loop);
1014 1.1 mrg
1015 1.1 mrg /* Don't do store elimination if the niters info is unknown, because
1016 1.1 mrg stores in the last iteration can't be eliminated and we would need
1017 1.1 mrg to recover them after the loop. */
1018 1.1 mrg eliminate_store_p = (niters != NULL_TREE && niters != chrec_dont_know);
1019 1.1 mrg }
1020 1.1 mrg
1021 1.1 mrg auto_vec<struct component *> comps;
1022 1.1 mrg comps.safe_grow_cleared (n, true);
1023 1.1 mrg bad = component_of (comp_father, n);
1024 1.1 mrg FOR_EACH_VEC_ELT (m_datarefs, i, dr)
1025 1.1 mrg {
1026 1.1 mrg ia = (unsigned) (size_t) dr->aux;
1027 1.1 mrg ca = component_of (comp_father, ia);
1028 1.1 mrg if (ca == bad)
1029 1.1 mrg continue;
1030 1.1 mrg
1031 1.1 mrg comp = comps[ca];
1032 1.1 mrg if (!comp)
1033 1.1 mrg {
1034 1.1 mrg comp = new component (eliminate_store_p);
1035 1.1 mrg comp->refs.reserve_exact (comp_size[ca]);
1036 1.1 mrg comps[ca] = comp;
1037 1.1 mrg }
1038 1.1 mrg
1039 1.1 mrg dataref = XCNEW (class dref_d);
1040 1.1 mrg dataref->ref = dr;
1041 1.1 mrg dataref->stmt = DR_STMT (dr);
1042 1.1 mrg dataref->offset = 0;
1043 1.1 mrg dataref->distance = 0;
1044 1.1 mrg
1045 1.1 mrg dataref->always_accessed
1046 1.1 mrg = dominated_by_p (CDI_DOMINATORS, last_always_executed,
1047 1.1 mrg gimple_bb (dataref->stmt));
1048 1.1 mrg dataref->pos = comp->refs.length ();
1049 1.1 mrg comp->refs.quick_push (dataref);
1050 1.1 mrg }
1051 1.1 mrg
1052 1.1 mrg if (eliminate_store_p)
1053 1.1 mrg {
1054 1.1 mrg bitmap_iterator bi;
1055 1.1 mrg EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi)
1056 1.1 mrg {
1057 1.1 mrg ca = component_of (comp_father, ia);
1058 1.1 mrg if (ca != bad)
1059 1.1 mrg comps[ca]->eliminate_store_p = false;
1060 1.1 mrg }
1061 1.1 mrg }
1062 1.1 mrg
1063 1.1 mrg for (i = 0; i < n; i++)
1064 1.1 mrg {
1065 1.1 mrg comp = comps[i];
1066 1.1 mrg if (comp)
1067 1.1 mrg {
1068 1.1 mrg comp->next = comp_list;
1069 1.1 mrg comp_list = comp;
1070 1.1 mrg }
1071 1.1 mrg }
1072 1.1 mrg return comp_list;
1073 1.1 mrg }
1074 1.1 mrg
1075 1.1 mrg /* Returns true if the component COMP satisfies the conditions
1076 1.1 mrg described in 2) at the beginning of this file. */
1077 1.1 mrg
1078 1.1 mrg bool
1079 1.1 mrg pcom_worker::suitable_component_p (struct component *comp)
1080 1.1 mrg {
1081 1.1 mrg unsigned i;
1082 1.1 mrg dref a, first;
1083 1.1 mrg basic_block ba, bp = m_loop->header;
1084 1.1 mrg bool ok, has_write = false;
1085 1.1 mrg
1086 1.1 mrg FOR_EACH_VEC_ELT (comp->refs, i, a)
1087 1.1 mrg {
1088 1.1 mrg ba = gimple_bb (a->stmt);
1089 1.1 mrg
1090 1.1 mrg if (!just_once_each_iteration_p (m_loop, ba))
1091 1.1 mrg return false;
1092 1.1 mrg
1093 1.1 mrg gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
1094 1.1 mrg bp = ba;
1095 1.1 mrg
1096 1.1 mrg if (DR_IS_WRITE (a->ref))
1097 1.1 mrg has_write = true;
1098 1.1 mrg }
1099 1.1 mrg
1100 1.1 mrg first = comp->refs[0];
1101 1.1 mrg ok = suitable_reference_p (first->ref, &comp->comp_step);
1102 1.1 mrg gcc_assert (ok);
1103 1.1 mrg first->offset = 0;
1104 1.1 mrg
1105 1.1 mrg for (i = 1; comp->refs.iterate (i, &a); i++)
1106 1.1 mrg {
1107 1.1 mrg /* Polynomial offsets are of no use, since we need to know the
1108 1.1 mrg gap between iteration numbers at compile time. */
1109 1.1 mrg poly_widest_int offset;
1110 1.1 mrg if (!determine_offset (first->ref, a->ref, &offset)
1111 1.1 mrg || !offset.is_constant (&a->offset))
1112 1.1 mrg return false;
1113 1.1 mrg
1114 1.1 mrg enum ref_step_type a_step;
1115 1.1 mrg gcc_checking_assert (suitable_reference_p (a->ref, &a_step)
1116 1.1 mrg && a_step == comp->comp_step);
1117 1.1 mrg }
1118 1.1 mrg
1119 1.1 mrg /* If there is a write inside the component, we must know whether the
1120 1.1 mrg step is nonzero or not -- we would not otherwise be able to recognize
1121 1.1 mrg whether the value accessed by reads comes from the OFFSET-th iteration
1122 1.1 mrg or the previous one. */
1123 1.1 mrg if (has_write && comp->comp_step == RS_ANY)
1124 1.1 mrg return false;
1125 1.1 mrg
1126 1.1 mrg return true;
1127 1.1 mrg }
1128 1.1 mrg
1129 1.1 mrg /* Check the conditions on references inside each of components COMPS,
1130 1.1 mrg and remove the unsuitable components from the list. The new list
1131 1.1 mrg of components is returned. The conditions are described in 2) at
1132 1.1 mrg the beginning of this file. */
1133 1.1 mrg
1134 1.1 mrg struct component *
1135 1.1 mrg pcom_worker::filter_suitable_components (struct component *comps)
1136 1.1 mrg {
1137 1.1 mrg struct component **comp, *act;
1138 1.1 mrg
1139 1.1 mrg for (comp = &comps; *comp; )
1140 1.1 mrg {
1141 1.1 mrg act = *comp;
1142 1.1 mrg if (suitable_component_p (act))
1143 1.1 mrg comp = &act->next;
1144 1.1 mrg else
1145 1.1 mrg {
1146 1.1 mrg dref ref;
1147 1.1 mrg unsigned i;
1148 1.1 mrg
1149 1.1 mrg *comp = act->next;
1150 1.1 mrg FOR_EACH_VEC_ELT (act->refs, i, ref)
1151 1.1 mrg free (ref);
1152 1.1 mrg delete act;
1153 1.1 mrg }
1154 1.1 mrg }
1155 1.1 mrg
1156 1.1 mrg return comps;
1157 1.1 mrg }
1158 1.1 mrg
1159 1.1 mrg /* Compares two drefs A and B by their offset and position. Callback for
1160 1.1 mrg qsort. */
1161 1.1 mrg
1162 1.1 mrg static int
1163 1.1 mrg order_drefs (const void *a, const void *b)
1164 1.1 mrg {
1165 1.1 mrg const dref *const da = (const dref *) a;
1166 1.1 mrg const dref *const db = (const dref *) b;
1167 1.1 mrg int offcmp = wi::cmps ((*da)->offset, (*db)->offset);
1168 1.1 mrg
1169 1.1 mrg if (offcmp != 0)
1170 1.1 mrg return offcmp;
1171 1.1 mrg
1172 1.1 mrg return (*da)->pos - (*db)->pos;
1173 1.1 mrg }
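The ordering implemented by order_drefs can be exercised in isolation. A stand-alone sketch with a hypothetical simplified record type (the real dref carries widest-int offsets, hence the wi::cmps call above):

```c
#include <stdlib.h>

/* Stand-alone sketch of the order_drefs ordering: sort by offset
   first and break ties by original position, so that after qsort the
   reference with the smallest offset and earliest position comes
   first and becomes the chain root.  */

struct toy_dref
{
  long offset;
  int pos;
};

static int
toy_order_drefs (const void *a, const void *b)
{
  const struct toy_dref *da = (const struct toy_dref *) a;
  const struct toy_dref *db = (const struct toy_dref *) b;

  if (da->offset != db->offset)
    return da->offset < db->offset ? -1 : 1;

  return da->pos - db->pos;
}
```

Comparing the offsets with an explicit branch rather than subtraction avoids overflow for widely separated offsets; the position tie-break keeps the sort stable with respect to the original statement order.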
1174 1.1 mrg
1175 1.1 mrg /* Compares two drefs A and B by their position. Callback for qsort. */
1176 1.1 mrg
1177 1.1 mrg static int
1178 1.1 mrg order_drefs_by_pos (const void *a, const void *b)
1179 1.1 mrg {
1180 1.1 mrg const dref *const da = (const dref *) a;
1181 1.1 mrg const dref *const db = (const dref *) b;
1182 1.1 mrg
1183 1.1 mrg return (*da)->pos - (*db)->pos;
1184 1.1 mrg }
1185 1.1 mrg
1186 1.1 mrg /* Returns root of the CHAIN. */
1187 1.1 mrg
1188 1.1 mrg static inline dref
1189 1.1 mrg get_chain_root (chain_p chain)
1190 1.1 mrg {
1191 1.1 mrg return chain->refs[0];
1192 1.1 mrg }
1193 1.1 mrg
1194 1.1 mrg /* Given CHAIN, returns the last write ref at DISTANCE, or NULL if it doesn't
1195 1.1 mrg exist. */
1196 1.1 mrg
1197 1.1 mrg static inline dref
1198 1.1 mrg get_chain_last_write_at (chain_p chain, unsigned distance)
1199 1.1 mrg {
1200 1.1 mrg for (unsigned i = chain->refs.length (); i > 0; i--)
1201 1.1 mrg if (DR_IS_WRITE (chain->refs[i - 1]->ref)
1202 1.1 mrg && distance == chain->refs[i - 1]->distance)
1203 1.1 mrg return chain->refs[i - 1];
1204 1.1 mrg
1205 1.1 mrg return NULL;
1206 1.1 mrg }
1207 1.1 mrg
1208 1.1 mrg /* Given CHAIN, returns the last write ref with the same distance before load
1209 1.1 mrg at index LOAD_IDX, or NULL if it doesn't exist. */
1210 1.1 mrg
1211 1.1 mrg static inline dref
1212 1.1 mrg get_chain_last_write_before_load (chain_p chain, unsigned load_idx)
1213 1.1 mrg {
1214 1.1 mrg gcc_assert (load_idx < chain->refs.length ());
1215 1.1 mrg
1216 1.1 mrg unsigned distance = chain->refs[load_idx]->distance;
1217 1.1 mrg
1218 1.1 mrg for (unsigned i = load_idx; i > 0; i--)
1219 1.1 mrg if (DR_IS_WRITE (chain->refs[i - 1]->ref)
1220 1.1 mrg && distance == chain->refs[i - 1]->distance)
1221 1.1 mrg return chain->refs[i - 1];
1222 1.1 mrg
1223 1.1 mrg return NULL;
1224 1.1 mrg }
1225 1.1 mrg
1226 1.1 mrg /* Adds REF to the chain CHAIN. */
1227 1.1 mrg
1228 1.1 mrg static void
1229 1.1 mrg add_ref_to_chain (chain_p chain, dref ref)
1230 1.1 mrg {
1231 1.1 mrg dref root = get_chain_root (chain);
1232 1.1 mrg
1233 1.1 mrg gcc_assert (wi::les_p (root->offset, ref->offset));
1234 1.1 mrg widest_int dist = ref->offset - root->offset;
1235 1.1 mrg gcc_assert (wi::fits_uhwi_p (dist));
1236 1.1 mrg
1237 1.1 mrg chain->refs.safe_push (ref);
1238 1.1 mrg
1239 1.1 mrg ref->distance = dist.to_uhwi ();
1240 1.1 mrg
1241 1.1 mrg if (ref->distance >= chain->length)
1242 1.1 mrg {
1243 1.1 mrg chain->length = ref->distance;
1244 1.1 mrg chain->has_max_use_after = false;
1245 1.1 mrg }
1246 1.1 mrg
1247 1.1 mrg /* Promote this chain to CT_STORE_STORE if it has multiple stores. */
1248 1.1 mrg if (DR_IS_WRITE (ref->ref))
1249 1.1 mrg chain->type = CT_STORE_STORE;
1250 1.1 mrg
1251 1.1 mrg /* Don't set the flag for a store-store chain, since it has no uses. */
1252 1.1 mrg if (chain->type != CT_STORE_STORE
1253 1.1 mrg && ref->distance == chain->length
1254 1.1 mrg && ref->pos > root->pos)
1255 1.1 mrg chain->has_max_use_after = true;
1256 1.1 mrg
1257 1.1 mrg chain->all_always_accessed &= ref->always_accessed;
1258 1.1 mrg }
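The length and has_max_use_after bookkeeping in add_ref_to_chain can be distilled into a few lines. A toy sketch (hypothetical simplified chain record; the real code derives DIST from the offset difference to the chain root and skips the flag for store-store chains):

```c
#include <stdbool.h>

/* Toy model of the bookkeeping in add_ref_to_chain: the chain length
   is the maximal distance from the root seen so far, and
   has_max_use_after records whether some reference at that maximal
   distance occurs after the root in the loop body.  */

struct toy_chain
{
  unsigned length;
  bool has_max_use_after;
};

static void
toy_add_ref (struct toy_chain *chain, unsigned dist, bool after_root)
{
  /* A new maximal distance invalidates the flag; note the >= so that
     a reference tying the current maximum also re-evaluates it.  */
  if (dist >= chain->length)
    {
      chain->length = dist;
      chain->has_max_use_after = false;
    }

  if (dist == chain->length && after_root)
    chain->has_max_use_after = true;
}
```

The flag matters later because a chain whose maximal-distance use follows the root needs one more temporary variable than one whose farthest use precedes it.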
1259 1.1 mrg
1260 1.1 mrg /* Returns the chain for invariant component COMP. */
1261 1.1 mrg
1262 1.1 mrg static chain_p
1263 1.1 mrg make_invariant_chain (struct component *comp)
1264 1.1 mrg {
1265 1.1 mrg chain_p chain = new struct chain (CT_INVARIANT);
1266 1.1 mrg unsigned i;
1267 1.1 mrg dref ref;
1268 1.1 mrg
1269 1.1 mrg chain->all_always_accessed = true;
1270 1.1 mrg
1271 1.1 mrg FOR_EACH_VEC_ELT (comp->refs, i, ref)
1272 1.1 mrg {
1273 1.1 mrg chain->refs.safe_push (ref);
1274 1.1 mrg chain->all_always_accessed &= ref->always_accessed;
1275 1.1 mrg }
1276 1.1 mrg
1277 1.1 mrg chain->inits = vNULL;
1278 1.1 mrg chain->finis = vNULL;
1279 1.1 mrg
1280 1.1 mrg return chain;
1281 1.1 mrg }
1282 1.1 mrg
1283 1.1 mrg /* Make a new chain of type TYPE rooted at REF. */
1284 1.1 mrg
1285 1.1 mrg static chain_p
1286 1.1 mrg make_rooted_chain (dref ref, enum chain_type type)
1287 1.1 mrg {
1288 1.1 mrg chain_p chain = new struct chain (type);
1289 1.1 mrg
1290 1.1 mrg chain->refs.safe_push (ref);
1291 1.1 mrg chain->all_always_accessed = ref->always_accessed;
1292 1.1 mrg ref->distance = 0;
1293 1.1 mrg
1294 1.1 mrg chain->inits = vNULL;
1295 1.1 mrg chain->finis = vNULL;
1296 1.1 mrg
1297 1.1 mrg return chain;
1298 1.1 mrg }
1299 1.1 mrg
1300 1.1 mrg /* Returns true if CHAIN is not trivial. */
1301 1.1 mrg
1302 1.1 mrg static bool
1303 1.1 mrg nontrivial_chain_p (chain_p chain)
1304 1.1 mrg {
1305 1.1 mrg return chain != NULL && chain->refs.length () > 1;
1306 1.1 mrg }
1307 1.1 mrg
1308 1.1 mrg /* Returns the ssa name that contains the value of REF, or NULL_TREE if there
1309 1.1 mrg is no such name. */
1310 1.1 mrg
1311 1.1 mrg static tree
1312 1.1 mrg name_for_ref (dref ref)
1313 1.1 mrg {
1314 1.1 mrg tree name;
1315 1.1 mrg
1316 1.1 mrg if (is_gimple_assign (ref->stmt))
1317 1.1 mrg {
1318 1.1 mrg if (!ref->ref || DR_IS_READ (ref->ref))
1319 1.1 mrg name = gimple_assign_lhs (ref->stmt);
1320 1.1 mrg else
1321 1.1 mrg name = gimple_assign_rhs1 (ref->stmt);
1322 1.1 mrg }
1323 1.1 mrg else
1324 1.1 mrg name = PHI_RESULT (ref->stmt);
1325 1.1 mrg
1326 1.1 mrg return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
1327 1.1 mrg }
1328 1.1 mrg
1329 1.1 mrg /* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
1330 1.1 mrg iterations of the innermost enclosing loop). */
1331 1.1 mrg
1332 1.1 mrg bool
1333 1.1 mrg pcom_worker::valid_initializer_p (struct data_reference *ref, unsigned distance,
1334 1.1 mrg struct data_reference *root)
1335 1.1 mrg {
1336 1.1 mrg aff_tree diff, base, step;
1337 1.1 mrg poly_widest_int off;
1338 1.1 mrg
1339 1.1 mrg /* Both REF and ROOT must be accessing the same object. */
1340 1.1 mrg if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
1341 1.1 mrg return false;
1342 1.1 mrg
1343 1.1 mrg /* The initializer is defined outside of the loop, hence its address must
1344 1.1 mrg be invariant inside the loop. */
1345 1.1 mrg gcc_assert (integer_zerop (DR_STEP (ref)));
1346 1.1 mrg
1347 1.1 mrg /* If the address of the reference is invariant, initializer must access
1348 1.1 mrg exactly the same location. */
1349 1.1 mrg if (integer_zerop (DR_STEP (root)))
1350 1.1 mrg return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
1351 1.1 mrg && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));
1352 1.1 mrg
1353 1.1 mrg /* Verify that this index of REF is equal to the root's index at
1354 1.1 mrg -DISTANCE-th iteration. */
1355 1.1 mrg aff_combination_dr_offset (root, &diff);
1356 1.1 mrg aff_combination_dr_offset (ref, &base);
1357 1.1 mrg aff_combination_scale (&base, -1);
1358 1.1 mrg aff_combination_add (&diff, &base);
1359 1.1 mrg
1360 1.1 mrg tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
1361 1.1 mrg &step, &m_cache);
1362 1.1 mrg if (!aff_combination_constant_multiple_p (&diff, &step, &off))
1363 1.1 mrg return false;
1364 1.1 mrg
1365 1.1 mrg if (maybe_ne (off, distance))
1366 1.1 mrg return false;
1367 1.1 mrg
1368 1.1 mrg return true;
1369 1.1 mrg }
1370 1.1 mrg
1371 1.1 mrg /* Finds the looparound phi node of the loop that copies the value of REF,
1372 1.1 mrg and if its initial value is correct (equal to the initial value of REF
1373 1.1 mrg shifted by one iteration), returns the phi node. Otherwise, NULL is
1374 1.1 mrg returned. ROOT is the root of the current chain. */
1375 1.1 mrg
1376 1.1 mrg gphi *
1377 1.1 mrg pcom_worker::find_looparound_phi (dref ref, dref root)
1378 1.1 mrg {
1379 1.1 mrg tree name, init, init_ref;
1380 1.1 mrg gimple *init_stmt;
1381 1.1 mrg edge latch = loop_latch_edge (m_loop);
1382 1.1 mrg struct data_reference init_dr;
1383 1.1 mrg gphi_iterator psi;
1384 1.1 mrg
1385 1.1 mrg if (is_gimple_assign (ref->stmt))
1386 1.1 mrg {
1387 1.1 mrg if (DR_IS_READ (ref->ref))
1388 1.1 mrg name = gimple_assign_lhs (ref->stmt);
1389 1.1 mrg else
1390 1.1 mrg name = gimple_assign_rhs1 (ref->stmt);
1391 1.1 mrg }
1392 1.1 mrg else
1393 1.1 mrg name = PHI_RESULT (ref->stmt);
1394 1.1 mrg if (!name)
1395 1.1 mrg return NULL;
1396 1.1 mrg
1397 1.1 mrg tree entry_vuse = NULL_TREE;
1398 1.1 mrg gphi *phi = NULL;
1399 1.1 mrg for (psi = gsi_start_phis (m_loop->header); !gsi_end_p (psi); gsi_next (&psi))
1400 1.1 mrg {
1401 1.1 mrg gphi *p = psi.phi ();
1402 1.1 mrg if (PHI_ARG_DEF_FROM_EDGE (p, latch) == name)
1403 1.1 mrg phi = p;
1404 1.1 mrg else if (virtual_operand_p (gimple_phi_result (p)))
1405 1.1 mrg entry_vuse = PHI_ARG_DEF_FROM_EDGE (p, loop_preheader_edge (m_loop));
1406 1.1 mrg if (phi && entry_vuse)
1407 1.1 mrg break;
1408 1.1 mrg }
1409 1.1 mrg if (!phi || !entry_vuse)
1410 1.1 mrg return NULL;
1411 1.1 mrg
1412 1.1 mrg init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (m_loop));
1413 1.1 mrg if (TREE_CODE (init) != SSA_NAME)
1414 1.1 mrg return NULL;
1415 1.1 mrg init_stmt = SSA_NAME_DEF_STMT (init);
1416 1.1 mrg if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
1417 1.1 mrg return NULL;
1418 1.1 mrg gcc_assert (gimple_assign_lhs (init_stmt) == init);
1419 1.1 mrg
1420 1.1 mrg init_ref = gimple_assign_rhs1 (init_stmt);
1421 1.1 mrg if (!REFERENCE_CLASS_P (init_ref)
1422 1.1 mrg && !DECL_P (init_ref))
1423 1.1 mrg return NULL;
1424 1.1 mrg
1425 1.1 mrg /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
1426 1.1 mrg loop enclosing PHI). */
1427 1.1 mrg memset (&init_dr, 0, sizeof (struct data_reference));
1428 1.1 mrg DR_REF (&init_dr) = init_ref;
1429 1.1 mrg DR_STMT (&init_dr) = phi;
1430 1.1 mrg if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, m_loop,
1431 1.1 mrg init_stmt))
1432 1.1 mrg return NULL;
1433 1.1 mrg
1434 1.1 mrg if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
1435 1.1 mrg return NULL;
1436 1.1 mrg
1437 1.1 mrg /* Make sure nothing clobbers the location we re-use the initial value
1438 1.1 mrg from. */
1439 1.1 mrg if (entry_vuse != gimple_vuse (init_stmt))
1440 1.1 mrg {
1441 1.1 mrg ao_ref ref;
1442 1.1 mrg ao_ref_init (&ref, init_ref);
1443 1.1 mrg unsigned limit = param_sccvn_max_alias_queries_per_access;
1444 1.1 mrg tree vdef = entry_vuse;
1445 1.1 mrg do
1446 1.1 mrg {
1447 1.1 mrg gimple *def = SSA_NAME_DEF_STMT (vdef);
1448 1.1 mrg if (limit-- == 0 || gimple_code (def) == GIMPLE_PHI)
1449 1.1 mrg return NULL;
1450 1.1 mrg if (stmt_may_clobber_ref_p_1 (def, &ref))
1451 1.1 mrg /* When the stmt is an assign to init_ref we could in theory
1452 1.1 mrg use its RHS for the initial value of the looparound PHI
1453 1.1 mrg we replace in prepare_initializers_chain, but we have
1454 1.1 mrg no convenient place to store this info at the moment. */
1455 1.1 mrg return NULL;
1456 1.1 mrg vdef = gimple_vuse (def);
1457 1.1 mrg }
1458 1.1 mrg while (vdef != gimple_vuse (init_stmt));
1459 1.1 mrg }
1460 1.1 mrg
1461 1.1 mrg return phi;
1462 1.1 mrg }
1463 1.1 mrg
1464 1.1 mrg /* Adds a reference for the looparound copy of REF in PHI to CHAIN. */
1465 1.1 mrg
1466 1.1 mrg static void
1467 1.1 mrg insert_looparound_copy (chain_p chain, dref ref, gphi *phi)
1468 1.1 mrg {
1469 1.1 mrg dref nw = XCNEW (class dref_d), aref;
1470 1.1 mrg unsigned i;
1471 1.1 mrg
1472 1.1 mrg nw->stmt = phi;
1473 1.1 mrg nw->distance = ref->distance + 1;
1474 1.1 mrg nw->always_accessed = 1;
1475 1.1 mrg
1476 1.1 mrg FOR_EACH_VEC_ELT (chain->refs, i, aref)
1477 1.1 mrg if (aref->distance >= nw->distance)
1478 1.1 mrg break;
1479 1.1 mrg chain->refs.safe_insert (i, nw);
1480 1.1 mrg
1481 1.1 mrg if (nw->distance > chain->length)
1482 1.1 mrg {
1483 1.1 mrg chain->length = nw->distance;
1484 1.1 mrg chain->has_max_use_after = false;
1485 1.1 mrg }
1486 1.1 mrg }
1487 1.1 mrg
1488 1.1 mrg /* For references in CHAIN that are copied around the loop (created previously
1489 1.1 mrg by PRE, or by the user), add the results of such copies to the chain.
1490 1.1 mrg This enables us to remove the copies by unrolling, and may need fewer
1491 1.1 mrg registers (also, it may allow us to combine chains together). */
1492 1.1 mrg
1493 1.1 mrg void
1494 1.1 mrg pcom_worker::add_looparound_copies (chain_p chain)
1495 1.1 mrg {
1496 1.1 mrg unsigned i;
1497 1.1 mrg dref ref, root = get_chain_root (chain);
1498 1.1 mrg gphi *phi;
1499 1.1 mrg
1500 1.1 mrg if (chain->type == CT_STORE_STORE)
1501 1.1 mrg return;
1502 1.1 mrg
1503 1.1 mrg FOR_EACH_VEC_ELT (chain->refs, i, ref)
1504 1.1 mrg {
1505 1.1 mrg phi = find_looparound_phi (ref, root);
1506 1.1 mrg if (!phi)
1507 1.1 mrg continue;
1508 1.1 mrg
1509 1.1 mrg bitmap_set_bit (m_looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
1510 1.1 mrg insert_looparound_copy (chain, ref, phi);
1511 1.1 mrg }
1512 1.1 mrg }
1513 1.1 mrg
1514 1.1 mrg /* Find roots of the values and determine distances in the component COMP.
1515 1.1 mrg The references are redistributed into chains. */
1516 1.1 mrg
1517 1.1 mrg void
1518 1.1 mrg pcom_worker::determine_roots_comp (struct component *comp)
1519 1.1 mrg {
1520 1.1 mrg unsigned i;
1521 1.1 mrg dref a;
1522 1.1 mrg chain_p chain = NULL;
1523 1.1 mrg widest_int last_ofs = 0;
1524 1.1 mrg enum chain_type type;
1525 1.1 mrg
1526 1.1 mrg /* Invariants are handled specially. */
1527 1.1 mrg if (comp->comp_step == RS_INVARIANT)
1528 1.1 mrg {
1529 1.1 mrg chain = make_invariant_chain (comp);
1530 1.1 mrg m_chains.safe_push (chain);
1531 1.1 mrg return;
1532 1.1 mrg }
1533 1.1 mrg
1534 1.1 mrg /* Trivial component. */
1535 1.1 mrg if (comp->refs.length () <= 1)
1536 1.1 mrg {
1537 1.1 mrg if (comp->refs.length () == 1)
1538 1.1 mrg {
1539 1.1 mrg free (comp->refs[0]);
1540 1.1 mrg comp->refs.truncate (0);
1541 1.1 mrg }
1542 1.1 mrg return;
1543 1.1 mrg }
1544 1.1 mrg
1545 1.1 mrg comp->refs.qsort (order_drefs);
1546 1.1 mrg
1547 1.1 mrg /* For a Store-Store chain, we only support a load if it is dominated by a
1548 1.1 mrg store statement in the same iteration of the loop. */
1549 1.1 mrg if (comp->eliminate_store_p)
1550 1.1 mrg for (a = NULL, i = 0; i < comp->refs.length (); i++)
1551 1.1 mrg {
1552 1.1 mrg if (DR_IS_WRITE (comp->refs[i]->ref))
1553 1.1 mrg a = comp->refs[i];
1554 1.1 mrg else if (a == NULL || a->offset != comp->refs[i]->offset)
1555 1.1 mrg {
1556 1.1 mrg /* If there is a load that is not dominated by a store in the
1557 1.1 mrg same iteration of the loop, clear the flag so no Store-Store
1558 1.1 mrg chain is generated for this component. */
1559 1.1 mrg comp->eliminate_store_p = false;
1560 1.1 mrg break;
1561 1.1 mrg }
1562 1.1 mrg }
1563 1.1 mrg
1564 1.1 mrg /* Determine roots and create chains for components. */
1565 1.1 mrg FOR_EACH_VEC_ELT (comp->refs, i, a)
1566 1.1 mrg {
1567 1.1 mrg if (!chain
1568 1.1 mrg || (chain->type == CT_LOAD && DR_IS_WRITE (a->ref))
1569 1.1 mrg || (!comp->eliminate_store_p && DR_IS_WRITE (a->ref))
1570 1.1 mrg || wi::leu_p (MAX_DISTANCE, a->offset - last_ofs))
1571 1.1 mrg {
1572 1.1 mrg if (nontrivial_chain_p (chain))
1573 1.1 mrg {
1574 1.1 mrg add_looparound_copies (chain);
1575 1.1 mrg m_chains.safe_push (chain);
1576 1.1 mrg }
1577 1.1 mrg else
1578 1.1 mrg release_chain (chain);
1579 1.1 mrg
1580 1.1 mrg /* Determine the type of the chain. If the root reference is a load,
1581 1.1 mrg this can only be a CT_LOAD chain; other chains are initialized
1582 1.1 mrg to CT_STORE_LOAD and might be promoted to CT_STORE_STORE when a
1583 1.1 mrg new reference is added. */
1584 1.1 mrg type = DR_IS_READ (a->ref) ? CT_LOAD : CT_STORE_LOAD;
1585 1.1 mrg chain = make_rooted_chain (a, type);
1586 1.1 mrg last_ofs = a->offset;
1587 1.1 mrg continue;
1588 1.1 mrg }
1589 1.1 mrg
1590 1.1 mrg add_ref_to_chain (chain, a);
1591 1.1 mrg }
1592 1.1 mrg
1593 1.1 mrg if (nontrivial_chain_p (chain))
1594 1.1 mrg {
1595 1.1 mrg add_looparound_copies (chain);
1596 1.1 mrg m_chains.safe_push (chain);
1597 1.1 mrg }
1598 1.1 mrg else
1599 1.1 mrg release_chain (chain);
1600 1.1 mrg }
1601 1.1 mrg
1602 1.1 mrg /* Find roots of the values and determine distances in components COMPS, and
1603 1.1 mrg separate the references into chains. */
1604 1.1 mrg
1605 1.1 mrg void
1606 1.1 mrg pcom_worker::determine_roots (struct component *comps)
1607 1.1 mrg {
1608 1.1 mrg struct component *comp;
1609 1.1 mrg
1610 1.1 mrg for (comp = comps; comp; comp = comp->next)
1611 1.1 mrg determine_roots_comp (comp);
1612 1.1 mrg }
1613 1.1 mrg
1614 1.1 mrg /* Replace the reference in statement STMT with temporary variable
1615 1.1 mrg NEW_TREE. If SET is true, NEW_TREE is instead initialized to the value of
1616 1.1 mrg the reference in the statement. IN_LHS is true if the reference
1617 1.1 mrg is in the lhs of STMT, false if it is in rhs. */
1618 1.1 mrg
1619 1.1 mrg static void
1620 1.1 mrg replace_ref_with (gimple *stmt, tree new_tree, bool set, bool in_lhs)
1621 1.1 mrg {
1622 1.1 mrg tree val;
1623 1.1 mrg gassign *new_stmt;
1624 1.1 mrg gimple_stmt_iterator bsi, psi;
1625 1.1 mrg
1626 1.1 mrg if (gimple_code (stmt) == GIMPLE_PHI)
1627 1.1 mrg {
      gcc_assert (!in_lhs && !set);

      val = PHI_RESULT (stmt);
      bsi = gsi_after_labels (gimple_bb (stmt));
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, false);

      /* Turn the phi node into a GIMPLE_ASSIGN.  */
      new_stmt = gimple_build_assign (val, new_tree);
      gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
      return;
    }

  /* Since the reference is of gimple_reg type, it should only
     appear as the lhs or rhs of a modify statement.  */
  gcc_assert (is_gimple_assign (stmt));

  bsi = gsi_for_stmt (stmt);

  /* If we do not need to initialize NEW_TREE, just replace the use of OLD.  */
  if (!set)
    {
      gcc_assert (!in_lhs);
      gimple_assign_set_rhs_from_tree (&bsi, new_tree);
      stmt = gsi_stmt (bsi);
      update_stmt (stmt);
      return;
    }

  if (in_lhs)
    {
      /* We have the statement

         OLD = VAL

         If OLD is a memory reference, then VAL is a gimple_val, and we
         transform this to

         OLD = VAL
         NEW = VAL

         Otherwise, we are replacing a combination chain, VAL is the
         expression that performs the combination, and OLD is an SSA name.
         In this case, we transform the assignment to

         OLD = VAL
         NEW = OLD  */

      val = gimple_assign_lhs (stmt);
      if (TREE_CODE (val) != SSA_NAME)
        {
          val = gimple_assign_rhs1 (stmt);
          gcc_assert (gimple_assign_single_p (stmt));
          if (TREE_CLOBBER_P (val))
            val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (new_tree));
          else
            gcc_assert (gimple_assign_copy_p (stmt));
        }
    }
  else
    {
      /* VAL = OLD

         is transformed to

         VAL = OLD
         NEW = VAL  */

      val = gimple_assign_lhs (stmt);
    }

  new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
  gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
}

/* Returns a memory reference to DR in the (NITERS + ITER)-th iteration
   of the loop it was analyzed in.  Append init stmts to STMTS.  */

static tree
ref_at_iteration (data_reference_p dr, int iter,
                  gimple_seq *stmts, tree niters = NULL_TREE)
{
  tree off = DR_OFFSET (dr);
  tree coff = DR_INIT (dr);
  tree ref = DR_REF (dr);
  enum tree_code ref_code = ERROR_MARK;
  tree ref_type = NULL_TREE;
  tree ref_op1 = NULL_TREE;
  tree ref_op2 = NULL_TREE;
  tree new_offset;

  if (iter != 0)
    {
      new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter));
      if (TREE_CODE (new_offset) == INTEGER_CST)
        coff = size_binop (PLUS_EXPR, coff, new_offset);
      else
        off = size_binop (PLUS_EXPR, off, new_offset);
    }

  if (niters != NULL_TREE)
    {
      niters = fold_convert (ssizetype, niters);
      new_offset = size_binop (MULT_EXPR, DR_STEP (dr), niters);
      if (TREE_CODE (niters) == INTEGER_CST)
        coff = size_binop (PLUS_EXPR, coff, new_offset);
      else
        off = size_binop (PLUS_EXPR, off, new_offset);
    }

  /* While data-ref analysis punts on bit offsets it still handles
     bitfield accesses at byte boundaries.  Cope with that.  Note that
     if the bitfield object also starts at a byte-boundary we can simply
     replicate the COMPONENT_REF, but we have to subtract the component's
     byte-offset from the MEM_REF address first.
     Otherwise we simply build a BIT_FIELD_REF knowing that the bits
     start at offset zero.  */
  if (TREE_CODE (ref) == COMPONENT_REF
      && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))
    {
      unsigned HOST_WIDE_INT boff;
      tree field = TREE_OPERAND (ref, 1);
      tree offset = component_ref_field_offset (ref);
      ref_type = TREE_TYPE (ref);
      boff = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field));
      /* This can occur in Ada.  See the comment in get_bit_range.  */
      if (boff % BITS_PER_UNIT != 0
          || !tree_fits_uhwi_p (offset))
        {
          ref_code = BIT_FIELD_REF;
          ref_op1 = DECL_SIZE (field);
          ref_op2 = bitsize_zero_node;
        }
      else
        {
          boff >>= LOG2_BITS_PER_UNIT;
          boff += tree_to_uhwi (offset);
          coff = size_binop (MINUS_EXPR, coff, ssize_int (boff));
          ref_code = COMPONENT_REF;
          ref_op1 = field;
          ref_op2 = TREE_OPERAND (ref, 2);
          ref = TREE_OPERAND (ref, 0);
        }
    }
  /* We may not associate the constant offset across the pointer plus
     expression because that might form a pointer to before the object
     then.  But for some cases we can retain that to allow tree_could_trap_p
     to return false - see gcc.dg/tree-ssa/predcom-1.c  */
  tree addr, alias_ptr;
  if (integer_zerop (off))
    {
      alias_ptr = fold_convert (reference_alias_ptr_type (ref), coff);
      addr = DR_BASE_ADDRESS (dr);
    }
  else
    {
      alias_ptr = build_zero_cst (reference_alias_ptr_type (ref));
      off = size_binop (PLUS_EXPR, off, coff);
      addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
    }
  addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
                                 is_gimple_mem_ref_addr, NULL_TREE);
  tree type = build_aligned_type (TREE_TYPE (ref),
                                  get_object_alignment (ref));
  ref = build2 (MEM_REF, type, addr, alias_ptr);
  if (ref_type)
    ref = build3 (ref_code, ref_type, ref, ref_op1, ref_op2);
  return ref;
}
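/* For intuition, the offset arithmetic ref_at_iteration performs (before the
   bitfield and aliasing fixups) can be sketched in flat integer form.  This
   is an illustrative stand-alone sketch, not GCC API: the function name and
   the use of plain `long` byte offsets are invented for the example.  */

```cpp
#include <cassert>

// Hypothetical sketch: the byte offset ref_at_iteration effectively
// computes for a data reference with constant DR_INIT and DR_STEP,
// evaluated ITER iterations after the first NITERS iterations of the
// analyzed loop.  The constant part (coff above) accumulates DR_INIT
// plus the constant multiples of DR_STEP.
static long
ref_offset_at_iteration (long init, long step, long niters, long iter)
{
  return init + step * (niters + iter);
}
```

In the real function, whichever of the two products folds to a constant is added to the constant offset COFF so that tree_could_trap_p can later see a constant displacement; the rest goes into the variable offset OFF.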

/* Get the initialization expression for the INDEX-th temporary variable
   of CHAIN.  */

static tree
get_init_expr (chain_p chain, unsigned index)
{
  if (chain->type == CT_COMBINATION)
    {
      tree e1 = get_init_expr (chain->ch1, index);
      tree e2 = get_init_expr (chain->ch2, index);

      return fold_build2 (chain->op, chain->rslt_type, e1, e2);
    }
  else
    return chain->inits[index];
}

/* Returns a new temporary variable used for the I-th variable carrying
   value of REF.  The variable's uid is marked in TMP_VARS.  */

static tree
predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
{
  tree type = TREE_TYPE (ref);
  /* We never access the components of the temporary variable in predictive
     commoning.  */
  tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i));
  bitmap_set_bit (tmp_vars, DECL_UID (var));
  return var;
}

/* Creates the variables for CHAIN, as well as phi nodes for them and
   initialization on entry to LOOP.  Uids of the newly created
   temporary variables are marked in TMP_VARS.  */

static void
initialize_root_vars (class loop *loop, chain_p chain, bitmap tmp_vars)
{
  unsigned i;
  unsigned n = chain->length;
  dref root = get_chain_root (chain);
  bool reuse_first = !chain->has_max_use_after;
  tree ref, init, var, next;
  gphi *phi;
  gimple_seq stmts;
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);

  /* If N == 0, then all the references are within the single iteration.  And
     since this is a nonempty chain, reuse_first cannot be true.  */
  gcc_assert (n > 0 || !reuse_first);

  chain->vars.create (n + 1);

  if (chain->type == CT_COMBINATION)
    ref = gimple_assign_lhs (root->stmt);
  else
    ref = DR_REF (root->ref);

  for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
    {
      var = predcom_tmp_var (ref, i, tmp_vars);
      chain->vars.quick_push (var);
    }
  if (reuse_first)
    chain->vars.quick_push (chain->vars[0]);

  FOR_EACH_VEC_ELT (chain->vars, i, var)
    chain->vars[i] = make_ssa_name (var);

  for (i = 0; i < n; i++)
    {
      var = chain->vars[i];
      next = chain->vars[i + 1];
      init = get_init_expr (chain, i);

      init = force_gimple_operand (init, &stmts, true, NULL_TREE);
      if (stmts)
        gsi_insert_seq_on_edge_immediate (entry, stmts);

      phi = create_phi_node (var, loop->header);
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
    }
}
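/* At the source level, the variables and loop-header PHIs created above
   amount to scalarizing the recurrence.  A hedged, purely illustrative
   sketch (the function name is invented; the real pass works on GIMPLE, not
   C) of what the a[i+2] = a[i] + a[i+1] example from the file header
   becomes:  */

```cpp
#include <cassert>

// Illustrative only: the effect of initialize_root_vars for the chain
// over a[i], a[i+1], a[i+2] (chain length 2, so temporaries t0, t1, t2).
// Caller must provide room for n + 2 elements in A.
static void
recurrence_commoned (int *a, int n)
{
  int t0 = a[0];          // init for distance 2, loaded before the loop
  int t1 = a[1];          // init for distance 1
  for (int i = 0; i < n; i++)
    {
      int t2 = t0 + t1;   // a[i] + a[i+1] without reloading memory
      a[i + 2] = t2;
      t0 = t1;            // these register copies correspond to the
      t1 = t2;            // loop-header PHIs; unrolling removes them
    }
}
```

The two loads per iteration of the naive loop are replaced by register moves, which determine_unroll_factor later tries to eliminate entirely by unrolling.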

/* For inter-iteration store elimination CHAIN in LOOP, returns true if
   all stores to be eliminated store loop invariant values into memory.
   In this case, we can use these invariant values directly after LOOP.  */

static bool
is_inv_store_elimination_chain (class loop *loop, chain_p chain)
{
  if (chain->length == 0 || chain->type != CT_STORE_STORE)
    return false;

  gcc_assert (!chain->has_max_use_after);

  /* If the loop iterates an unknown number of times, or fewer times than
     chain->length, we still need to set up the root variable and propagate
     it with a PHI node.  */
  tree niters = number_of_latch_executions (loop);
  if (TREE_CODE (niters) != INTEGER_CST
      || wi::leu_p (wi::to_wide (niters), chain->length))
    return false;

  /* Check whether the stores to be eliminated store only loop invariant
     values.  */
  for (unsigned i = 0; i < chain->length; i++)
    {
      dref a = get_chain_last_write_at (chain, i);
      if (a == NULL)
        continue;

      gimple *def_stmt, *stmt = a->stmt;
      if (!gimple_assign_single_p (stmt))
        return false;

      tree val = gimple_assign_rhs1 (stmt);
      if (TREE_CLOBBER_P (val))
        return false;

      if (CONSTANT_CLASS_P (val))
        continue;

      if (TREE_CODE (val) != SSA_NAME)
        return false;

      def_stmt = SSA_NAME_DEF_STMT (val);
      if (gimple_nop_p (def_stmt))
        continue;

      if (flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
        return false;
    }
  return true;
}

/* Creates root variables for store elimination CHAIN in which the stores
   to be eliminated store only loop invariant values.  In this case, we
   need neither to load the root variables before the loop nor to propagate
   them with PHI nodes.  */

static void
initialize_root_vars_store_elim_1 (chain_p chain)
{
  tree var;
  unsigned i, n = chain->length;

  chain->vars.create (n);
  chain->vars.safe_grow_cleared (n, true);

  /* Initialize root value for eliminated stores at each distance.  */
  for (i = 0; i < n; i++)
    {
      dref a = get_chain_last_write_at (chain, i);
      if (a == NULL)
        continue;

      var = gimple_assign_rhs1 (a->stmt);
      chain->vars[a->distance] = var;
    }

  /* We don't propagate values with PHI nodes, so manually propagate the
     value to bubble positions.  */
  var = chain->vars[0];
  for (i = 1; i < n; i++)
    {
      if (chain->vars[i] != NULL_TREE)
        {
          var = chain->vars[i];
          continue;
        }
      chain->vars[i] = var;
    }

  /* Reverse the vector.  */
  for (i = 0; i < n / 2; i++)
    std::swap (chain->vars[i], chain->vars[n - i - 1]);
}
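/* The "propagate to bubble positions, then reverse" step above can be
   sketched in isolation.  Illustrative only, with invented names and `int`
   placeholders instead of trees; EMPTY stands in for NULL_TREE:  */

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Fill each "bubble" (empty) slot with the nearest earlier value, then
// reverse the vector, as initialize_root_vars_store_elim_1 does on
// chain->vars.  VARS must be nonempty and VARS[0] must not be EMPTY.
static void
propagate_and_reverse (std::vector<int> &vars, int empty)
{
  int cur = vars[0];
  for (std::size_t i = 1; i < vars.size (); i++)
    {
      if (vars[i] != empty)
        cur = vars[i];        // a real root value; remember it
      else
        vars[i] = cur;        // bubble takes the propagated value
    }
  std::size_t n = vars.size ();
  for (std::size_t i = 0; i < n / 2; i++)
    std::swap (vars[i], vars[n - i - 1]);
}
```

For example, {7, EMPTY, 9, EMPTY} with EMPTY = -1 propagates to {7, 7, 9, 9} and reverses to {9, 9, 7, 7}.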

/* Creates root variables for store elimination CHAIN in which the stores
   to be eliminated store loop variant values.  In this case, we may need
   to load the root variables before LOOP and propagate them with PHI
   nodes.  Uids of the newly created root variables are marked in
   TMP_VARS.  */

static void
initialize_root_vars_store_elim_2 (class loop *loop,
                                   chain_p chain, bitmap tmp_vars)
{
  unsigned i, n = chain->length;
  tree ref, init, var, next, val, phi_result;
  gimple *stmt;
  gimple_seq stmts;

  chain->vars.create (n);

  ref = DR_REF (get_chain_root (chain)->ref);
  for (i = 0; i < n; i++)
    {
      var = predcom_tmp_var (ref, i, tmp_vars);
      chain->vars.quick_push (var);
    }

  FOR_EACH_VEC_ELT (chain->vars, i, var)
    chain->vars[i] = make_ssa_name (var);

  /* Root values are either the rhs operands of stores to be eliminated,
     or loaded from memory before the loop.  */
  auto_vec<tree> vtemps;
  vtemps.safe_grow_cleared (n, true);
  for (i = 0; i < n; i++)
    {
      init = get_init_expr (chain, i);
      if (init == NULL_TREE)
        {
          /* The root value is the rhs operand of the store to be
             eliminated if it isn't loaded from memory before the loop.  */
          dref a = get_chain_last_write_at (chain, i);
          val = gimple_assign_rhs1 (a->stmt);
          if (TREE_CLOBBER_P (val))
            {
              val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (var));
              gimple_assign_set_rhs1 (a->stmt, val);
            }

          vtemps[n - i - 1] = val;
        }
      else
        {
          edge latch = loop_latch_edge (loop);
          edge entry = loop_preheader_edge (loop);

          /* The root value is loaded from memory before the loop; we also
             need to add PHI nodes to propagate the value across
             iterations.  */
          init = force_gimple_operand (init, &stmts, true, NULL_TREE);
          if (stmts)
            gsi_insert_seq_on_edge_immediate (entry, stmts);

          next = chain->vars[n - i];
          phi_result = copy_ssa_name (next);
          gphi *phi = create_phi_node (phi_result, loop->header);
          add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
          add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
          vtemps[n - i - 1] = phi_result;
        }
    }

  /* Find the insertion position.  */
  dref last = get_chain_root (chain);
  for (i = 0; i < chain->refs.length (); i++)
    {
      if (chain->refs[i]->pos > last->pos)
        last = chain->refs[i];
    }

  gimple_stmt_iterator gsi = gsi_for_stmt (last->stmt);

  /* Insert statements copying root values to root variables.  */
  for (i = 0; i < n; i++)
    {
      var = chain->vars[i];
      val = vtemps[i];
      stmt = gimple_build_assign (var, val);
      gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
    }
}

/* Generates stores for CHAIN's eliminated stores in LOOP's last
   (CHAIN->length - 1) iterations.  */

static void
finalize_eliminated_stores (class loop *loop, chain_p chain)
{
  unsigned i, n = chain->length;

  for (i = 0; i < n; i++)
    {
      tree var = chain->vars[i];
      tree fini = chain->finis[n - i - 1];
      gimple *stmt = gimple_build_assign (fini, var);

      gimple_seq_add_stmt_without_update (&chain->fini_seq, stmt);
    }

  if (chain->fini_seq)
    {
      gsi_insert_seq_on_edge_immediate (single_exit (loop), chain->fini_seq);
      chain->fini_seq = NULL;
    }
}

/* Initializes a variable for load motion for ROOT and prepares phi nodes and
   initialization on entry to LOOP if necessary.  The ssa name for the variable
   is stored in VARS.  If WRITTEN is true, also a phi node to copy its value
   around the loop is created.  Uid of the newly created temporary variable
   is marked in TMP_VARS.  INITS is the list containing the (single)
   initializer.  */

static void
initialize_root_vars_lm (class loop *loop, dref root, bool written,
                         vec<tree> *vars, const vec<tree> &inits,
                         bitmap tmp_vars)
{
  unsigned i;
  tree ref = DR_REF (root->ref), init, var, next;
  gimple_seq stmts;
  gphi *phi;
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);

  /* Find the initializer for the variable, and check that it cannot
     trap.  */
  init = inits[0];

  vars->create (written ? 2 : 1);
  var = predcom_tmp_var (ref, 0, tmp_vars);
  vars->quick_push (var);
  if (written)
    vars->quick_push ((*vars)[0]);

  FOR_EACH_VEC_ELT (*vars, i, var)
    (*vars)[i] = make_ssa_name (var);

  var = (*vars)[0];

  init = force_gimple_operand (init, &stmts, written, NULL_TREE);
  if (stmts)
    gsi_insert_seq_on_edge_immediate (entry, stmts);

  if (written)
    {
      next = (*vars)[1];
      phi = create_phi_node (var, loop->header);
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
    }
  else
    {
      gassign *init_stmt = gimple_build_assign (var, init);
      gsi_insert_on_edge_immediate (entry, init_stmt);
    }
}


/* Execute load motion for references in chain CHAIN.  Uids of the newly
   created temporary variables are marked in TMP_VARS.  */

static void
execute_load_motion (class loop *loop, chain_p chain, bitmap tmp_vars)
{
  auto_vec<tree> vars;
  dref a;
  unsigned n_writes = 0, ridx, i;
  tree var;

  gcc_assert (chain->type == CT_INVARIANT);
  gcc_assert (!chain->combined);
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    if (DR_IS_WRITE (a->ref))
      n_writes++;

  /* If there are no reads in the loop, there is nothing to do.  */
  if (n_writes == chain->refs.length ())
    return;

  initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
                           &vars, chain->inits, tmp_vars);

  ridx = 0;
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    {
      bool is_read = DR_IS_READ (a->ref);

      if (DR_IS_WRITE (a->ref))
        {
          n_writes--;
          if (n_writes)
            {
              var = vars[0];
              var = make_ssa_name (SSA_NAME_VAR (var));
              vars[0] = var;
            }
          else
            ridx = 1;
        }

      replace_ref_with (a->stmt, vars[ridx],
                        !is_read, !is_read);
    }
}
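/* At the source level, the load-motion idea applied to the invariant
   reference b[10] from the file header can be sketched roughly as below.
   This is a loose, illustrative sketch with an invented function name; the
   actual pass keeps the stores in place and only rewrites the references to
   use the temporary, so the generated code differs in detail:  */

```cpp
#include <cassert>

// Illustrative only: b[10] = b[10] + i with the invariant load hoisted
// into a register.  B must have at least 11 elements.
static void
invariant_commoned (int *b, int n)
{
  int t = b[10];          // single load before the loop
  for (int i = 0; i < n; i++)
    t = t + i;            // the loop body now works on a register
  b[10] = t;              // value written back after the loop
}
```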

/* Returns the single statement in which NAME is used, except for
   the looparound phi nodes contained in one of the chains.  If there is no
   such statement, or more statements, NULL is returned.  */

gimple *
pcom_worker::single_nonlooparound_use (tree name)
{
  use_operand_p use;
  imm_use_iterator it;
  gimple *stmt, *ret = NULL;

  FOR_EACH_IMM_USE_FAST (use, it, name)
    {
      stmt = USE_STMT (use);

      if (gimple_code (stmt) == GIMPLE_PHI)
        {
          /* Ignore uses in looparound phi nodes.  Uses in other phi nodes
             could not be processed anyway, so just fail for them.  */
          if (bitmap_bit_p (m_looparound_phis,
                            SSA_NAME_VERSION (PHI_RESULT (stmt))))
            continue;

          return NULL;
        }
      else if (is_gimple_debug (stmt))
        continue;
      else if (ret != NULL)
        return NULL;
      else
        ret = stmt;
    }

  return ret;
}

/* Remove statement STMT, as well as the chain of assignments in which it is
   used.  */

void
pcom_worker::remove_stmt (gimple *stmt)
{
  tree name;
  gimple *next;
  gimple_stmt_iterator psi;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      name = PHI_RESULT (stmt);
      next = single_nonlooparound_use (name);
      reset_debug_uses (stmt);
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, true);

      if (!next
          || !gimple_assign_ssa_name_copy_p (next)
          || gimple_assign_rhs1 (next) != name)
        return;

      stmt = next;
    }

  while (1)
    {
      gimple_stmt_iterator bsi;

      bsi = gsi_for_stmt (stmt);

      name = gimple_assign_lhs (stmt);
      if (TREE_CODE (name) == SSA_NAME)
        {
          next = single_nonlooparound_use (name);
          reset_debug_uses (stmt);
        }
      else
        {
          /* This is a store to be eliminated.  */
          gcc_assert (gimple_vdef (stmt) != NULL);
          next = NULL;
        }

      unlink_stmt_vdef (stmt);
      gsi_remove (&bsi, true);
      release_defs (stmt);

      if (!next
          || !gimple_assign_ssa_name_copy_p (next)
          || gimple_assign_rhs1 (next) != name)
        return;

      stmt = next;
    }
}

/* Perform the predictive commoning optimization for a chain CHAIN.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

void
pcom_worker::execute_pred_commoning_chain (chain_p chain,
                                           bitmap tmp_vars)
{
  unsigned i;
  dref a;
  tree var;
  bool in_lhs;

  if (chain->combined)
    {
      /* For combined chains, just remove the statements that are used to
         compute the values of the expression (except for the root one).
         We delay this until after all chains are processed.  */
    }
  else if (chain->type == CT_STORE_STORE)
    {
      if (chain->length > 0)
        {
          if (chain->inv_store_elimination)
            {
              /* If dead stores in this chain only store loop invariant
                 values, we can simply record the invariant value and use
                 it directly after the loop.  */
              initialize_root_vars_store_elim_1 (chain);
            }
          else
            {
              /* If dead stores in this chain store loop variant values,
                 we need to set up the variables by loading from memory
                 before the loop and propagating them with PHI nodes.  */
              initialize_root_vars_store_elim_2 (m_loop, chain, tmp_vars);
            }

          /* For an inter-iteration store elimination chain, stores at each
             distance in the loop's last (chain->length - 1) iterations can't
             be eliminated, because there is no following killing store.
             We need to generate these stores after the loop.  */
          finalize_eliminated_stores (m_loop, chain);
        }

      bool last_store_p = true;
      for (i = chain->refs.length (); i > 0; i--)
        {
          a = chain->refs[i - 1];
          /* Preserve the last store of the chain.  Eliminate other stores
             which are killed by the last one.  */
          if (DR_IS_WRITE (a->ref))
            {
              if (last_store_p)
                last_store_p = false;
              else
                remove_stmt (a->stmt);

              continue;
            }

          /* Any load in a Store-Store chain must be dominated by a previous
             store, so we replace the load reference with the rhs of the
             store.  */
          dref b = get_chain_last_write_before_load (chain, i - 1);
          gcc_assert (b != NULL);
          var = gimple_assign_rhs1 (b->stmt);
          replace_ref_with (a->stmt, var, false, false);
        }
    }
  else
    {
      /* For non-combined chains, set up the variables that hold its value.  */
      initialize_root_vars (m_loop, chain, tmp_vars);
      a = get_chain_root (chain);
      in_lhs = (chain->type == CT_STORE_LOAD
                || chain->type == CT_COMBINATION);
      replace_ref_with (a->stmt, chain->vars[chain->length], true, in_lhs);

      /* Replace the uses of the original references by these variables.  */
      for (i = 1; chain->refs.iterate (i, &a); i++)
        {
          var = chain->vars[chain->length - a->distance];
          replace_ref_with (a->stmt, var, false, false);
        }
    }
}

/* Determines the unroll factor necessary to remove as many temporary variable
   copies as possible.  CHAINS is the list of chains that will be
   optimized.  */

static unsigned
determine_unroll_factor (const vec<chain_p> &chains)
{
  chain_p chain;
  unsigned factor = 1, af, nfactor, i;
  unsigned max = param_max_unroll_times;

  FOR_EACH_VEC_ELT (chains, i, chain)
    {
      if (chain->type == CT_INVARIANT)
        continue;
      /* For now we can't handle unrolling when eliminating stores.  */
      else if (chain->type == CT_STORE_STORE)
        return 1;

      if (chain->combined)
        {
          /* For combined chains, we can't handle unrolling if we replace
             looparound PHIs.  */
          dref a;
          unsigned j;
          for (j = 1; chain->refs.iterate (j, &a); j++)
            if (gimple_code (a->stmt) == GIMPLE_PHI)
              return 1;
          continue;
        }

      /* The best unroll factor for this chain is equal to the number of
         temporary variables that we create for it.  */
      af = chain->length;
      if (chain->has_max_use_after)
        af++;

      nfactor = factor * af / gcd (factor, af);
      if (nfactor <= max)
        factor = nfactor;
    }

  return factor;
}
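/* The factor-combining step above is a running least common multiple, capped
   by the max-unroll-times parameter.  A self-contained sketch (invented
   function name, plain unsigned arithmetic instead of GCC's gcd helper):  */

```cpp
#include <cassert>

// Combine the unroll factor so far with one chain's preferred factor AF
// (chain length, +1 when the last value is used after being defined).
// The result is lcm (factor, af), kept only if it stays within MAX.
static unsigned
combine_unroll_factor (unsigned factor, unsigned af, unsigned max)
{
  unsigned a = factor, b = af;
  while (b)                     // Euclid's algorithm for the gcd
    {
      unsigned t = a % b;
      a = b;
      b = t;
    }
  unsigned nfactor = factor * af / a;   // lcm (factor, af)
  return nfactor <= max ? nfactor : factor;
}
```

For instance, chains of lengths 2 and 3 combine to a factor of 6; a further chain of length 4 would need a factor of 12, which is rejected when the cap is 8, leaving 6.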

/* Perform the predictive commoning optimization for chains.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

void
pcom_worker::execute_pred_commoning (bitmap tmp_vars)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (m_chains, i, chain)
    {
      if (chain->type == CT_INVARIANT)
        execute_load_motion (m_loop, chain, tmp_vars);
      else
        execute_pred_commoning_chain (chain, tmp_vars);
    }

  FOR_EACH_VEC_ELT (m_chains, i, chain)
    {
      if (chain->type == CT_INVARIANT)
        ;
      else if (chain->combined)
        {
          /* For combined chains, just remove the statements that are used to
             compute the values of the expression (except for the root one).  */
          dref a;
          unsigned j;
          for (j = 1; chain->refs.iterate (j, &a); j++)
            remove_stmt (a->stmt);
        }
    }
}

/* For each reference in CHAINS, if its defining statement is a
   phi node, record the ssa name that is defined by it.  */

static void
replace_phis_by_defined_names (vec<chain_p> &chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  FOR_EACH_VEC_ELT (chains, i, chain)
    FOR_EACH_VEC_ELT (chain->refs, j, a)
      {
        if (gimple_code (a->stmt) == GIMPLE_PHI)
          {
            a->name_defined_by_phi = PHI_RESULT (a->stmt);
            a->stmt = NULL;
          }
      }
}

/* For each reference in CHAINS, if name_defined_by_phi is not
   NULL, use it to set the stmt field.  */

static void
replace_names_by_phis (vec<chain_p> chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  FOR_EACH_VEC_ELT (chains, i, chain)
    FOR_EACH_VEC_ELT (chain->refs, j, a)
      if (a->stmt == NULL)
        {
          a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
          gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
          a->name_defined_by_phi = NULL_TREE;
        }
}

/* Wrapper over execute_pred_commoning, to pass it as a callback
   to tree_transform_and_unroll_loop.  */

struct epcc_data
{
  vec<chain_p> chains;
  bitmap tmp_vars;
  pcom_worker *worker;
};

static void
execute_pred_commoning_cbck (class loop *loop ATTRIBUTE_UNUSED, void *data)
{
  struct epcc_data *const dta = (struct epcc_data *) data;
  pcom_worker *worker = dta->worker;

  /* Restore phi nodes that were replaced by ssa names before
     tree_transform_and_unroll_loop (see detailed description in
     tree_predictive_commoning_loop).  */
  replace_names_by_phis (dta->chains);
  worker->execute_pred_commoning (dta->tmp_vars);
}

/* Base NAME and all the names in the chain of phi nodes that use it
   on variable VAR.  The phi nodes are recognized by being in the copies of
   the header of the LOOP.  */

static void
base_names_in_chain_on (class loop *loop, tree name, tree var)
{
  gimple *stmt, *phi;
  imm_use_iterator iter;

  replace_ssa_name_symbol (name, var);

  while (1)
    {
      phi = NULL;
      FOR_EACH_IMM_USE_STMT (stmt, iter, name)
        {
          if (gimple_code (stmt) == GIMPLE_PHI
              && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
            {
              phi = stmt;
              break;
            }
        }
      if (!phi)
        return;

      name = PHI_RESULT (phi);
      replace_ssa_name_symbol (name, var);
    }
}
2541 1.1 mrg
/* Given an unrolled LOOP after predictive commoning, remove the
   register copies arising from phi nodes by changing the base
   variables of SSA names.  TMP_VARS is the set of the temporary
   variables for which we want to perform this.  */

static void
eliminate_temp_copies (class loop *loop, bitmap tmp_vars)
{
  edge e;
  gphi *phi;
  gimple *stmt;
  tree name, use, var;
  gphi_iterator psi;

  e = loop_latch_edge (loop);
  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      phi = psi.phi ();
      name = PHI_RESULT (phi);
      var = SSA_NAME_VAR (name);
      if (!var || !bitmap_bit_p (tmp_vars, DECL_UID (var)))
	continue;
      use = PHI_ARG_DEF_FROM_EDGE (phi, e);
      gcc_assert (TREE_CODE (use) == SSA_NAME);

      /* Base all the ssa names in the ud and du chain of NAME on VAR.  */
      stmt = SSA_NAME_DEF_STMT (use);
      while (gimple_code (stmt) == GIMPLE_PHI
	     /* In case we could not unroll the loop enough to eliminate
		all copies, we may reach the loop header before the defining
		statement (in that case, some register copies will be present
		in loop latch in the final code, corresponding to the newly
		created looparound phi nodes).  */
	     && gimple_bb (stmt) != loop->header)
	{
	  gcc_assert (single_pred_p (gimple_bb (stmt)));
	  use = PHI_ARG_DEF (stmt, 0);
	  stmt = SSA_NAME_DEF_STMT (use);
	}

      base_names_in_chain_on (loop, use, var);
    }
}

/* Returns true if CHAIN is suitable to be combined.  */

static bool
chain_can_be_combined_p (chain_p chain)
{
  return (!chain->combined
	  && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
}

/* Returns the modify statement that uses NAME.  Skips over assignment
   statements; NAME is replaced with the actual name used in the
   returned statement.  */

gimple *
pcom_worker::find_use_stmt (tree *name)
{
  gimple *stmt;
  tree rhs, lhs;

  /* Skip over assignments.  */
  while (1)
    {
      stmt = single_nonlooparound_use (*name);
      if (!stmt)
	return NULL;

      if (gimple_code (stmt) != GIMPLE_ASSIGN)
	return NULL;

      lhs = gimple_assign_lhs (stmt);
      if (TREE_CODE (lhs) != SSA_NAME)
	return NULL;

      if (gimple_assign_copy_p (stmt))
	{
	  rhs = gimple_assign_rhs1 (stmt);
	  if (rhs != *name)
	    return NULL;

	  *name = lhs;
	}
      else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
	       == GIMPLE_BINARY_RHS)
	return stmt;
      else
	return NULL;
    }
}

/* Returns true if we may perform reassociation for operation CODE in TYPE.  */

static bool
may_reassociate_p (tree type, enum tree_code code)
{
  if (FLOAT_TYPE_P (type)
      && !flag_unsafe_math_optimizations)
    return false;

  return (commutative_tree_code (code)
	  && associative_tree_code (code));
}

/* If the operation used in STMT is associative and commutative, walk up the
   tree of the same operations and return its root.  The distance to the root
   is stored in DISTANCE.  */

gimple *
pcom_worker::find_associative_operation_root (gimple *stmt, unsigned *distance)
{
  tree lhs;
  gimple *next;
  enum tree_code code = gimple_assign_rhs_code (stmt);
  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
  unsigned dist = 0;

  if (!may_reassociate_p (type, code))
    return NULL;

  while (1)
    {
      lhs = gimple_assign_lhs (stmt);
      gcc_assert (TREE_CODE (lhs) == SSA_NAME);

      next = find_use_stmt (&lhs);
      if (!next
	  || gimple_assign_rhs_code (next) != code)
	break;

      stmt = next;
      dist++;
    }

  if (distance)
    *distance = dist;
  return stmt;
}

/* Returns the common statement in which NAME1 and NAME2 have a use.  If there
   is no such statement, returns NULL.  In case the operation used on
   NAME1 and NAME2 is associative and commutative, returns the root of the
   tree formed by this operation instead of the statement that uses NAME1 or
   NAME2.  */

gimple *
pcom_worker::find_common_use_stmt (tree *name1, tree *name2)
{
  gimple *stmt1, *stmt2;

  stmt1 = find_use_stmt (name1);
  if (!stmt1)
    return NULL;

  stmt2 = find_use_stmt (name2);
  if (!stmt2)
    return NULL;

  if (stmt1 == stmt2)
    return stmt1;

  stmt1 = find_associative_operation_root (stmt1, NULL);
  if (!stmt1)
    return NULL;
  stmt2 = find_associative_operation_root (stmt2, NULL);
  if (!stmt2)
    return NULL;

  return (stmt1 == stmt2 ? stmt1 : NULL);
}

/* Checks whether R1 and R2 are combined together using CODE, with the result
   in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
   if it is true.  If CODE is ERROR_MARK, set these values instead.  */

bool
pcom_worker::combinable_refs_p (dref r1, dref r2,
				enum tree_code *code, bool *swap,
				tree *rslt_type)
{
  enum tree_code acode;
  bool aswap;
  tree atype;
  tree name1, name2;
  gimple *stmt;

  name1 = name_for_ref (r1);
  name2 = name_for_ref (r2);
  gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);

  stmt = find_common_use_stmt (&name1, &name2);

  if (!stmt
      /* A simple post-dominance check - make sure the combination
	 is executed under the same condition as the references.  */
      || (gimple_bb (stmt) != gimple_bb (r1->stmt)
	  && gimple_bb (stmt) != gimple_bb (r2->stmt)))
    return false;

  acode = gimple_assign_rhs_code (stmt);
  aswap = (!commutative_tree_code (acode)
	   && gimple_assign_rhs1 (stmt) != name1);
  atype = TREE_TYPE (gimple_assign_lhs (stmt));

  if (*code == ERROR_MARK)
    {
      *code = acode;
      *swap = aswap;
      *rslt_type = atype;
      return true;
    }

  return (*code == acode
	  && *swap == aswap
	  && *rslt_type == atype);
}

/* Remove OP from the operation on rhs of STMT, and replace STMT with
   an assignment of the remaining operand.  */

static void
remove_name_from_operation (gimple *stmt, tree op)
{
  tree other_op;
  gimple_stmt_iterator si;

  gcc_assert (is_gimple_assign (stmt));

  if (gimple_assign_rhs1 (stmt) == op)
    other_op = gimple_assign_rhs2 (stmt);
  else
    other_op = gimple_assign_rhs1 (stmt);

  si = gsi_for_stmt (stmt);
  gimple_assign_set_rhs_from_tree (&si, other_op);

  /* We should not have reallocated STMT.  */
  gcc_assert (gsi_stmt (si) == stmt);

  update_stmt (stmt);
}

/* Reassociates the expression in which NAME1 and NAME2 are used so that they
   are combined in a single statement, and returns this statement.  */

gimple *
pcom_worker::reassociate_to_the_same_stmt (tree name1, tree name2)
{
  gimple *stmt1, *stmt2, *root1, *root2, *s1, *s2;
  gassign *new_stmt, *tmp_stmt;
  tree new_name, tmp_name, var, r1, r2;
  unsigned dist1, dist2;
  enum tree_code code;
  tree type = TREE_TYPE (name1);
  gimple_stmt_iterator bsi;

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  root1 = find_associative_operation_root (stmt1, &dist1);
  root2 = find_associative_operation_root (stmt2, &dist2);
  code = gimple_assign_rhs_code (stmt1);

  gcc_assert (root1 && root2 && root1 == root2
	      && code == gimple_assign_rhs_code (stmt2));

  /* Find the root of the nearest expression in which both NAME1 and NAME2
     are used.  */
  r1 = name1;
  s1 = stmt1;
  r2 = name2;
  s2 = stmt2;

  while (dist1 > dist2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      dist1--;
    }
  while (dist2 > dist1)
    {
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
      dist2--;
    }

  while (s1 != s2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
    }

  /* Remove NAME1 and NAME2 from the statements in which they are currently
     used.  */
  remove_name_from_operation (stmt1, name1);
  remove_name_from_operation (stmt2, name2);

  /* Insert the new statement combining NAME1 and NAME2 before S1, and
     combine it with the rhs of S1.  */
  var = create_tmp_reg (type, "predreastmp");
  new_name = make_ssa_name (var);
  new_stmt = gimple_build_assign (new_name, code, name1, name2);

  var = create_tmp_reg (type, "predreastmp");
  tmp_name = make_ssa_name (var);

  /* Rhs of S1 may now be either a binary expression with operation
     CODE, or gimple_val (in case that stmt1 == s1 or stmt2 == s1,
     so that name1 or name2 was removed from it).  */
  tmp_stmt = gimple_build_assign (tmp_name, gimple_assign_rhs_code (s1),
				  gimple_assign_rhs1 (s1),
				  gimple_assign_rhs2 (s1));

  bsi = gsi_for_stmt (s1);
  gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
  s1 = gsi_stmt (bsi);
  update_stmt (s1);

  gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
  gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);

  return new_stmt;
}

/* Returns the statement that combines references R1 and R2.  In case R1
   and R2 are not used in the same statement, but they are used with an
   associative and commutative operation in the same expression, reassociate
   the expression so that they are used in the same statement.  */

gimple *
pcom_worker::stmt_combining_refs (dref r1, dref r2)
{
  gimple *stmt1, *stmt2;
  tree name1 = name_for_ref (r1);
  tree name2 = name_for_ref (r2);

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  if (stmt1 == stmt2)
    return stmt1;

  return reassociate_to_the_same_stmt (name1, name2);
}

/* Tries to combine chains CH1 and CH2 together.  If this succeeds, the
   description of the new chain is returned, otherwise we return NULL.  */

chain_p
pcom_worker::combine_chains (chain_p ch1, chain_p ch2)
{
  dref r1, r2, nw;
  enum tree_code op = ERROR_MARK;
  bool swap = false;
  chain_p new_chain;
  unsigned i;
  tree rslt_type = NULL_TREE;

  if (ch1 == ch2)
    return NULL;
  if (ch1->length != ch2->length)
    return NULL;

  if (ch1->refs.length () != ch2->refs.length ())
    return NULL;

  for (i = 0; (ch1->refs.iterate (i, &r1)
	       && ch2->refs.iterate (i, &r2)); i++)
    {
      if (r1->distance != r2->distance)
	return NULL;

      if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
	return NULL;
    }

  if (swap)
    std::swap (ch1, ch2);

  new_chain = new struct chain (CT_COMBINATION);
  new_chain->op = op;
  new_chain->ch1 = ch1;
  new_chain->ch2 = ch2;
  new_chain->rslt_type = rslt_type;
  new_chain->length = ch1->length;

  for (i = 0; (ch1->refs.iterate (i, &r1)
	       && ch2->refs.iterate (i, &r2)); i++)
    {
      nw = XCNEW (class dref_d);
      nw->stmt = stmt_combining_refs (r1, r2);
      nw->distance = r1->distance;

      new_chain->refs.safe_push (nw);
    }

  ch1->combined = true;
  ch2->combined = true;
  return new_chain;
}

/* Recursively update position information of all offspring chains to ROOT
   chain's position information.  */

static void
update_pos_for_combined_chains (chain_p root)
{
  chain_p ch1 = root->ch1, ch2 = root->ch2;
  dref ref, ref1, ref2;
  for (unsigned j = 0; (root->refs.iterate (j, &ref)
			&& ch1->refs.iterate (j, &ref1)
			&& ch2->refs.iterate (j, &ref2)); ++j)
    ref1->pos = ref2->pos = ref->pos;

  if (ch1->type == CT_COMBINATION)
    update_pos_for_combined_chains (ch1);
  if (ch2->type == CT_COMBINATION)
    update_pos_for_combined_chains (ch2);
}

/* Returns true if statement S1 dominates statement S2.  */

static bool
pcom_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
{
  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);

  if (!bb1 || s1 == s2)
    return true;

  if (bb1 == bb2)
    return gimple_uid (s1) < gimple_uid (s2);

  return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
}

/* Try to combine the chains.  */

void
pcom_worker::try_combine_chains ()
{
  unsigned i, j;
  chain_p ch1, ch2, cch;
  auto_vec<chain_p> worklist;
  bool combined_p = false;

  FOR_EACH_VEC_ELT (m_chains, i, ch1)
    if (chain_can_be_combined_p (ch1))
      worklist.safe_push (ch1);

  while (!worklist.is_empty ())
    {
      ch1 = worklist.pop ();
      if (!chain_can_be_combined_p (ch1))
	continue;

      FOR_EACH_VEC_ELT (m_chains, j, ch2)
	{
	  if (!chain_can_be_combined_p (ch2))
	    continue;

	  cch = combine_chains (ch1, ch2);
	  if (cch)
	    {
	      worklist.safe_push (cch);
	      m_chains.safe_push (cch);
	      combined_p = true;
	      break;
	    }
	}
    }
  if (!combined_p)
    return;

  /* Setup UID for all statements in dominance order.  */
  basic_block *bbs = get_loop_body_in_dom_order (m_loop);
  renumber_gimple_stmt_uids_in_blocks (bbs, m_loop->num_nodes);
  free (bbs);

  /* Re-association in combined chains may generate statements in an order
     different from the order of references in the original chain.  We need
     to keep the references of a combined chain in dominance order so that
     all uses will be inserted after definitions.  Note:
       A) This is necessary for all combined chains.
       B) This is only necessary for ZERO distance references because other
	  references inherit their values from loop carried PHIs.

     We first update position information for all combined chains.  */
  dref ref;
  for (i = 0; m_chains.iterate (i, &ch1); ++i)
    {
      if (ch1->type != CT_COMBINATION || ch1->combined)
	continue;

      for (j = 0; ch1->refs.iterate (j, &ref); ++j)
	ref->pos = gimple_uid (ref->stmt);

      update_pos_for_combined_chains (ch1);
    }
  /* Then sort references according to newly updated position information.  */
  for (i = 0; m_chains.iterate (i, &ch1); ++i)
    {
      if (ch1->type != CT_COMBINATION && !ch1->combined)
	continue;

      /* Find the first reference with non-ZERO distance.  */
      if (ch1->length == 0)
	j = ch1->refs.length ();
      else
	{
	  for (j = 0; ch1->refs.iterate (j, &ref); ++j)
	    if (ref->distance != 0)
	      break;
	}

      /* Sort all ZERO distance references by position.  */
      qsort (&ch1->refs[0], j, sizeof (ch1->refs[0]), order_drefs_by_pos);

      if (ch1->combined)
	continue;

      /* For a ZERO length chain, has_max_use_after must be true since the
	 root combined stmt must dominate the others.  */
      if (ch1->length == 0)
	{
	  ch1->has_max_use_after = true;
	  continue;
	}
      /* Check if there is use at max distance after root for combined chains
	 and set flag accordingly.  */
      ch1->has_max_use_after = false;
      gimple *root_stmt = get_chain_root (ch1)->stmt;
      for (j = 1; ch1->refs.iterate (j, &ref); ++j)
	{
	  if (ref->distance == ch1->length
	      && !pcom_stmt_dominates_stmt_p (ref->stmt, root_stmt))
	    {
	      ch1->has_max_use_after = true;
	      break;
	    }
	}
    }
}

/* Prepare initializers for store elimination CHAIN in LOOP.  Returns false
   if this is impossible because one of these initializers may trap, true
   otherwise.  */

static bool
prepare_initializers_chain_store_elim (class loop *loop, chain_p chain)
{
  unsigned i, n = chain->length;

  /* For now we can't eliminate stores if some of them are conditionally
     executed.  */
  if (!chain->all_always_accessed)
    return false;

  /* Nothing to initialize for intra-iteration store elimination.  */
  if (n == 0 && chain->type == CT_STORE_STORE)
    return true;

  /* For store elimination chain, there is nothing to initialize if stores
     to be eliminated only store loop invariant values into memory.  */
  if (chain->type == CT_STORE_STORE
      && is_inv_store_elimination_chain (loop, chain))
    {
      chain->inv_store_elimination = true;
      return true;
    }

  chain->inits.create (n);
  chain->inits.safe_grow_cleared (n, true);

  /* For a store elimination chain like below:

     for (i = 0; i < len; i++)
       {
	 a[i] = 1;
	 // a[i + 1] = ...
	 a[i + 2] = 3;
       }

     the store to a[i + 1] is missing from the loop body; it acts like a
     bubble.  The content of a[i + 1] remains the same if the loop iterates
     fewer times than chain->length.  We need to set up root variables for
     such stores by loading from memory before the loop.  Note we only need
     to load bubble elements because the loop body is guaranteed to be
     executed at least once after the loop's preheader edge.  */
  auto_vec<bool> bubbles;
  bubbles.safe_grow_cleared (n + 1, true);
  for (i = 0; i < chain->refs.length (); i++)
    bubbles[chain->refs[i]->distance] = true;

  struct data_reference *dr = get_chain_root (chain)->ref;
  for (i = 0; i < n; i++)
    {
      if (bubbles[i])
	continue;

      gimple_seq stmts = NULL;

      tree init = ref_at_iteration (dr, (int) 0 - i, &stmts);
      if (stmts)
	gimple_seq_add_seq_without_update (&chain->init_seq, stmts);

      chain->inits[i] = init;
    }

  return true;
}

/* Prepare initializers for CHAIN.  Returns false if this is impossible
   because one of these initializers may trap, true otherwise.  */

bool
pcom_worker::prepare_initializers_chain (chain_p chain)
{
  unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
  struct data_reference *dr = get_chain_root (chain)->ref;
  tree init;
  dref laref;
  edge entry = loop_preheader_edge (m_loop);

  if (chain->type == CT_STORE_STORE)
    return prepare_initializers_chain_store_elim (m_loop, chain);

  /* Find the initializers for the variables, and check that they cannot
     trap.  */
  chain->inits.create (n);
  for (i = 0; i < n; i++)
    chain->inits.quick_push (NULL_TREE);

  /* If we have replaced some looparound phi nodes, use their initializers
     instead of creating our own.  */
  FOR_EACH_VEC_ELT (chain->refs, i, laref)
    {
      if (gimple_code (laref->stmt) != GIMPLE_PHI)
	continue;

      gcc_assert (laref->distance > 0);
      chain->inits[n - laref->distance]
	= PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry);
    }

  for (i = 0; i < n; i++)
    {
      gimple_seq stmts = NULL;

      if (chain->inits[i] != NULL_TREE)
	continue;

      init = ref_at_iteration (dr, (int) i - n, &stmts);
      if (!chain->all_always_accessed && tree_could_trap_p (init))
	{
	  gimple_seq_discard (stmts);
	  return false;
	}

      if (stmts)
	gimple_seq_add_seq_without_update (&chain->init_seq, stmts);

      chain->inits[i] = init;
    }

  return true;
}

/* Prepare initializers for chains, and free chains that cannot
   be used because the initializers might trap.  */

void
pcom_worker::prepare_initializers ()
{
  chain_p chain;
  unsigned i;

  for (i = 0; i < m_chains.length (); )
    {
      chain = m_chains[i];
      if (prepare_initializers_chain (chain))
	i++;
      else
	{
	  release_chain (chain);
	  m_chains.unordered_remove (i);
	}
    }
}

/* Generates finalizer memory references for CHAIN.  Returns true
   if finalizer code for CHAIN can be generated, otherwise false.  */

bool
pcom_worker::prepare_finalizers_chain (chain_p chain)
{
  unsigned i, n = chain->length;
  struct data_reference *dr = get_chain_root (chain)->ref;
  tree fini, niters = number_of_latch_executions (m_loop);

  /* For now we can't eliminate stores if some of them are conditionally
     executed.  */
  if (!chain->all_always_accessed)
    return false;

  chain->finis.create (n);
  for (i = 0; i < n; i++)
    chain->finis.quick_push (NULL_TREE);

  /* We never use looparound phi node for store elimination chains.  */

  /* Find the finalizers for the variables, and check that they cannot
     trap.  */
  for (i = 0; i < n; i++)
    {
      gimple_seq stmts = NULL;
      gcc_assert (chain->finis[i] == NULL_TREE);

      if (TREE_CODE (niters) != INTEGER_CST && TREE_CODE (niters) != SSA_NAME)
	{
	  niters = unshare_expr (niters);
	  niters = force_gimple_operand (niters, &stmts, true, NULL);
	  if (stmts)
	    {
	      gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);
	      stmts = NULL;
	    }
	}
      fini = ref_at_iteration (dr, (int) 0 - i, &stmts, niters);
      if (stmts)
	gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);

      chain->finis[i] = fini;
    }

  return true;
}

/* Generates finalizer memory reference for chains.  Returns true if
   finalizer code generation for chains breaks loop closed ssa form.  */

bool
pcom_worker::prepare_finalizers ()
{
  chain_p chain;
  unsigned i;
  bool loop_closed_ssa = false;

  for (i = 0; i < m_chains.length ();)
    {
      chain = m_chains[i];

      /* Finalizer is only necessary for inter-iteration store elimination
	 chains.  */
      if (chain->length == 0 || chain->type != CT_STORE_STORE)
	{
	  i++;
	  continue;
	}

      if (prepare_finalizers_chain (chain))
	{
	  i++;
	  /* Be conservative, assume loop closed ssa form is corrupted
	     by store-store chain.  Though it's not always the case if
	     eliminated stores only store loop invariant values into
	     memory.  */
	  loop_closed_ssa = true;
	}
      else
	{
	  release_chain (chain);
	  m_chains.unordered_remove (i);
	}
    }
  return loop_closed_ssa;
}

/* Insert all initializing gimple stmts into LOOP's entry edge.  */

static void
insert_init_seqs (class loop *loop, vec<chain_p> &chains)
{
  unsigned i;
  edge entry = loop_preheader_edge (loop);

  for (i = 0; i < chains.length (); ++i)
    if (chains[i]->init_seq)
      {
	gsi_insert_seq_on_edge_immediate (entry, chains[i]->init_seq);
	chains[i]->init_seq = NULL;
      }
}

/* Performs predictive commoning for LOOP.  Sets bit 1<<1 of return value
   if LOOP was unrolled; sets bit 1<<2 of return value if loop closed ssa
   form was corrupted.  Non-zero return value indicates some changes were
   applied to this loop.  */

unsigned
pcom_worker::tree_predictive_commoning_loop (bool allow_unroll_p)
{
  struct component *components;
  unsigned unroll_factor = 0;
  class tree_niter_desc desc;
  bool unroll = false, loop_closed_ssa = false;

  if (dump_file && (dump_flags & TDF_DETAILS))
    fprintf (dump_file, "Processing loop %d\n", m_loop->num);

  /* Nothing for predictive commoning if loop only iterates 1 time.  */
3354 1.1 mrg if (get_max_loop_iterations_int (m_loop) == 0)
3355 1.1 mrg {
3356 1.1 mrg if (dump_file && (dump_flags & TDF_DETAILS))
3357 1.1 mrg fprintf (dump_file, "Loop iterates only 1 time, nothing to do.\n");
3358 1.1 mrg
3359 1.1 mrg return 0;
3360 1.1 mrg }

  /* Find the data references and split them into components according to their
     dependence relations.  */
  auto_vec<loop_p, 3> loop_nest;
  if (!compute_data_dependences_for_loop (m_loop, true, &loop_nest, &m_datarefs,
					  &m_dependences))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "Cannot analyze data dependencies\n");
      return 0;
    }

  if (dump_file && (dump_flags & TDF_DETAILS))
    dump_data_dependence_relations (dump_file, m_dependences);

  components = split_data_refs_to_components ();

  loop_nest.release ();
  if (!components)
    return 0;

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Initial state:\n\n");
      dump_components (dump_file, components);
    }

  /* Find the suitable components and split them into chains.  */
  components = filter_suitable_components (components);

  auto_bitmap tmp_vars;
  determine_roots (components);
  release_components (components);

  if (!m_chains.exists ())
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
		 "Predictive commoning failed: no suitable chains\n");
      return 0;
    }

  prepare_initializers ();
  loop_closed_ssa = prepare_finalizers ();
  /* Try to combine chains that are always accessed together.  */
  try_combine_chains ();

  insert_init_seqs (m_loop, m_chains);

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Before commoning:\n\n");
      dump_chains (dump_file, m_chains);
    }

  if (allow_unroll_p)
    /* Determine the unroll factor, and if the loop should be unrolled, ensure
       that its number of iterations is divisible by the factor.  */
    unroll_factor = determine_unroll_factor (m_chains);

  if (unroll_factor > 1)
    unroll = can_unroll_loop_p (m_loop, unroll_factor, &desc);

  /* Execute the predictive commoning transformations, and possibly unroll the
     loop.  */
  if (unroll)
    {
      struct epcc_data dta;

      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);

      dta.tmp_vars = tmp_vars;
      dta.chains = m_chains.to_vec_legacy ();
      dta.worker = this;

      /* Cfg manipulations performed in tree_transform_and_unroll_loop before
	 execute_pred_commoning_cbck is called may cause phi nodes to be
	 reallocated, which is a problem since CHAINS may point to these
	 statements.  To fix this, we store the ssa names defined by the
	 phi nodes here instead of the phi nodes themselves, and restore
	 the phi nodes in execute_pred_commoning_cbck.  A bit hacky.  */
      replace_phis_by_defined_names (m_chains);

      tree_transform_and_unroll_loop (m_loop, unroll_factor, &desc,
				      execute_pred_commoning_cbck, &dta);
      eliminate_temp_copies (m_loop, tmp_vars);
    }
  else
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
		 "Executing predictive commoning without unrolling.\n");
      execute_pred_commoning (tmp_vars);
    }

  /* Encode the result: the caller treats any nonzero value as changed,
     values > 1 as unrolled, and values >= 4 as requiring loop-closed
     SSA fixup.  */
  return (unroll ? 2 : 1) | (loop_closed_ssa ? 4 : 1);
}

/* Runs predictive commoning.  */

unsigned
tree_predictive_commoning (bool allow_unroll_p)
{
  unsigned ret = 0, changed = 0;

  initialize_original_copy_tables ();
  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
    if (optimize_loop_for_speed_p (loop))
      {
	pcom_worker w (loop);
	changed |= w.tree_predictive_commoning_loop (allow_unroll_p);
      }
  free_original_copy_tables ();

  if (changed > 0)
    {
      ret = TODO_update_ssa_only_virtuals;

      /* Some loop(s) got unrolled.  */
      if (changed > 1)
	{
	  scev_reset ();

	  /* Need to fix up loop closed SSA.  */
	  if (changed >= 4)
	    rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);

	  ret |= TODO_cleanup_cfg;
	}
    }

  return ret;
}

/* Predictive commoning pass.  */

static unsigned
run_tree_predictive_commoning (struct function *fun, bool allow_unroll_p)
{
  if (number_of_loops (fun) <= 1)
    return 0;

  return tree_predictive_commoning (allow_unroll_p);
}

namespace {

const pass_data pass_data_predcom =
{
  GIMPLE_PASS, /* type */
  "pcom", /* name */
  OPTGROUP_LOOP, /* optinfo_flags */
  TV_PREDCOM, /* tv_id */
  PROP_cfg, /* properties_required */
  0, /* properties_provided */
  0, /* properties_destroyed */
  0, /* todo_flags_start */
  0, /* todo_flags_finish */
};

class pass_predcom : public gimple_opt_pass
{
public:
  pass_predcom (gcc::context *ctxt)
    : gimple_opt_pass (pass_data_predcom, ctxt)
  {}

  /* opt_pass methods: */
  virtual bool
  gate (function *)
  {
    if (flag_predictive_commoning != 0)
      return true;
    /* Loop vectorization enables predictive commoning implicitly
       only if predictive commoning isn't set explicitly, and it
       doesn't allow unrolling.  */
    if (flag_tree_loop_vectorize
	&& !OPTION_SET_P (flag_predictive_commoning))
      return true;

    return false;
  }

  virtual unsigned int
  execute (function *fun)
  {
    bool allow_unroll_p = flag_predictive_commoning != 0;
    return run_tree_predictive_commoning (fun, allow_unroll_p);
  }

}; // class pass_predcom

} // anon namespace

gimple_opt_pass *
make_pass_predcom (gcc::context *ctxt)
{
  return new pass_predcom (ctxt);
}