Lines Matching defs:that
133 fs_inst::fs_inst(const fs_inst &that)
135 memcpy((void*)this, &that, sizeof(that));
137 this->src = new fs_reg[MAX2(that.sources, 3)];
139 for (unsigned i = 0; i < that.sources; i++)
140 this->src[i] = that.src[i];
172 * components starting from that.
175 * and a portion done using fs_reg::offset, which means that if you have
178 * later notice that those loads are all the same and eliminate the
295 * instruction that is its last use. For a single instruction, the
301 * - Virtual opcodes that translate to multiple instructions in the
321 * that one of the instructions will read from a channel corresponding
624 * else that might disrupt timing) by setting smear to 2 and checking if
625 * that field is != 0.
629 /* Check that there weren't any timestamp reset events (assuming these
630 * were the only two timestamp reads that happened).
646 * is 2 cycles. Remove that overhead, so I can forget about that when
709 * things that are unsupported in SIMD16+ mode, so the compiler can skip the
729 * Returns true if the instruction has a flag that means it won't
733 * when a write to a variable screens off any preceding values that were in
1014 /* Return the subset of flag registers that an instruction could
1084 * Note that this is not the 0 or 1 implied writes in an actual gen
1252 * We can use the fact that bit 15 is the MSB of g0.0:W to accomplish
1433 * FINISHME: One day, we could come up with a way to do this that
1638 * This is useful because it means that (a) inputs not used by the
1650 /* We have enough input varyings that the SF/SBE pipeline stage can't
1652 * in an order that matches the output of the previous pipeline stage
1718 * setup regs, now that the location of the constants has been chosen.
1761 * rule implies that elements within a 'Width' cannot cross GRF
1764 * So, for registers that are large enough, we have to split the exec
1801 /* Rewrite all ATTR file references to the hw grf that they land in. */
1854 * but that's really conservative because it's afraid of doing
1855 * splitting that doesn't result in real progress after the rest of
2022 /* We just found an unused register. This means that we are
2115 * The returned alignment is the smallest (in terms of multiplier) such that
2118 * offset parameters are such that no common alignment is possible.
2126 /* Assert that the alignments agree. */
2136 * offset that is aligned to align.
2212 * that things are properly aligned. The offset into that uniform,
2217 * Everything will be properly aligned relative to that one base.
2247 * here, demoting things that are rarely used in the program first.
2356 /* Now that we know how many regular uniforms we'll push, reduce the
2374 * NOTE: Because we are condensing the params[] array, we know that
2592 /* On Gen8+, the OR instruction can have a source modifier that
2707 /* It's possible that the selected component will be too large and
2755 * Optimize sample messages that have constant zero values for the trailing
2758 * that aren't sent default to zero anyway. This will cause the dead code
2759 * eliminator to remove the MOV instruction that would otherwise be emitted to
2868 /* Check that the FB write sources are fully initialized by the single
3092 * The abs ensures that the result is 0UD when g3 is -0.0F.
3152 * things that computed the value of all GRFs of the source region. The
3167 * that writes that reg, but it would require smarter
3202 * values that end up in MRFs are shortly before the MRF
3209 * MRF's source GRF that we wanted to rewrite, that stops us.
3224 * compute-to-MRF before that.
3232 /* Found a SEND instruction, which means that there are
3310 /* The optimization below assumes that channel zero is live on thread
3417 /* Now that we have the uniform assigned, go ahead and force it to a vec4. */
3461 /* Clear out the last-write records for MRFs that were overwritten. */
3538 /* Clear the flag for registers that actually got read (as expected). */
3561 * must ensure that there is no destination hazard for the case of ‘write
3570 * same time that both consider ‘r3’ as the target of their final writes.
3588 * we assume that there are no outstanding dependencies on entry to the
3592 /* If we hit control flow, assume that there *are* outstanding
3604 /* We insert our reads as late as possible on the assumption that any
3605 * instruction but a MOV that might have left us an outstanding
3623 /* Clear the flag for registers that actually got read (as expected). */
3668 /* Clear the flag for registers that actually got read (as expected). */
3725 * Note that execution masking for setting up pull constant loads is special:
3726 * the channels that need to be written are unrelated to the current execution
3834 * that into account now.
3930 * If multiplying by an immediate value that fits in 16-bits, do a
3931 * single MUL instruction with that value in the proper location.
3978 * We avoid the shl instruction by realizing that we only want to add
4089 * that access the accumulator implicitly (e.g. MACH). A
4097 * accumulator register that doesn't exist, but on earlier Gen7
4098 * hardware we need to make sure that the quarter control bits are
4785 * than required, we assume that all bindless sampler states are
5026 /* We assume that the driver provided the handle in the top 20 bits so
5350 /* We assume that the driver provided the handle in the top 20 bits so
5759 * some common regioning and execution control restrictions that apply to FPU
5784 * which is the one that is going to limit the overall execution size of
5823 * up with writes to 4 registers and a source that reads 2 registers
5824 * and we may still need to lower all the way to SIMD8 in that case.
5876 /* From the IVB PRMs (applies to other devices that don't have the
5888 * it's hardwired to use NibCtrl+1, at least on HSW), which means that
5926 * empirical testing with existing CTS tests show that they pass just fine
5928 * is that conversion MOVs between HF and F are still mixed-float
5931 * lift the restriction if we can ensure that it is safe though, since these
5956 * various payload size restrictions that apply to sampler message
5980 /* Calculate the number of coordinate components that have to be present
5981 * assuming that additional arguments follow the texel coordinates in the
6001 /* Calculate the total number of argument components that need to be passed
6070 /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says that
6087 /* The Haswell WaForceSIMD8ForBFIInstruction workaround says that we
6144 * shorter return payload would be to use the SIMD8 sampler message that
6334 * Extract the data that would be consumed by the channel group given by
6384 * the results of multiple lowered instructions in order to make sure that
6422 * the temporary as result. Any copy instructions that are required for
6491 * we're sure that both cases can be handled.
6517 * it off here so that we insert the zip instructions in the right
6521 * instructions will end up in the reverse order that we insert them.
6522 * However, certain render target writes require that the low group
6891 * same order that they appear in the brw_barycentric_mode enum. Each
6975 * Note that the GS reads <URB Read Length> HWords for every vertex - so we
7024 * make sure that optimizations set the execution controls explicitly to
7069 * instruction is encountered, and again when the user of that result is
7183 * "It is required that the second block of GRFs does not overlap with the
7230 * ARF NULL is not allowed. Fix that up by allocating a temporary GRF.
7283 /* We assume that any spilling is worse than just dropping back to
7308 * it inserts dead code that happens to have side effects, and it does
7345 * that we could allocate a larger buffer, and partition it out
7711 * variables so that we catch interpolateAtCentroid() messages too, which
8042 * at the top to select the shader. We've never implemented that.