1.3 |
| 14-Oct-2003 |
scw | Xscale-optimised mem* routines. Contributed by Wasabi Systems. (Note: memcmp/memset improvements also benefit non-Xscale).
memcmp() - Compare 32 bits at a time if possible. Special-case 6-byte comparisons, for the benefit of the network stack.
memset()/bzero() - More loop unrolling, plus use of 'strd' instruction, results in > 100% speedup on Xscale.
memcpy() - Big-endian support, unrolled loops, 'strd/pld', plus special-cases for very common length/alignment combinations. Benchmarks show ~50% improvement on Xscale.
memmove()/bcopy() - Big-endian support. Use fast memcpy(), above, if the regions don't overlap. Otherwise unchanged.
XXX: The Xscale optimisations are not enabled by default, unless /etc/mk.conf has the right compiler options. The intention is to pull them in via something like libxscale.so, selected at runtime by ld.so.conf. (Big-endian support is not affected by this).
|