strlen.S revision 1.1 1 1.1 christos /* $NetBSD: strlen.S,v 1.1 2005/12/20 19:28:50 christos Exp $ */
2 1.1 christos
3 1.1 christos /*-
4 1.1 christos * Copyright (C) 2001 Martin J. Laubach <mjl (at) NetBSD.org>
5 1.1 christos * All rights reserved.
6 1.1 christos *
7 1.1 christos * Redistribution and use in source and binary forms, with or without
8 1.1 christos * modification, are permitted provided that the following conditions
9 1.1 christos * are met:
10 1.1 christos * 1. Redistributions of source code must retain the above copyright
11 1.1 christos * notice, this list of conditions and the following disclaimer.
12 1.1 christos * 2. Redistributions in binary form must reproduce the above copyright
13 1.1 christos * notice, this list of conditions and the following disclaimer in the
14 1.1 christos * documentation and/or other materials provided with the distribution.
15 1.1 christos * 3. The name of the author may not be used to endorse or promote products
16 1.1 christos * derived from this software without specific prior written permission.
17 1.1 christos *
18 1.1 christos * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
19 1.1 christos * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
20 1.1 christos * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
21 1.1 christos * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
22 1.1 christos * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
23 1.1 christos * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
24 1.1 christos * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
25 1.1 christos * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
26 1.1 christos * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
27 1.1 christos * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
28 1.1 christos */
29 1.1 christos /*----------------------------------------------------------------------*/
30 1.1 christos
31 1.1 christos #include <machine/asm.h>
32 1.1 christos
33 1.1 christos /*----------------------------------------------------------------------*/
34 1.1 christos /* The algorithm here uses the following techniques:
35 1.1 christos
36 1.1 christos 1) Given a word 'x', we can test to see if it contains any 0 bytes
37 1.1 christos by subtracting 0x01010101, and seeing if any of the high bits of each
38 1.1 christos byte changed from 0 to 1. This works because the least significant
39 1.1 christos 0 byte must have had no incoming carry (otherwise it's not the least
40 1.1 christos significant), so it is 0x00 - 0x01 == 0xff. For all other
41 1.1 christos byte values, either they have the high bit set initially, or when
42 1.1 christos 1 is subtracted you get a value in the range 0x00-0x7f, none of which
43 1.1 christos have their high bit set. The expression here is
44 1.1 christos (x + 0xfefefeff) & ~(x | 0x7f7f7f7f), which gives 0x00000000 when
45 1.1 christos there were no 0x00 bytes in the word.
46 1.1 christos
47 1.1 christos 2) Given a word 'x', we can test to see _which_ byte was zero by
48 1.1 christos calculating ~(((x & 0x7f7f7f7f) + 0x7f7f7f7f) | x | 0x7f7f7f7f).
49 1.1 christos This produces 0x80 in each byte that was zero, and 0x00 in all
50 1.1 christos the other bytes. The '| 0x7f7f7f7f' clears the low 7 bits in each
51 1.1 christos byte, and the '| x' part ensures that bytes with the high bit set
52 1.1 christos produce 0x00. The addition will carry into the high bit of each byte
53 1.1 christos iff that byte had one of its low 7 bits set. We can then just see
54 1.1 christos which was the most significant bit set and divide by 8 to find how
55 1.1 christos many to add to the index.
56 1.1 christos This is from the book 'The PowerPC Compiler Writer's Guide',
57 1.1 christos by Steve Hoxey, Faraydon Karim, Bill Hay and Hank Warren.
58 1.1 christos */
59 1.1 christos /*----------------------------------------------------------------------*/
60 1.1 christos
61 1.1 christos .text
62 1.1 christos .align 4
63 1.1 christos
64 1.1 christos ENTRY(strlen)
65 1.1 christos
66 1.1 christos /* Setup constants */
67 1.1 christos lis %r10, 0x7f7f
68 1.1 christos lis %r9, 0xfefe
69 1.1 christos ori %r10, %r10, 0x7f7f
70 1.1 christos ori %r9, %r9, 0xfeff
71 1.1 christos
72 1.1 christos /* Mask out leading bytes on non aligned strings */
73 1.1 christos rlwinm. %r8, %r3, 3, 27, 28 /* leading bits to mask */
74 1.1 christos clrrwi %r5, %r3, 2 /* clear low 2 addr bits */
75 1.1 christos li %r0, -1
76 1.1 christos beq+ 3f /* skip alignment if already */
77 1.1 christos /* aligned */
78 1.1 christos
79 1.1 christos srw %r0, %r0, %r8 /* make 0000...1111 mask */
80 1.1 christos
81 1.1 christos lwz %r7, 0(%r5)
82 1.1 christos nor %r0, %r0, %r0 /* invert mask */
83 1.1 christos or %r7, %r7, %r0 /* make leading bytes != 0 */
84 1.1 christos b 2f
85 1.1 christos
86 1.1 christos 3: subi %r5, %r5, 4
87 1.1 christos
88 1.1 christos 1: lwzu %r7, 4(%r5) /* fetch data word */
89 1.1 christos
90 1.1 christos 2: nor %r0, %r7, %r10 /* do step 1 */
91 1.1 christos add %r6, %r7, %r9
92 1.1 christos and. %r0, %r0, %r6
93 1.1 christos
94 1.1 christos beq+ 1b /* no NUL bytes here */
95 1.1 christos
96 1.1 christos and %r8, %r7, %r10 /* ok, a NUL is somewhere */
97 1.1 christos or %r7, %r7, %r10 /* do step 2 to find out */
98 1.1 christos add %r0, %r8, %r10 /* where */
99 1.1 christos nor %r8, %r7, %r0
100 1.1 christos
101 1.1 christos cntlzw %r0, %r8 /* offset from this word */
102 1.1 christos srwi %r4, %r0, 3
103 1.1 christos
104 1.1 christos add %r4, %r5, %r4 /* r4 contains end pointer */
105 1.1 christos /* NOTE: Keep it so this function returns the end pointer
106 1.1 christos in r4, so we can it use from other str* calls (strcat
107 1.1 christos comes to mind */
108 1.1 christos
109 1.1 christos subf %r3, %r3, %r4
110 1.1 christos blr
111 1.1 christos
112 1.1 christos /*----------------------------------------------------------------------*/
113