ssin.sa revision 1.1
11.1Smycroft* MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP 21.1Smycroft* M68000 Hi-Performance Microprocessor Division 31.1Smycroft* M68040 Software Package 41.1Smycroft* 51.1Smycroft* M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc. 61.1Smycroft* All rights reserved. 71.1Smycroft* 81.1Smycroft* THE SOFTWARE is provided on an "AS IS" basis and without warranty. 91.1Smycroft* To the maximum extent permitted by applicable law, 101.1Smycroft* MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED, 111.1Smycroft* INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A 121.1Smycroft* PARTICULAR PURPOSE and any warranty against infringement with 131.1Smycroft* regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF) 141.1Smycroft* and any accompanying written materials. 151.1Smycroft* 161.1Smycroft* To the maximum extent permitted by applicable law, 171.1Smycroft* IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER 181.1Smycroft* (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS 191.1Smycroft* PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR 201.1Smycroft* OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE 211.1Smycroft* SOFTWARE. Motorola assumes no responsibility for the maintenance 221.1Smycroft* and support of the SOFTWARE. 231.1Smycroft* 241.1Smycroft* You are hereby granted a copyright license to use, modify, and 251.1Smycroft* distribute the SOFTWARE so long as this entire notice is retained 261.1Smycroft* without alteration in any modified and/or redistributed versions, 271.1Smycroft* and that such modified versions are clearly identified as such. 281.1Smycroft* No licenses are granted by implication, estoppel or otherwise 291.1Smycroft* under any patents or trademarks of Motorola, Inc. 301.1Smycroft 311.1Smycroft* 321.1Smycroft* ssin.sa 3.3 7/29/91 331.1Smycroft* 341.1Smycroft* The entry point sSIN computes the sine of an input argument 351.1Smycroft* sCOS computes the cosine, and sSINCOS computes both. The 361.1Smycroft* corresponding entry points with a "d" computes the same 371.1Smycroft* corresponding function values for denormalized inputs. 381.1Smycroft* 391.1Smycroft* Input: Double-extended number X in location pointed to 401.1Smycroft* by address register a0. 411.1Smycroft* 421.1Smycroft* Output: The funtion value sin(X) or cos(X) returned in Fp0 if SIN or 431.1Smycroft* COS is requested. Otherwise, for SINCOS, sin(X) is returned 441.1Smycroft* in Fp0, and cos(X) is returned in Fp1. 451.1Smycroft* 461.1Smycroft* Modifies: Fp0 for SIN or COS; both Fp0 and Fp1 for SINCOS. 471.1Smycroft* 481.1Smycroft* Accuracy and Monotonicity: The returned result is within 1 ulp in 491.1Smycroft* 64 significant bit, i.e. within 0.5001 ulp to 53 bits if the 501.1Smycroft* result is subsequently rounded to double precision. The 511.1Smycroft* result is provably monotonic in double precision. 521.1Smycroft* 531.1Smycroft* Speed: The programs sSIN and sCOS take approximately 150 cycles for 541.1Smycroft* input argument X such that |X| < 15Pi, which is the the usual 551.1Smycroft* situation. The speed for sSINCOS is approximately 190 cycles. 561.1Smycroft* 571.1Smycroft* Algorithm: 581.1Smycroft* 591.1Smycroft* SIN and COS: 601.1Smycroft* 1. If SIN is invoked, set AdjN := 0; otherwise, set AdjN := 1. 611.1Smycroft* 621.1Smycroft* 2. If |X| >= 15Pi or |X| < 2**(-40), go to 7. 631.1Smycroft* 641.1Smycroft* 3. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let 651.1Smycroft* k = N mod 4, so in particular, k = 0,1,2,or 3. Overwirte 661.1Smycroft* k by k := k + AdjN. 671.1Smycroft* 681.1Smycroft* 4. If k is even, go to 6. 691.1Smycroft* 701.1Smycroft* 5. (k is odd) Set j := (k-1)/2, sgn := (-1)**j. Return sgn*cos(r) 711.1Smycroft* where cos(r) is approximated by an even polynomial in r, 721.1Smycroft* 1 + r*r*(B1+s*(B2+ ... + s*B8)), s = r*r. 731.1Smycroft* Exit. 741.1Smycroft* 751.1Smycroft* 6. (k is even) Set j := k/2, sgn := (-1)**j. Return sgn*sin(r) 761.1Smycroft* where sin(r) is approximated by an odd polynomial in r 771.1Smycroft* r + r*s*(A1+s*(A2+ ... + s*A7)), s = r*r. 781.1Smycroft* Exit. 791.1Smycroft* 801.1Smycroft* 7. If |X| > 1, go to 9. 811.1Smycroft* 821.1Smycroft* 8. (|X|<2**(-40)) If SIN is invoked, return X; otherwise return 1. 831.1Smycroft* 841.1Smycroft* 9. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 3. 851.1Smycroft* 861.1Smycroft* SINCOS: 871.1Smycroft* 1. If |X| >= 15Pi or |X| < 2**(-40), go to 6. 881.1Smycroft* 891.1Smycroft* 2. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let 901.1Smycroft* k = N mod 4, so in particular, k = 0,1,2,or 3. 911.1Smycroft* 921.1Smycroft* 3. If k is even, go to 5. 931.1Smycroft* 941.1Smycroft* 4. (k is odd) Set j1 := (k-1)/2, j2 := j1 (EOR) (k mod 2), i.e. 951.1Smycroft* j1 exclusive or with the l.s.b. of k. 961.1Smycroft* sgn1 := (-1)**j1, sgn2 := (-1)**j2. 971.1Smycroft* SIN(X) = sgn1 * cos(r) and COS(X) = sgn2*sin(r) where 981.1Smycroft* sin(r) and cos(r) are computed as odd and even polynomials 991.1Smycroft* in r, respectively. Exit 1001.1Smycroft* 1011.1Smycroft* 5. (k is even) Set j1 := k/2, sgn1 := (-1)**j1. 1021.1Smycroft* SIN(X) = sgn1 * sin(r) and COS(X) = sgn1*cos(r) where 1031.1Smycroft* sin(r) and cos(r) are computed as odd and even polynomials 1041.1Smycroft* in r, respectively. Exit 1051.1Smycroft* 1061.1Smycroft* 6. If |X| > 1, go to 8. 1071.1Smycroft* 1081.1Smycroft* 7. (|X|<2**(-40)) SIN(X) = X and COS(X) = 1. Exit. 1091.1Smycroft* 1101.1Smycroft* 8. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 2. 1111.1Smycroft* 1121.1Smycroft 1131.1SmycroftSSIN IDNT 2,1 Motorola 040 Floating Point Software Package 1141.1Smycroft 1151.1Smycroft section 8 1161.1Smycroft 1171.1Smycroft include fpsp.h 1181.1Smycroft 1191.1SmycroftBOUNDS1 DC.L $3FD78000,$4004BC7E 1201.1SmycroftTWOBYPI DC.L $3FE45F30,$6DC9C883 1211.1Smycroft 1221.1SmycroftSINA7 DC.L $BD6AAA77,$CCC994F5 1231.1SmycroftSINA6 DC.L $3DE61209,$7AAE8DA1 1241.1Smycroft 1251.1SmycroftSINA5 DC.L $BE5AE645,$2A118AE4 1261.1SmycroftSINA4 DC.L $3EC71DE3,$A5341531 1271.1Smycroft 1281.1SmycroftSINA3 DC.L $BF2A01A0,$1A018B59,$00000000,$00000000 1291.1Smycroft 1301.1SmycroftSINA2 DC.L $3FF80000,$88888888,$888859AF,$00000000 1311.1Smycroft 1321.1SmycroftSINA1 DC.L $BFFC0000,$AAAAAAAA,$AAAAAA99,$00000000 1331.1Smycroft 1341.1SmycroftCOSB8 DC.L $3D2AC4D0,$D6011EE3 1351.1SmycroftCOSB7 DC.L $BDA9396F,$9F45AC19 1361.1Smycroft 1371.1SmycroftCOSB6 DC.L $3E21EED9,$0612C972 1381.1SmycroftCOSB5 DC.L $BE927E4F,$B79D9FCF 1391.1Smycroft 1401.1SmycroftCOSB4 DC.L $3EFA01A0,$1A01D423,$00000000,$00000000 1411.1Smycroft 1421.1SmycroftCOSB3 DC.L $BFF50000,$B60B60B6,$0B61D438,$00000000 1431.1Smycroft 1441.1SmycroftCOSB2 DC.L $3FFA0000,$AAAAAAAA,$AAAAAB5E 1451.1SmycroftCOSB1 DC.L $BF000000 1461.1Smycroft 1471.1SmycroftINVTWOPI DC.L $3FFC0000,$A2F9836E,$4E44152A 1481.1Smycroft 1491.1SmycroftTWOPI1 DC.L $40010000,$C90FDAA2,$00000000,$00000000 1501.1SmycroftTWOPI2 DC.L $3FDF0000,$85A308D4,$00000000,$00000000 1511.1Smycroft 1521.1Smycroft xref PITBL 1531.1Smycroft 1541.1SmycroftINARG equ FP_SCR4 1551.1Smycroft 1561.1SmycroftX equ FP_SCR5 1571.1SmycroftXDCARE equ X+2 1581.1SmycroftXFRAC equ X+4 1591.1Smycroft 1601.1SmycroftRPRIME equ FP_SCR1 1611.1SmycroftSPRIME equ FP_SCR2 1621.1Smycroft 1631.1SmycroftPOSNEG1 equ L_SCR1 1641.1SmycroftTWOTO63 equ L_SCR1 1651.1Smycroft 1661.1SmycroftENDFLAG equ L_SCR2 1671.1SmycroftN equ L_SCR2 1681.1Smycroft 1691.1SmycroftADJN equ L_SCR3 1701.1Smycroft 1711.1Smycroft xref t_frcinx 1721.1Smycroft xref t_extdnrm 1731.1Smycroft xref sto_cos 1741.1Smycroft 1751.1Smycroft xdef ssind 1761.1Smycroftssind: 1771.1Smycroft*--SIN(X) = X FOR DENORMALIZED X 1781.1Smycroft bra t_extdnrm 1791.1Smycroft 1801.1Smycroft xdef scosd 1811.1Smycroftscosd: 1821.1Smycroft*--COS(X) = 1 FOR DENORMALIZED X 1831.1Smycroft 1841.1Smycroft FMOVE.S #:3F800000,FP0 1851.1Smycroft* 1861.1Smycroft* 9D25B Fix: Sometimes the previous fmove.s sets fpsr bits 1871.1Smycroft* 1881.1Smycroft fmove.l #0,fpsr 1891.1Smycroft* 1901.1Smycroft bra t_frcinx 1911.1Smycroft 1921.1Smycroft xdef ssin 1931.1Smycroftssin: 1941.1Smycroft*--SET ADJN TO 0 1951.1Smycroft MOVE.L #0,ADJN(a6) 1961.1Smycroft BRA.B SINBGN 1971.1Smycroft 1981.1Smycroft xdef scos 1991.1Smycroftscos: 2001.1Smycroft*--SET ADJN TO 1 2011.1Smycroft MOVE.L #1,ADJN(a6) 2021.1Smycroft 2031.1SmycroftSINBGN: 2041.1Smycroft*--SAVE FPCR, FP1. CHECK IF |X| IS TOO SMALL OR LARGE 2051.1Smycroft 2061.1Smycroft FMOVE.X (a0),FP0 ...LOAD INPUT 2071.1Smycroft 2081.1Smycroft MOVE.L (A0),D0 2091.1Smycroft MOVE.W 4(A0),D0 2101.1Smycroft FMOVE.X FP0,X(a6) 2111.1Smycroft ANDI.L #$7FFFFFFF,D0 ...COMPACTIFY X 2121.1Smycroft 2131.1Smycroft CMPI.L #$3FD78000,D0 ...|X| >= 2**(-40)? 2141.1Smycroft BGE.B SOK1 2151.1Smycroft BRA.W SINSM 2161.1Smycroft 2171.1SmycroftSOK1: 2181.1Smycroft CMPI.L #$4004BC7E,D0 ...|X| < 15 PI? 2191.1Smycroft BLT.B SINMAIN 2201.1Smycroft BRA.W REDUCEX 2211.1Smycroft 2221.1SmycroftSINMAIN: 2231.1Smycroft*--THIS IS THE USUAL CASE, |X| <= 15 PI. 2241.1Smycroft*--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP. 2251.1Smycroft FMOVE.X FP0,FP1 2261.1Smycroft FMUL.D TWOBYPI,FP1 ...X*2/PI 2271.1Smycroft 2281.1Smycroft*--HIDE THE NEXT THREE INSTRUCTIONS 2291.1Smycroft LEA PITBL+$200,A1 ...TABLE OF N*PI/2, N = -32,...,32 2301.1Smycroft 2311.1Smycroft 2321.1Smycroft*--FP1 IS NOW READY 2331.1Smycroft FMOVE.L FP1,N(a6) ...CONVERT TO INTEGER 2341.1Smycroft 2351.1Smycroft MOVE.L N(a6),D0 2361.1Smycroft ASL.L #4,D0 2371.1Smycroft ADDA.L D0,A1 ...A1 IS THE ADDRESS OF N*PIBY2 2381.1Smycroft* ...WHICH IS IN TWO PIECES Y1 & Y2 2391.1Smycroft 2401.1Smycroft FSUB.X (A1)+,FP0 ...X-Y1 2411.1Smycroft*--HIDE THE NEXT ONE 2421.1Smycroft FSUB.S (A1),FP0 ...FP0 IS R = (X-Y1)-Y2 2431.1Smycroft 2441.1SmycroftSINCONT: 2451.1Smycroft*--continuation from REDUCEX 2461.1Smycroft 2471.1Smycroft*--GET N+ADJN AND SEE IF SIN(R) OR COS(R) IS NEEDED 2481.1Smycroft MOVE.L N(a6),D0 2491.1Smycroft ADD.L ADJN(a6),D0 ...SEE IF D0 IS ODD OR EVEN 2501.1Smycroft ROR.L #1,D0 ...D0 WAS ODD IFF D0 IS NEGATIVE 2511.1Smycroft CMPI.L #0,D0 2521.1Smycroft BLT.W COSPOLY 2531.1Smycroft 2541.1SmycroftSINPOLY: 2551.1Smycroft*--LET J BE THE LEAST SIG. BIT OF D0, LET SGN := (-1)**J. 2561.1Smycroft*--THEN WE RETURN SGN*SIN(R). SGN*SIN(R) IS COMPUTED BY 2571.1Smycroft*--R' + R'*S*(A1 + S(A2 + S(A3 + S(A4 + ... + SA7)))), WHERE 2581.1Smycroft*--R' = SGN*R, S=R*R. THIS CAN BE REWRITTEN AS 2591.1Smycroft*--R' + R'*S*( [A1+T(A3+T(A5+TA7))] + [S(A2+T(A4+TA6))]) 2601.1Smycroft*--WHERE T=S*S. 2611.1Smycroft*--NOTE THAT A3 THROUGH A7 ARE STORED IN DOUBLE PRECISION 2621.1Smycroft*--WHILE A1 AND A2 ARE IN DOUBLE-EXTENDED FORMAT. 2631.1Smycroft FMOVE.X FP0,X(a6) ...X IS R 2641.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S 2651.1Smycroft*---HIDE THE NEXT TWO WHILE WAITING FOR FP0 2661.1Smycroft FMOVE.D SINA7,FP3 2671.1Smycroft FMOVE.D SINA6,FP2 2681.1Smycroft*--FP0 IS NOW READY 2691.1Smycroft FMOVE.X FP0,FP1 2701.1Smycroft FMUL.X FP1,FP1 ...FP1 IS T 2711.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR FP1 2721.1Smycroft 2731.1Smycroft ROR.L #1,D0 2741.1Smycroft ANDI.L #$80000000,D0 2751.1Smycroft* ...LEAST SIG. BIT OF D0 IN SIGN POSITION 2761.1Smycroft EOR.L D0,X(a6) ...X IS NOW R'= SGN*R 2771.1Smycroft 2781.1Smycroft FMUL.X FP1,FP3 ...TA7 2791.1Smycroft FMUL.X FP1,FP2 ...TA6 2801.1Smycroft 2811.1Smycroft FADD.D SINA5,FP3 ...A5+TA7 2821.1Smycroft FADD.D SINA4,FP2 ...A4+TA6 2831.1Smycroft 2841.1Smycroft FMUL.X FP1,FP3 ...T(A5+TA7) 2851.1Smycroft FMUL.X FP1,FP2 ...T(A4+TA6) 2861.1Smycroft 2871.1Smycroft FADD.D SINA3,FP3 ...A3+T(A5+TA7) 2881.1Smycroft FADD.X SINA2,FP2 ...A2+T(A4+TA6) 2891.1Smycroft 2901.1Smycroft FMUL.X FP3,FP1 ...T(A3+T(A5+TA7)) 2911.1Smycroft 2921.1Smycroft FMUL.X FP0,FP2 ...S(A2+T(A4+TA6)) 2931.1Smycroft FADD.X SINA1,FP1 ...A1+T(A3+T(A5+TA7)) 2941.1Smycroft FMUL.X X(a6),FP0 ...R'*S 2951.1Smycroft 2961.1Smycroft FADD.X FP2,FP1 ...[A1+T(A3+T(A5+TA7))]+[S(A2+T(A4+TA6))] 2971.1Smycroft*--FP3 RELEASED, RESTORE NOW AND TAKE SOME ADVANTAGE OF HIDING 2981.1Smycroft*--FP2 RELEASED, RESTORE NOW AND TAKE FULL ADVANTAGE OF HIDING 2991.1Smycroft 3001.1Smycroft 3011.1Smycroft FMUL.X FP1,FP0 ...SIN(R')-R' 3021.1Smycroft*--FP1 RELEASED. 3031.1Smycroft 3041.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3051.1Smycroft FADD.X X(a6),FP0 ;last inst - possible exception set 3061.1Smycroft bra t_frcinx 3071.1Smycroft 3081.1Smycroft 3091.1SmycroftCOSPOLY: 3101.1Smycroft*--LET J BE THE LEAST SIG. BIT OF D0, LET SGN := (-1)**J. 3111.1Smycroft*--THEN WE RETURN SGN*COS(R). SGN*COS(R) IS COMPUTED BY 3121.1Smycroft*--SGN + S'*(B1 + S(B2 + S(B3 + S(B4 + ... + SB8)))), WHERE 3131.1Smycroft*--S=R*R AND S'=SGN*S. THIS CAN BE REWRITTEN AS 3141.1Smycroft*--SGN + S'*([B1+T(B3+T(B5+TB7))] + [S(B2+T(B4+T(B6+TB8)))]) 3151.1Smycroft*--WHERE T=S*S. 3161.1Smycroft*--NOTE THAT B4 THROUGH B8 ARE STORED IN DOUBLE PRECISION 3171.1Smycroft*--WHILE B2 AND B3 ARE IN DOUBLE-EXTENDED FORMAT, B1 IS -1/2 3181.1Smycroft*--AND IS THEREFORE STORED AS SINGLE PRECISION. 3191.1Smycroft 3201.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S 3211.1Smycroft*---HIDE THE NEXT TWO WHILE WAITING FOR FP0 3221.1Smycroft FMOVE.D COSB8,FP2 3231.1Smycroft FMOVE.D COSB7,FP3 3241.1Smycroft*--FP0 IS NOW READY 3251.1Smycroft FMOVE.X FP0,FP1 3261.1Smycroft FMUL.X FP1,FP1 ...FP1 IS T 3271.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR FP1 3281.1Smycroft FMOVE.X FP0,X(a6) ...X IS S 3291.1Smycroft ROR.L #1,D0 3301.1Smycroft ANDI.L #$80000000,D0 3311.1Smycroft* ...LEAST SIG. BIT OF D0 IN SIGN POSITION 3321.1Smycroft 3331.1Smycroft FMUL.X FP1,FP2 ...TB8 3341.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR THE XU 3351.1Smycroft EOR.L D0,X(a6) ...X IS NOW S'= SGN*S 3361.1Smycroft ANDI.L #$80000000,D0 3371.1Smycroft 3381.1Smycroft FMUL.X FP1,FP3 ...TB7 3391.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR THE XU 3401.1Smycroft ORI.L #$3F800000,D0 ...D0 IS SGN IN SINGLE 3411.1Smycroft MOVE.L D0,POSNEG1(a6) 3421.1Smycroft 3431.1Smycroft FADD.D COSB6,FP2 ...B6+TB8 3441.1Smycroft FADD.D COSB5,FP3 ...B5+TB7 3451.1Smycroft 3461.1Smycroft FMUL.X FP1,FP2 ...T(B6+TB8) 3471.1Smycroft FMUL.X FP1,FP3 ...T(B5+TB7) 3481.1Smycroft 3491.1Smycroft FADD.D COSB4,FP2 ...B4+T(B6+TB8) 3501.1Smycroft FADD.X COSB3,FP3 ...B3+T(B5+TB7) 3511.1Smycroft 3521.1Smycroft FMUL.X FP1,FP2 ...T(B4+T(B6+TB8)) 3531.1Smycroft FMUL.X FP3,FP1 ...T(B3+T(B5+TB7)) 3541.1Smycroft 3551.1Smycroft FADD.X COSB2,FP2 ...B2+T(B4+T(B6+TB8)) 3561.1Smycroft FADD.S COSB1,FP1 ...B1+T(B3+T(B5+TB7)) 3571.1Smycroft 3581.1Smycroft FMUL.X FP2,FP0 ...S(B2+T(B4+T(B6+TB8))) 3591.1Smycroft*--FP3 RELEASED, RESTORE NOW AND TAKE SOME ADVANTAGE OF HIDING 3601.1Smycroft*--FP2 RELEASED. 3611.1Smycroft 3621.1Smycroft 3631.1Smycroft FADD.X FP1,FP0 3641.1Smycroft*--FP1 RELEASED 3651.1Smycroft 3661.1Smycroft FMUL.X X(a6),FP0 3671.1Smycroft 3681.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3691.1Smycroft FADD.S POSNEG1(a6),FP0 ;last inst - possible exception set 3701.1Smycroft bra t_frcinx 3711.1Smycroft 3721.1Smycroft 3731.1SmycroftSINBORS: 3741.1Smycroft*--IF |X| > 15PI, WE USE THE GENERAL ARGUMENT REDUCTION. 3751.1Smycroft*--IF |X| < 2**(-40), RETURN X OR 1. 3761.1Smycroft CMPI.L #$3FFF8000,D0 3771.1Smycroft BGT.B REDUCEX 3781.1Smycroft 3791.1Smycroft 3801.1SmycroftSINSM: 3811.1Smycroft MOVE.L ADJN(a6),D0 3821.1Smycroft CMPI.L #0,D0 3831.1Smycroft BGT.B COSTINY 3841.1Smycroft 3851.1SmycroftSINTINY: 3861.1Smycroft MOVE.W #$0000,XDCARE(a6) ...JUST IN CASE 3871.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3881.1Smycroft FMOVE.X X(a6),FP0 ;last inst - possible exception set 3891.1Smycroft bra t_frcinx 3901.1Smycroft 3911.1Smycroft 3921.1SmycroftCOSTINY: 3931.1Smycroft FMOVE.S #:3F800000,FP0 3941.1Smycroft 3951.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3961.1Smycroft FSUB.S #:00800000,FP0 ;last inst - possible exception set 3971.1Smycroft bra t_frcinx 3981.1Smycroft 3991.1Smycroft 4001.1SmycroftREDUCEX: 4011.1Smycroft*--WHEN REDUCEX IS USED, THE CODE WILL INEVITABLY BE SLOW. 4021.1Smycroft*--THIS REDUCTION METHOD, HOWEVER, IS MUCH FASTER THAN USING 4031.1Smycroft*--THE REMAINDER INSTRUCTION WHICH IS NOW IN SOFTWARE. 4041.1Smycroft 4051.1Smycroft FMOVEM.X FP2-FP5,-(A7) ...save FP2 through FP5 4061.1Smycroft MOVE.L D2,-(A7) 4071.1Smycroft FMOVE.S #:00000000,FP1 4081.1Smycroft*--If compact form of abs(arg) in d0=$7ffeffff, argument is so large that 4091.1Smycroft*--there is a danger of unwanted overflow in first LOOP iteration. In this 4101.1Smycroft*--case, reduce argument by one remainder step to make subsequent reduction 4111.1Smycroft*--safe. 4121.1Smycroft cmpi.l #$7ffeffff,d0 ;is argument dangerously large? 4131.1Smycroft bne.b LOOP 4141.1Smycroft move.l #$7ffe0000,FP_SCR2(a6) ;yes 4151.1Smycroft* ;create 2**16383*PI/2 4161.1Smycroft move.l #$c90fdaa2,FP_SCR2+4(a6) 4171.1Smycroft clr.l FP_SCR2+8(a6) 4181.1Smycroft ftst.x fp0 ;test sign of argument 4191.1Smycroft move.l #$7fdc0000,FP_SCR3(a6) ;create low half of 2**16383* 4201.1Smycroft* ;PI/2 at FP_SCR3 4211.1Smycroft move.l #$85a308d3,FP_SCR3+4(a6) 4221.1Smycroft clr.l FP_SCR3+8(a6) 4231.1Smycroft fblt.w red_neg 4241.1Smycroft or.w #$8000,FP_SCR2(a6) ;positive arg 4251.1Smycroft or.w #$8000,FP_SCR3(a6) 4261.1Smycroftred_neg: 4271.1Smycroft fadd.x FP_SCR2(a6),fp0 ;high part of reduction is exact 4281.1Smycroft fmove.x fp0,fp1 ;save high result in fp1 4291.1Smycroft fadd.x FP_SCR3(a6),fp0 ;low part of reduction 4301.1Smycroft fsub.x fp0,fp1 ;determine low component of result 4311.1Smycroft fadd.x FP_SCR3(a6),fp1 ;fp0/fp1 are reduced argument. 4321.1Smycroft 4331.1Smycroft*--ON ENTRY, FP0 IS X, ON RETURN, FP0 IS X REM PI/2, |X| <= PI/4. 4341.1Smycroft*--integer quotient will be stored in N 4351.1Smycroft*--Intermeditate remainder is 66-bit long; (R,r) in (FP0,FP1) 4361.1Smycroft 4371.1SmycroftLOOP: 4381.1Smycroft FMOVE.X FP0,INARG(a6) ...+-2**K * F, 1 <= F < 2 4391.1Smycroft MOVE.W INARG(a6),D0 4401.1Smycroft MOVE.L D0,A1 ...save a copy of D0 4411.1Smycroft ANDI.L #$00007FFF,D0 4421.1Smycroft SUBI.L #$00003FFF,D0 ...D0 IS K 4431.1Smycroft CMPI.L #28,D0 4441.1Smycroft BLE.B LASTLOOP 4451.1SmycroftCONTLOOP: 4461.1Smycroft SUBI.L #27,D0 ...D0 IS L := K-27 4471.1Smycroft MOVE.L #0,ENDFLAG(a6) 4481.1Smycroft BRA.B WORK 4491.1SmycroftLASTLOOP: 4501.1Smycroft CLR.L D0 ...D0 IS L := 0 4511.1Smycroft MOVE.L #1,ENDFLAG(a6) 4521.1Smycroft 4531.1SmycroftWORK: 4541.1Smycroft*--FIND THE REMAINDER OF (R,r) W.R.T. 2**L * (PI/2). L IS SO CHOSEN 4551.1Smycroft*--THAT INT( X * (2/PI) / 2**(L) ) < 2**29. 4561.1Smycroft 4571.1Smycroft*--CREATE 2**(-L) * (2/PI), SIGN(INARG)*2**(63), 4581.1Smycroft*--2**L * (PIby2_1), 2**L * (PIby2_2) 4591.1Smycroft 4601.1Smycroft MOVE.L #$00003FFE,D2 ...BIASED EXPO OF 2/PI 4611.1Smycroft SUB.L D0,D2 ...BIASED EXPO OF 2**(-L)*(2/PI) 4621.1Smycroft 4631.1Smycroft MOVE.L #$A2F9836E,FP_SCR1+4(a6) 4641.1Smycroft MOVE.L #$4E44152A,FP_SCR1+8(a6) 4651.1Smycroft MOVE.W D2,FP_SCR1(a6) ...FP_SCR1 is 2**(-L)*(2/PI) 4661.1Smycroft 4671.1Smycroft FMOVE.X FP0,FP2 4681.1Smycroft FMUL.X FP_SCR1(a6),FP2 4691.1Smycroft*--WE MUST NOW FIND INT(FP2). SINCE WE NEED THIS VALUE IN 4701.1Smycroft*--FLOATING POINT FORMAT, THE TWO FMOVE'S FMOVE.L FP <--> N 4711.1Smycroft*--WILL BE TOO INEFFICIENT. THE WAY AROUND IT IS THAT 4721.1Smycroft*--(SIGN(INARG)*2**63 + FP2) - SIGN(INARG)*2**63 WILL GIVE 4731.1Smycroft*--US THE DESIRED VALUE IN FLOATING POINT. 4741.1Smycroft 4751.1Smycroft*--HIDE SIX CYCLES OF INSTRUCTION 4761.1Smycroft MOVE.L A1,D2 4771.1Smycroft SWAP D2 4781.1Smycroft ANDI.L #$80000000,D2 4791.1Smycroft ORI.L #$5F000000,D2 ...D2 IS SIGN(INARG)*2**63 IN SGL 4801.1Smycroft MOVE.L D2,TWOTO63(a6) 4811.1Smycroft 4821.1Smycroft MOVE.L D0,D2 4831.1Smycroft ADDI.L #$00003FFF,D2 ...BIASED EXPO OF 2**L * (PI/2) 4841.1Smycroft 4851.1Smycroft*--FP2 IS READY 4861.1Smycroft FADD.S TWOTO63(a6),FP2 ...THE FRACTIONAL PART OF FP1 IS ROUNDED 4871.1Smycroft 4881.1Smycroft*--HIDE 4 CYCLES OF INSTRUCTION; creating 2**(L)*Piby2_1 and 2**(L)*Piby2_2 4891.1Smycroft MOVE.W D2,FP_SCR2(a6) 4901.1Smycroft CLR.W FP_SCR2+2(a6) 4911.1Smycroft MOVE.L #$C90FDAA2,FP_SCR2+4(a6) 4921.1Smycroft CLR.L FP_SCR2+8(a6) ...FP_SCR2 is 2**(L) * Piby2_1 4931.1Smycroft 4941.1Smycroft*--FP2 IS READY 4951.1Smycroft FSUB.S TWOTO63(a6),FP2 ...FP2 is N 4961.1Smycroft 4971.1Smycroft ADDI.L #$00003FDD,D0 4981.1Smycroft MOVE.W D0,FP_SCR3(a6) 4991.1Smycroft CLR.W FP_SCR3+2(a6) 5001.1Smycroft MOVE.L #$85A308D3,FP_SCR3+4(a6) 5011.1Smycroft CLR.L FP_SCR3+8(a6) ...FP_SCR3 is 2**(L) * Piby2_2 5021.1Smycroft 5031.1Smycroft MOVE.L ENDFLAG(a6),D0 5041.1Smycroft 5051.1Smycroft*--We are now ready to perform (R+r) - N*P1 - N*P2, P1 = 2**(L) * Piby2_1 and 5061.1Smycroft*--P2 = 2**(L) * Piby2_2 5071.1Smycroft FMOVE.X FP2,FP4 5081.1Smycroft FMul.X FP_SCR2(a6),FP4 ...W = N*P1 5091.1Smycroft FMove.X FP2,FP5 5101.1Smycroft FMul.X FP_SCR3(a6),FP5 ...w = N*P2 5111.1Smycroft FMove.X FP4,FP3 5121.1Smycroft*--we want P+p = W+w but |p| <= half ulp of P 5131.1Smycroft*--Then, we need to compute A := R-P and a := r-p 5141.1Smycroft FAdd.X FP5,FP3 ...FP3 is P 5151.1Smycroft FSub.X FP3,FP4 ...W-P 5161.1Smycroft 5171.1Smycroft FSub.X FP3,FP0 ...FP0 is A := R - P 5181.1Smycroft FAdd.X FP5,FP4 ...FP4 is p = (W-P)+w 5191.1Smycroft 5201.1Smycroft FMove.X FP0,FP3 ...FP3 A 5211.1Smycroft FSub.X FP4,FP1 ...FP1 is a := r - p 5221.1Smycroft 5231.1Smycroft*--Now we need to normalize (A,a) to "new (R,r)" where R+r = A+a but 5241.1Smycroft*--|r| <= half ulp of R. 5251.1Smycroft FAdd.X FP1,FP0 ...FP0 is R := A+a 5261.1Smycroft*--No need to calculate r if this is the last loop 5271.1Smycroft CMPI.L #0,D0 5281.1Smycroft BGT.W RESTORE 5291.1Smycroft 5301.1Smycroft*--Need to calculate r 5311.1Smycroft FSub.X FP0,FP3 ...A-R 5321.1Smycroft FAdd.X FP3,FP1 ...FP1 is r := (A-R)+a 5331.1Smycroft BRA.W LOOP 5341.1Smycroft 5351.1SmycroftRESTORE: 5361.1Smycroft FMOVE.L FP2,N(a6) 5371.1Smycroft MOVE.L (A7)+,D2 5381.1Smycroft FMOVEM.X (A7)+,FP2-FP5 5391.1Smycroft 5401.1Smycroft 5411.1Smycroft MOVE.L ADJN(a6),D0 5421.1Smycroft CMPI.L #4,D0 5431.1Smycroft 5441.1Smycroft BLT.W SINCONT 5451.1Smycroft BRA.B SCCONT 5461.1Smycroft 5471.1Smycroft xdef ssincosd 5481.1Smycroftssincosd: 5491.1Smycroft*--SIN AND COS OF X FOR DENORMALIZED X 5501.1Smycroft 5511.1Smycroft FMOVE.S #:3F800000,FP1 5521.1Smycroft bsr sto_cos ;store cosine result 5531.1Smycroft bra t_extdnrm 5541.1Smycroft 5551.1Smycroft xdef ssincos 5561.1Smycroftssincos: 5571.1Smycroft*--SET ADJN TO 4 5581.1Smycroft MOVE.L #4,ADJN(a6) 5591.1Smycroft 5601.1Smycroft FMOVE.X (a0),FP0 ...LOAD INPUT 5611.1Smycroft 5621.1Smycroft MOVE.L (A0),D0 5631.1Smycroft MOVE.W 4(A0),D0 5641.1Smycroft FMOVE.X FP0,X(a6) 5651.1Smycroft ANDI.L #$7FFFFFFF,D0 ...COMPACTIFY X 5661.1Smycroft 5671.1Smycroft CMPI.L #$3FD78000,D0 ...|X| >= 2**(-40)? 5681.1Smycroft BGE.B SCOK1 5691.1Smycroft BRA.W SCSM 5701.1Smycroft 5711.1SmycroftSCOK1: 5721.1Smycroft CMPI.L #$4004BC7E,D0 ...|X| < 15 PI? 5731.1Smycroft BLT.B SCMAIN 5741.1Smycroft BRA.W REDUCEX 5751.1Smycroft 5761.1Smycroft 5771.1SmycroftSCMAIN: 5781.1Smycroft*--THIS IS THE USUAL CASE, |X| <= 15 PI. 5791.1Smycroft*--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP. 5801.1Smycroft FMOVE.X FP0,FP1 5811.1Smycroft FMUL.D TWOBYPI,FP1 ...X*2/PI 5821.1Smycroft 5831.1Smycroft*--HIDE THE NEXT THREE INSTRUCTIONS 5841.1Smycroft LEA PITBL+$200,A1 ...TABLE OF N*PI/2, N = -32,...,32 5851.1Smycroft 5861.1Smycroft 5871.1Smycroft*--FP1 IS NOW READY 5881.1Smycroft FMOVE.L FP1,N(a6) ...CONVERT TO INTEGER 5891.1Smycroft 5901.1Smycroft MOVE.L N(a6),D0 5911.1Smycroft ASL.L #4,D0 5921.1Smycroft ADDA.L D0,A1 ...ADDRESS OF N*PIBY2, IN Y1, Y2 5931.1Smycroft 5941.1Smycroft FSUB.X (A1)+,FP0 ...X-Y1 5951.1Smycroft FSUB.S (A1),FP0 ...FP0 IS R = (X-Y1)-Y2 5961.1Smycroft 5971.1SmycroftSCCONT: 5981.1Smycroft*--continuation point from REDUCEX 5991.1Smycroft 6001.1Smycroft*--HIDE THE NEXT TWO 6011.1Smycroft MOVE.L N(a6),D0 6021.1Smycroft ROR.L #1,D0 6031.1Smycroft 6041.1Smycroft CMPI.L #0,D0 ...D0 < 0 IFF N IS ODD 6051.1Smycroft BGE.W NEVEN 6061.1Smycroft 6071.1SmycroftNODD: 6081.1Smycroft*--REGISTERS SAVED SO FAR: D0, A0, FP2. 6091.1Smycroft 6101.1Smycroft FMOVE.X FP0,RPRIME(a6) 6111.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S = R*R 6121.1Smycroft FMOVE.D SINA7,FP1 ...A7 6131.1Smycroft FMOVE.D COSB8,FP2 ...B8 6141.1Smycroft FMUL.X FP0,FP1 ...SA7 6151.1Smycroft MOVE.L d2,-(A7) 6161.1Smycroft MOVE.L D0,d2 6171.1Smycroft FMUL.X FP0,FP2 ...SB8 6181.1Smycroft ROR.L #1,d2 6191.1Smycroft ANDI.L #$80000000,d2 6201.1Smycroft 6211.1Smycroft FADD.D SINA6,FP1 ...A6+SA7 6221.1Smycroft EOR.L D0,d2 6231.1Smycroft ANDI.L #$80000000,d2 6241.1Smycroft FADD.D COSB7,FP2 ...B7+SB8 6251.1Smycroft 6261.1Smycroft FMUL.X FP0,FP1 ...S(A6+SA7) 6271.1Smycroft EOR.L d2,RPRIME(a6) 6281.1Smycroft MOVE.L (A7)+,d2 6291.1Smycroft FMUL.X FP0,FP2 ...S(B7+SB8) 6301.1Smycroft ROR.L #1,D0 6311.1Smycroft ANDI.L #$80000000,D0 6321.1Smycroft 6331.1Smycroft FADD.D SINA5,FP1 ...A5+S(A6+SA7) 6341.1Smycroft MOVE.L #$3F800000,POSNEG1(a6) 6351.1Smycroft EOR.L D0,POSNEG1(a6) 6361.1Smycroft FADD.D COSB6,FP2 ...B6+S(B7+SB8) 6371.1Smycroft 6381.1Smycroft FMUL.X FP0,FP1 ...S(A5+S(A6+SA7)) 6391.1Smycroft FMUL.X FP0,FP2 ...S(B6+S(B7+SB8)) 6401.1Smycroft FMOVE.X FP0,SPRIME(a6) 6411.1Smycroft 6421.1Smycroft FADD.D SINA4,FP1 ...A4+S(A5+S(A6+SA7)) 6431.1Smycroft EOR.L D0,SPRIME(a6) 6441.1Smycroft FADD.D COSB5,FP2 ...B5+S(B6+S(B7+SB8)) 6451.1Smycroft 6461.1Smycroft FMUL.X FP0,FP1 ...S(A4+...) 6471.1Smycroft FMUL.X FP0,FP2 ...S(B5+...) 6481.1Smycroft 6491.1Smycroft FADD.D SINA3,FP1 ...A3+S(A4+...) 6501.1Smycroft FADD.D COSB4,FP2 ...B4+S(B5+...) 6511.1Smycroft 6521.1Smycroft FMUL.X FP0,FP1 ...S(A3+...) 6531.1Smycroft FMUL.X FP0,FP2 ...S(B4+...) 6541.1Smycroft 6551.1Smycroft FADD.X SINA2,FP1 ...A2+S(A3+...) 6561.1Smycroft FADD.X COSB3,FP2 ...B3+S(B4+...) 6571.1Smycroft 6581.1Smycroft FMUL.X FP0,FP1 ...S(A2+...) 6591.1Smycroft FMUL.X FP0,FP2 ...S(B3+...) 6601.1Smycroft 6611.1Smycroft FADD.X SINA1,FP1 ...A1+S(A2+...) 6621.1Smycroft FADD.X COSB2,FP2 ...B2+S(B3+...) 6631.1Smycroft 6641.1Smycroft FMUL.X FP0,FP1 ...S(A1+...) 6651.1Smycroft FMUL.X FP2,FP0 ...S(B2+...) 6661.1Smycroft 6671.1Smycroft 6681.1Smycroft 6691.1Smycroft FMUL.X RPRIME(a6),FP1 ...R'S(A1+...) 6701.1Smycroft FADD.S COSB1,FP0 ...B1+S(B2...) 6711.1Smycroft FMUL.X SPRIME(a6),FP0 ...S'(B1+S(B2+...)) 6721.1Smycroft 6731.1Smycroft move.l d1,-(sp) ;restore users mode & precision 6741.1Smycroft andi.l #$ff,d1 ;mask off all exceptions 6751.1Smycroft fmove.l d1,FPCR 6761.1Smycroft FADD.X RPRIME(a6),FP1 ...COS(X) 6771.1Smycroft bsr sto_cos ;store cosine result 6781.1Smycroft FMOVE.L (sp)+,FPCR ;restore users exceptions 6791.1Smycroft FADD.S POSNEG1(a6),FP0 ...SIN(X) 6801.1Smycroft 6811.1Smycroft bra t_frcinx 6821.1Smycroft 6831.1Smycroft 6841.1SmycroftNEVEN: 6851.1Smycroft*--REGISTERS SAVED SO FAR: FP2. 6861.1Smycroft 6871.1Smycroft FMOVE.X FP0,RPRIME(a6) 6881.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S = R*R 6891.1Smycroft FMOVE.D COSB8,FP1 ...B8 6901.1Smycroft FMOVE.D SINA7,FP2 ...A7 6911.1Smycroft FMUL.X FP0,FP1 ...SB8 6921.1Smycroft FMOVE.X FP0,SPRIME(a6) 6931.1Smycroft FMUL.X FP0,FP2 ...SA7 6941.1Smycroft ROR.L #1,D0 6951.1Smycroft ANDI.L #$80000000,D0 6961.1Smycroft FADD.D COSB7,FP1 ...B7+SB8 6971.1Smycroft FADD.D SINA6,FP2 ...A6+SA7 6981.1Smycroft EOR.L D0,RPRIME(a6) 6991.1Smycroft EOR.L D0,SPRIME(a6) 7001.1Smycroft FMUL.X FP0,FP1 ...S(B7+SB8) 7011.1Smycroft ORI.L #$3F800000,D0 7021.1Smycroft MOVE.L D0,POSNEG1(a6) 7031.1Smycroft FMUL.X FP0,FP2 ...S(A6+SA7) 7041.1Smycroft 7051.1Smycroft FADD.D COSB6,FP1 ...B6+S(B7+SB8) 7061.1Smycroft FADD.D SINA5,FP2 ...A5+S(A6+SA7) 7071.1Smycroft 7081.1Smycroft FMUL.X FP0,FP1 ...S(B6+S(B7+SB8)) 7091.1Smycroft FMUL.X FP0,FP2 ...S(A5+S(A6+SA7)) 7101.1Smycroft 7111.1Smycroft FADD.D COSB5,FP1 ...B5+S(B6+S(B7+SB8)) 7121.1Smycroft FADD.D SINA4,FP2 ...A4+S(A5+S(A6+SA7)) 7131.1Smycroft 7141.1Smycroft FMUL.X FP0,FP1 ...S(B5+...) 7151.1Smycroft FMUL.X FP0,FP2 ...S(A4+...) 7161.1Smycroft 7171.1Smycroft FADD.D COSB4,FP1 ...B4+S(B5+...) 7181.1Smycroft FADD.D SINA3,FP2 ...A3+S(A4+...) 7191.1Smycroft 7201.1Smycroft FMUL.X FP0,FP1 ...S(B4+...) 7211.1Smycroft FMUL.X FP0,FP2 ...S(A3+...) 7221.1Smycroft 7231.1Smycroft FADD.X COSB3,FP1 ...B3+S(B4+...) 7241.1Smycroft FADD.X SINA2,FP2 ...A2+S(A3+...) 7251.1Smycroft 7261.1Smycroft FMUL.X FP0,FP1 ...S(B3+...) 7271.1Smycroft FMUL.X FP0,FP2 ...S(A2+...) 7281.1Smycroft 7291.1Smycroft FADD.X COSB2,FP1 ...B2+S(B3+...) 7301.1Smycroft FADD.X SINA1,FP2 ...A1+S(A2+...) 7311.1Smycroft 7321.1Smycroft FMUL.X FP0,FP1 ...S(B2+...) 7331.1Smycroft fmul.x fp2,fp0 ...s(a1+...) 7341.1Smycroft 7351.1Smycroft 7361.1Smycroft 7371.1Smycroft FADD.S COSB1,FP1 ...B1+S(B2...) 7381.1Smycroft FMUL.X RPRIME(a6),FP0 ...R'S(A1+...) 7391.1Smycroft FMUL.X SPRIME(a6),FP1 ...S'(B1+S(B2+...)) 7401.1Smycroft 7411.1Smycroft move.l d1,-(sp) ;save users mode & precision 7421.1Smycroft andi.l #$ff,d1 ;mask off all exceptions 7431.1Smycroft fmove.l d1,FPCR 7441.1Smycroft FADD.S POSNEG1(a6),FP1 ...COS(X) 7451.1Smycroft bsr sto_cos ;store cosine result 7461.1Smycroft FMOVE.L (sp)+,FPCR ;restore users exceptions 7471.1Smycroft FADD.X RPRIME(a6),FP0 ...SIN(X) 7481.1Smycroft 7491.1Smycroft bra t_frcinx 7501.1Smycroft 7511.1SmycroftSCBORS: 7521.1Smycroft CMPI.L #$3FFF8000,D0 7531.1Smycroft BGT.W REDUCEX 7541.1Smycroft 7551.1Smycroft 7561.1SmycroftSCSM: 7571.1Smycroft MOVE.W #$0000,XDCARE(a6) 7581.1Smycroft FMOVE.S #:3F800000,FP1 7591.1Smycroft 7601.1Smycroft move.l d1,-(sp) ;save users mode & precision 7611.1Smycroft andi.l #$ff,d1 ;mask off all exceptions 7621.1Smycroft fmove.l d1,FPCR 7631.1Smycroft FSUB.S #:00800000,FP1 7641.1Smycroft bsr sto_cos ;store cosine result 7651.1Smycroft FMOVE.L (sp)+,FPCR ;restore users exceptions 7661.1Smycroft FMOVE.X X(a6),FP0 7671.1Smycroft bra t_frcinx 7681.1Smycroft 7691.1Smycroft end 770