11.6Smsaitoh* $NetBSD: ssin.sa,v 1.6 2021/12/05 07:04:30 msaitoh Exp $ 21.3Scgd 31.1Smycroft* MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP 41.1Smycroft* M68000 Hi-Performance Microprocessor Division 51.1Smycroft* M68040 Software Package 61.1Smycroft* 71.1Smycroft* M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc. 81.1Smycroft* All rights reserved. 91.1Smycroft* 101.1Smycroft* THE SOFTWARE is provided on an "AS IS" basis and without warranty. 111.1Smycroft* To the maximum extent permitted by applicable law, 121.1Smycroft* MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED, 131.1Smycroft* INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A 141.1Smycroft* PARTICULAR PURPOSE and any warranty against infringement with 151.1Smycroft* regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF) 161.1Smycroft* and any accompanying written materials. 171.1Smycroft* 181.1Smycroft* To the maximum extent permitted by applicable law, 191.1Smycroft* IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER 201.1Smycroft* (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS 211.1Smycroft* PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR 221.1Smycroft* OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE 231.1Smycroft* SOFTWARE. Motorola assumes no responsibility for the maintenance 241.1Smycroft* and support of the SOFTWARE. 251.1Smycroft* 261.1Smycroft* You are hereby granted a copyright license to use, modify, and 271.1Smycroft* distribute the SOFTWARE so long as this entire notice is retained 281.1Smycroft* without alteration in any modified and/or redistributed versions, 291.1Smycroft* and that such modified versions are clearly identified as such. 301.1Smycroft* No licenses are granted by implication, estoppel or otherwise 311.1Smycroft* under any patents or trademarks of Motorola, Inc. 321.1Smycroft 331.1Smycroft* 341.1Smycroft* ssin.sa 3.3 7/29/91 351.1Smycroft* 361.1Smycroft* The entry point sSIN computes the sine of an input argument 371.1Smycroft* sCOS computes the cosine, and sSINCOS computes both. The 381.1Smycroft* corresponding entry points with a "d" computes the same 391.1Smycroft* corresponding function values for denormalized inputs. 401.1Smycroft* 411.1Smycroft* Input: Double-extended number X in location pointed to 421.1Smycroft* by address register a0. 431.1Smycroft* 441.5Sandvar* Output: The function value sin(X) or cos(X) returned in Fp0 if SIN or 451.1Smycroft* COS is requested. Otherwise, for SINCOS, sin(X) is returned 461.1Smycroft* in Fp0, and cos(X) is returned in Fp1. 471.1Smycroft* 481.1Smycroft* Modifies: Fp0 for SIN or COS; both Fp0 and Fp1 for SINCOS. 491.1Smycroft* 501.1Smycroft* Accuracy and Monotonicity: The returned result is within 1 ulp in 511.1Smycroft* 64 significant bit, i.e. within 0.5001 ulp to 53 bits if the 521.1Smycroft* result is subsequently rounded to double precision. The 531.1Smycroft* result is provably monotonic in double precision. 541.1Smycroft* 551.1Smycroft* Speed: The programs sSIN and sCOS take approximately 150 cycles for 561.4Ssoren* input argument X such that |X| < 15Pi, which is the usual 571.1Smycroft* situation. The speed for sSINCOS is approximately 190 cycles. 581.1Smycroft* 591.1Smycroft* Algorithm: 601.1Smycroft* 611.1Smycroft* SIN and COS: 621.1Smycroft* 1. If SIN is invoked, set AdjN := 0; otherwise, set AdjN := 1. 631.1Smycroft* 641.1Smycroft* 2. If |X| >= 15Pi or |X| < 2**(-40), go to 7. 651.1Smycroft* 661.1Smycroft* 3. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let 671.6Smsaitoh* k = N mod 4, so in particular, k = 0,1,2,or 3. Overwrite 681.1Smycroft* k by k := k + AdjN. 691.1Smycroft* 701.1Smycroft* 4. If k is even, go to 6. 711.1Smycroft* 721.1Smycroft* 5. (k is odd) Set j := (k-1)/2, sgn := (-1)**j. Return sgn*cos(r) 731.1Smycroft* where cos(r) is approximated by an even polynomial in r, 741.1Smycroft* 1 + r*r*(B1+s*(B2+ ... + s*B8)), s = r*r. 751.1Smycroft* Exit. 761.1Smycroft* 771.1Smycroft* 6. (k is even) Set j := k/2, sgn := (-1)**j. Return sgn*sin(r) 781.1Smycroft* where sin(r) is approximated by an odd polynomial in r 791.1Smycroft* r + r*s*(A1+s*(A2+ ... + s*A7)), s = r*r. 801.1Smycroft* Exit. 811.1Smycroft* 821.1Smycroft* 7. If |X| > 1, go to 9. 831.1Smycroft* 841.1Smycroft* 8. (|X|<2**(-40)) If SIN is invoked, return X; otherwise return 1. 851.1Smycroft* 861.1Smycroft* 9. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 3. 871.1Smycroft* 881.1Smycroft* SINCOS: 891.1Smycroft* 1. If |X| >= 15Pi or |X| < 2**(-40), go to 6. 901.1Smycroft* 911.1Smycroft* 2. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let 921.1Smycroft* k = N mod 4, so in particular, k = 0,1,2,or 3. 931.1Smycroft* 941.1Smycroft* 3. If k is even, go to 5. 951.1Smycroft* 961.1Smycroft* 4. (k is odd) Set j1 := (k-1)/2, j2 := j1 (EOR) (k mod 2), i.e. 971.1Smycroft* j1 exclusive or with the l.s.b. of k. 981.1Smycroft* sgn1 := (-1)**j1, sgn2 := (-1)**j2. 991.1Smycroft* SIN(X) = sgn1 * cos(r) and COS(X) = sgn2*sin(r) where 1001.1Smycroft* sin(r) and cos(r) are computed as odd and even polynomials 1011.1Smycroft* in r, respectively. Exit 1021.1Smycroft* 1031.1Smycroft* 5. (k is even) Set j1 := k/2, sgn1 := (-1)**j1. 1041.1Smycroft* SIN(X) = sgn1 * sin(r) and COS(X) = sgn1*cos(r) where 1051.1Smycroft* sin(r) and cos(r) are computed as odd and even polynomials 1061.1Smycroft* in r, respectively. Exit 1071.1Smycroft* 1081.1Smycroft* 6. If |X| > 1, go to 8. 1091.1Smycroft* 1101.1Smycroft* 7. (|X|<2**(-40)) SIN(X) = X and COS(X) = 1. Exit. 1111.1Smycroft* 1121.1Smycroft* 8. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 2. 1131.1Smycroft* 1141.1Smycroft 1151.1SmycroftSSIN IDNT 2,1 Motorola 040 Floating Point Software Package 1161.1Smycroft 1171.1Smycroft section 8 1181.1Smycroft 1191.1Smycroft include fpsp.h 1201.1Smycroft 1211.1SmycroftBOUNDS1 DC.L $3FD78000,$4004BC7E 1221.1SmycroftTWOBYPI DC.L $3FE45F30,$6DC9C883 1231.1Smycroft 1241.1SmycroftSINA7 DC.L $BD6AAA77,$CCC994F5 1251.1SmycroftSINA6 DC.L $3DE61209,$7AAE8DA1 1261.1Smycroft 1271.1SmycroftSINA5 DC.L $BE5AE645,$2A118AE4 1281.1SmycroftSINA4 DC.L $3EC71DE3,$A5341531 1291.1Smycroft 1301.1SmycroftSINA3 DC.L $BF2A01A0,$1A018B59,$00000000,$00000000 1311.1Smycroft 1321.1SmycroftSINA2 DC.L $3FF80000,$88888888,$888859AF,$00000000 1331.1Smycroft 1341.1SmycroftSINA1 DC.L $BFFC0000,$AAAAAAAA,$AAAAAA99,$00000000 1351.1Smycroft 1361.1SmycroftCOSB8 DC.L $3D2AC4D0,$D6011EE3 1371.1SmycroftCOSB7 DC.L $BDA9396F,$9F45AC19 1381.1Smycroft 1391.1SmycroftCOSB6 DC.L $3E21EED9,$0612C972 1401.1SmycroftCOSB5 DC.L $BE927E4F,$B79D9FCF 1411.1Smycroft 1421.1SmycroftCOSB4 DC.L $3EFA01A0,$1A01D423,$00000000,$00000000 1431.1Smycroft 1441.1SmycroftCOSB3 DC.L $BFF50000,$B60B60B6,$0B61D438,$00000000 1451.1Smycroft 1461.1SmycroftCOSB2 DC.L $3FFA0000,$AAAAAAAA,$AAAAAB5E 1471.1SmycroftCOSB1 DC.L $BF000000 1481.1Smycroft 1491.1SmycroftINVTWOPI DC.L $3FFC0000,$A2F9836E,$4E44152A 1501.1Smycroft 1511.1SmycroftTWOPI1 DC.L $40010000,$C90FDAA2,$00000000,$00000000 1521.1SmycroftTWOPI2 DC.L $3FDF0000,$85A308D4,$00000000,$00000000 1531.1Smycroft 1541.1Smycroft xref PITBL 1551.1Smycroft 1561.1SmycroftINARG equ FP_SCR4 1571.1Smycroft 1581.1SmycroftX equ FP_SCR5 1591.1SmycroftXDCARE equ X+2 1601.1SmycroftXFRAC equ X+4 1611.1Smycroft 1621.1SmycroftRPRIME equ FP_SCR1 1631.1SmycroftSPRIME equ FP_SCR2 1641.1Smycroft 1651.1SmycroftPOSNEG1 equ L_SCR1 1661.1SmycroftTWOTO63 equ L_SCR1 1671.1Smycroft 1681.1SmycroftENDFLAG equ L_SCR2 1691.1SmycroftN equ L_SCR2 1701.1Smycroft 1711.1SmycroftADJN equ L_SCR3 1721.1Smycroft 1731.1Smycroft xref t_frcinx 1741.1Smycroft xref t_extdnrm 1751.1Smycroft xref sto_cos 1761.1Smycroft 1771.1Smycroft xdef ssind 1781.1Smycroftssind: 1791.1Smycroft*--SIN(X) = X FOR DENORMALIZED X 1801.1Smycroft bra t_extdnrm 1811.1Smycroft 1821.1Smycroft xdef scosd 1831.1Smycroftscosd: 1841.1Smycroft*--COS(X) = 1 FOR DENORMALIZED X 1851.1Smycroft 1861.1Smycroft FMOVE.S #:3F800000,FP0 1871.1Smycroft* 1881.1Smycroft* 9D25B Fix: Sometimes the previous fmove.s sets fpsr bits 1891.1Smycroft* 1901.1Smycroft fmove.l #0,fpsr 1911.1Smycroft* 1921.1Smycroft bra t_frcinx 1931.1Smycroft 1941.1Smycroft xdef ssin 1951.1Smycroftssin: 1961.1Smycroft*--SET ADJN TO 0 1971.2Smycroft CLR.L ADJN(a6) 1981.1Smycroft BRA.B SINBGN 1991.1Smycroft 2001.1Smycroft xdef scos 2011.1Smycroftscos: 2021.1Smycroft*--SET ADJN TO 1 2031.1Smycroft MOVE.L #1,ADJN(a6) 2041.1Smycroft 2051.1SmycroftSINBGN: 2061.1Smycroft*--SAVE FPCR, FP1. CHECK IF |X| IS TOO SMALL OR LARGE 2071.1Smycroft 2081.1Smycroft FMOVE.X (a0),FP0 ...LOAD INPUT 2091.1Smycroft 2101.1Smycroft MOVE.L (A0),D0 2111.1Smycroft MOVE.W 4(A0),D0 2121.1Smycroft FMOVE.X FP0,X(a6) 2131.1Smycroft ANDI.L #$7FFFFFFF,D0 ...COMPACTIFY X 2141.1Smycroft 2151.1Smycroft CMPI.L #$3FD78000,D0 ...|X| >= 2**(-40)? 2161.1Smycroft BGE.B SOK1 2171.1Smycroft BRA.W SINSM 2181.1Smycroft 2191.1SmycroftSOK1: 2201.1Smycroft CMPI.L #$4004BC7E,D0 ...|X| < 15 PI? 2211.1Smycroft BLT.B SINMAIN 2221.1Smycroft BRA.W REDUCEX 2231.1Smycroft 2241.1SmycroftSINMAIN: 2251.1Smycroft*--THIS IS THE USUAL CASE, |X| <= 15 PI. 2261.1Smycroft*--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP. 2271.1Smycroft FMOVE.X FP0,FP1 2281.1Smycroft FMUL.D TWOBYPI,FP1 ...X*2/PI 2291.1Smycroft 2301.1Smycroft*--HIDE THE NEXT THREE INSTRUCTIONS 2311.1Smycroft LEA PITBL+$200,A1 ...TABLE OF N*PI/2, N = -32,...,32 2321.1Smycroft 2331.1Smycroft 2341.1Smycroft*--FP1 IS NOW READY 2351.1Smycroft FMOVE.L FP1,N(a6) ...CONVERT TO INTEGER 2361.1Smycroft 2371.1Smycroft MOVE.L N(a6),D0 2381.1Smycroft ASL.L #4,D0 2391.1Smycroft ADDA.L D0,A1 ...A1 IS THE ADDRESS OF N*PIBY2 2401.1Smycroft* ...WHICH IS IN TWO PIECES Y1 & Y2 2411.1Smycroft 2421.1Smycroft FSUB.X (A1)+,FP0 ...X-Y1 2431.1Smycroft*--HIDE THE NEXT ONE 2441.1Smycroft FSUB.S (A1),FP0 ...FP0 IS R = (X-Y1)-Y2 2451.1Smycroft 2461.1SmycroftSINCONT: 2471.1Smycroft*--continuation from REDUCEX 2481.1Smycroft 2491.1Smycroft*--GET N+ADJN AND SEE IF SIN(R) OR COS(R) IS NEEDED 2501.1Smycroft MOVE.L N(a6),D0 2511.1Smycroft ADD.L ADJN(a6),D0 ...SEE IF D0 IS ODD OR EVEN 2521.1Smycroft ROR.L #1,D0 ...D0 WAS ODD IFF D0 IS NEGATIVE 2531.2Smycroft TST.L D0 2541.1Smycroft BLT.W COSPOLY 2551.1Smycroft 2561.1SmycroftSINPOLY: 2571.1Smycroft*--LET J BE THE LEAST SIG. BIT OF D0, LET SGN := (-1)**J. 2581.1Smycroft*--THEN WE RETURN SGN*SIN(R). SGN*SIN(R) IS COMPUTED BY 2591.1Smycroft*--R' + R'*S*(A1 + S(A2 + S(A3 + S(A4 + ... + SA7)))), WHERE 2601.1Smycroft*--R' = SGN*R, S=R*R. THIS CAN BE REWRITTEN AS 2611.1Smycroft*--R' + R'*S*( [A1+T(A3+T(A5+TA7))] + [S(A2+T(A4+TA6))]) 2621.1Smycroft*--WHERE T=S*S. 2631.1Smycroft*--NOTE THAT A3 THROUGH A7 ARE STORED IN DOUBLE PRECISION 2641.1Smycroft*--WHILE A1 AND A2 ARE IN DOUBLE-EXTENDED FORMAT. 2651.1Smycroft FMOVE.X FP0,X(a6) ...X IS R 2661.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S 2671.1Smycroft*---HIDE THE NEXT TWO WHILE WAITING FOR FP0 2681.1Smycroft FMOVE.D SINA7,FP3 2691.1Smycroft FMOVE.D SINA6,FP2 2701.1Smycroft*--FP0 IS NOW READY 2711.1Smycroft FMOVE.X FP0,FP1 2721.1Smycroft FMUL.X FP1,FP1 ...FP1 IS T 2731.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR FP1 2741.1Smycroft 2751.1Smycroft ROR.L #1,D0 2761.1Smycroft ANDI.L #$80000000,D0 2771.1Smycroft* ...LEAST SIG. BIT OF D0 IN SIGN POSITION 2781.1Smycroft EOR.L D0,X(a6) ...X IS NOW R'= SGN*R 2791.1Smycroft 2801.1Smycroft FMUL.X FP1,FP3 ...TA7 2811.1Smycroft FMUL.X FP1,FP2 ...TA6 2821.1Smycroft 2831.1Smycroft FADD.D SINA5,FP3 ...A5+TA7 2841.1Smycroft FADD.D SINA4,FP2 ...A4+TA6 2851.1Smycroft 2861.1Smycroft FMUL.X FP1,FP3 ...T(A5+TA7) 2871.1Smycroft FMUL.X FP1,FP2 ...T(A4+TA6) 2881.1Smycroft 2891.1Smycroft FADD.D SINA3,FP3 ...A3+T(A5+TA7) 2901.1Smycroft FADD.X SINA2,FP2 ...A2+T(A4+TA6) 2911.1Smycroft 2921.1Smycroft FMUL.X FP3,FP1 ...T(A3+T(A5+TA7)) 2931.1Smycroft 2941.1Smycroft FMUL.X FP0,FP2 ...S(A2+T(A4+TA6)) 2951.1Smycroft FADD.X SINA1,FP1 ...A1+T(A3+T(A5+TA7)) 2961.1Smycroft FMUL.X X(a6),FP0 ...R'*S 2971.1Smycroft 2981.1Smycroft FADD.X FP2,FP1 ...[A1+T(A3+T(A5+TA7))]+[S(A2+T(A4+TA6))] 2991.1Smycroft*--FP3 RELEASED, RESTORE NOW AND TAKE SOME ADVANTAGE OF HIDING 3001.1Smycroft*--FP2 RELEASED, RESTORE NOW AND TAKE FULL ADVANTAGE OF HIDING 3011.1Smycroft 3021.1Smycroft 3031.1Smycroft FMUL.X FP1,FP0 ...SIN(R')-R' 3041.1Smycroft*--FP1 RELEASED. 3051.1Smycroft 3061.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3071.1Smycroft FADD.X X(a6),FP0 ;last inst - possible exception set 3081.1Smycroft bra t_frcinx 3091.1Smycroft 3101.1Smycroft 3111.1SmycroftCOSPOLY: 3121.1Smycroft*--LET J BE THE LEAST SIG. BIT OF D0, LET SGN := (-1)**J. 3131.1Smycroft*--THEN WE RETURN SGN*COS(R). SGN*COS(R) IS COMPUTED BY 3141.1Smycroft*--SGN + S'*(B1 + S(B2 + S(B3 + S(B4 + ... + SB8)))), WHERE 3151.1Smycroft*--S=R*R AND S'=SGN*S. THIS CAN BE REWRITTEN AS 3161.1Smycroft*--SGN + S'*([B1+T(B3+T(B5+TB7))] + [S(B2+T(B4+T(B6+TB8)))]) 3171.1Smycroft*--WHERE T=S*S. 3181.1Smycroft*--NOTE THAT B4 THROUGH B8 ARE STORED IN DOUBLE PRECISION 3191.1Smycroft*--WHILE B2 AND B3 ARE IN DOUBLE-EXTENDED FORMAT, B1 IS -1/2 3201.1Smycroft*--AND IS THEREFORE STORED AS SINGLE PRECISION. 3211.1Smycroft 3221.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S 3231.1Smycroft*---HIDE THE NEXT TWO WHILE WAITING FOR FP0 3241.1Smycroft FMOVE.D COSB8,FP2 3251.1Smycroft FMOVE.D COSB7,FP3 3261.1Smycroft*--FP0 IS NOW READY 3271.1Smycroft FMOVE.X FP0,FP1 3281.1Smycroft FMUL.X FP1,FP1 ...FP1 IS T 3291.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR FP1 3301.1Smycroft FMOVE.X FP0,X(a6) ...X IS S 3311.1Smycroft ROR.L #1,D0 3321.1Smycroft ANDI.L #$80000000,D0 3331.1Smycroft* ...LEAST SIG. BIT OF D0 IN SIGN POSITION 3341.1Smycroft 3351.1Smycroft FMUL.X FP1,FP2 ...TB8 3361.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR THE XU 3371.1Smycroft EOR.L D0,X(a6) ...X IS NOW S'= SGN*S 3381.1Smycroft ANDI.L #$80000000,D0 3391.1Smycroft 3401.1Smycroft FMUL.X FP1,FP3 ...TB7 3411.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR THE XU 3421.1Smycroft ORI.L #$3F800000,D0 ...D0 IS SGN IN SINGLE 3431.1Smycroft MOVE.L D0,POSNEG1(a6) 3441.1Smycroft 3451.1Smycroft FADD.D COSB6,FP2 ...B6+TB8 3461.1Smycroft FADD.D COSB5,FP3 ...B5+TB7 3471.1Smycroft 3481.1Smycroft FMUL.X FP1,FP2 ...T(B6+TB8) 3491.1Smycroft FMUL.X FP1,FP3 ...T(B5+TB7) 3501.1Smycroft 3511.1Smycroft FADD.D COSB4,FP2 ...B4+T(B6+TB8) 3521.1Smycroft FADD.X COSB3,FP3 ...B3+T(B5+TB7) 3531.1Smycroft 3541.1Smycroft FMUL.X FP1,FP2 ...T(B4+T(B6+TB8)) 3551.1Smycroft FMUL.X FP3,FP1 ...T(B3+T(B5+TB7)) 3561.1Smycroft 3571.1Smycroft FADD.X COSB2,FP2 ...B2+T(B4+T(B6+TB8)) 3581.1Smycroft FADD.S COSB1,FP1 ...B1+T(B3+T(B5+TB7)) 3591.1Smycroft 3601.1Smycroft FMUL.X FP2,FP0 ...S(B2+T(B4+T(B6+TB8))) 3611.1Smycroft*--FP3 RELEASED, RESTORE NOW AND TAKE SOME ADVANTAGE OF HIDING 3621.1Smycroft*--FP2 RELEASED. 3631.1Smycroft 3641.1Smycroft 3651.1Smycroft FADD.X FP1,FP0 3661.1Smycroft*--FP1 RELEASED 3671.1Smycroft 3681.1Smycroft FMUL.X X(a6),FP0 3691.1Smycroft 3701.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3711.1Smycroft FADD.S POSNEG1(a6),FP0 ;last inst - possible exception set 3721.1Smycroft bra t_frcinx 3731.1Smycroft 3741.1Smycroft 3751.1SmycroftSINBORS: 3761.1Smycroft*--IF |X| > 15PI, WE USE THE GENERAL ARGUMENT REDUCTION. 3771.1Smycroft*--IF |X| < 2**(-40), RETURN X OR 1. 3781.1Smycroft CMPI.L #$3FFF8000,D0 3791.1Smycroft BGT.B REDUCEX 3801.1Smycroft 3811.1Smycroft 3821.1SmycroftSINSM: 3831.1Smycroft MOVE.L ADJN(a6),D0 3841.2Smycroft TST.L D0 3851.1Smycroft BGT.B COSTINY 3861.1Smycroft 3871.1SmycroftSINTINY: 3881.2Smycroft CLR.W XDCARE(a6) ...JUST IN CASE 3891.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3901.1Smycroft FMOVE.X X(a6),FP0 ;last inst - possible exception set 3911.1Smycroft bra t_frcinx 3921.1Smycroft 3931.1Smycroft 3941.1SmycroftCOSTINY: 3951.1Smycroft FMOVE.S #:3F800000,FP0 3961.1Smycroft 3971.1Smycroft FMOVE.L d1,FPCR ;restore users exceptions 3981.1Smycroft FSUB.S #:00800000,FP0 ;last inst - possible exception set 3991.1Smycroft bra t_frcinx 4001.1Smycroft 4011.1Smycroft 4021.1SmycroftREDUCEX: 4031.1Smycroft*--WHEN REDUCEX IS USED, THE CODE WILL INEVITABLY BE SLOW. 4041.1Smycroft*--THIS REDUCTION METHOD, HOWEVER, IS MUCH FASTER THAN USING 4051.1Smycroft*--THE REMAINDER INSTRUCTION WHICH IS NOW IN SOFTWARE. 4061.1Smycroft 4071.1Smycroft FMOVEM.X FP2-FP5,-(A7) ...save FP2 through FP5 4081.1Smycroft MOVE.L D2,-(A7) 4091.1Smycroft FMOVE.S #:00000000,FP1 4101.1Smycroft*--If compact form of abs(arg) in d0=$7ffeffff, argument is so large that 4111.1Smycroft*--there is a danger of unwanted overflow in first LOOP iteration. In this 4121.1Smycroft*--case, reduce argument by one remainder step to make subsequent reduction 4131.1Smycroft*--safe. 4141.1Smycroft cmpi.l #$7ffeffff,d0 ;is argument dangerously large? 4151.1Smycroft bne.b LOOP 4161.1Smycroft move.l #$7ffe0000,FP_SCR2(a6) ;yes 4171.1Smycroft* ;create 2**16383*PI/2 4181.1Smycroft move.l #$c90fdaa2,FP_SCR2+4(a6) 4191.1Smycroft clr.l FP_SCR2+8(a6) 4201.1Smycroft ftst.x fp0 ;test sign of argument 4211.1Smycroft move.l #$7fdc0000,FP_SCR3(a6) ;create low half of 2**16383* 4221.1Smycroft* ;PI/2 at FP_SCR3 4231.1Smycroft move.l #$85a308d3,FP_SCR3+4(a6) 4241.1Smycroft clr.l FP_SCR3+8(a6) 4251.1Smycroft fblt.w red_neg 4261.1Smycroft or.w #$8000,FP_SCR2(a6) ;positive arg 4271.1Smycroft or.w #$8000,FP_SCR3(a6) 4281.1Smycroftred_neg: 4291.1Smycroft fadd.x FP_SCR2(a6),fp0 ;high part of reduction is exact 4301.1Smycroft fmove.x fp0,fp1 ;save high result in fp1 4311.1Smycroft fadd.x FP_SCR3(a6),fp0 ;low part of reduction 4321.1Smycroft fsub.x fp0,fp1 ;determine low component of result 4331.1Smycroft fadd.x FP_SCR3(a6),fp1 ;fp0/fp1 are reduced argument. 4341.1Smycroft 4351.1Smycroft*--ON ENTRY, FP0 IS X, ON RETURN, FP0 IS X REM PI/2, |X| <= PI/4. 4361.1Smycroft*--integer quotient will be stored in N 4371.1Smycroft*--Intermeditate remainder is 66-bit long; (R,r) in (FP0,FP1) 4381.1Smycroft 4391.1SmycroftLOOP: 4401.1Smycroft FMOVE.X FP0,INARG(a6) ...+-2**K * F, 1 <= F < 2 4411.1Smycroft MOVE.W INARG(a6),D0 4421.1Smycroft MOVE.L D0,A1 ...save a copy of D0 4431.1Smycroft ANDI.L #$00007FFF,D0 4441.1Smycroft SUBI.L #$00003FFF,D0 ...D0 IS K 4451.1Smycroft CMPI.L #28,D0 4461.1Smycroft BLE.B LASTLOOP 4471.1SmycroftCONTLOOP: 4481.1Smycroft SUBI.L #27,D0 ...D0 IS L := K-27 4491.2Smycroft CLR.L ENDFLAG(a6) 4501.1Smycroft BRA.B WORK 4511.1SmycroftLASTLOOP: 4521.1Smycroft CLR.L D0 ...D0 IS L := 0 4531.1Smycroft MOVE.L #1,ENDFLAG(a6) 4541.1Smycroft 4551.1SmycroftWORK: 4561.1Smycroft*--FIND THE REMAINDER OF (R,r) W.R.T. 2**L * (PI/2). L IS SO CHOSEN 4571.1Smycroft*--THAT INT( X * (2/PI) / 2**(L) ) < 2**29. 4581.1Smycroft 4591.1Smycroft*--CREATE 2**(-L) * (2/PI), SIGN(INARG)*2**(63), 4601.1Smycroft*--2**L * (PIby2_1), 2**L * (PIby2_2) 4611.1Smycroft 4621.1Smycroft MOVE.L #$00003FFE,D2 ...BIASED EXPO OF 2/PI 4631.1Smycroft SUB.L D0,D2 ...BIASED EXPO OF 2**(-L)*(2/PI) 4641.1Smycroft 4651.1Smycroft MOVE.L #$A2F9836E,FP_SCR1+4(a6) 4661.1Smycroft MOVE.L #$4E44152A,FP_SCR1+8(a6) 4671.1Smycroft MOVE.W D2,FP_SCR1(a6) ...FP_SCR1 is 2**(-L)*(2/PI) 4681.1Smycroft 4691.1Smycroft FMOVE.X FP0,FP2 4701.1Smycroft FMUL.X FP_SCR1(a6),FP2 4711.1Smycroft*--WE MUST NOW FIND INT(FP2). SINCE WE NEED THIS VALUE IN 4721.1Smycroft*--FLOATING POINT FORMAT, THE TWO FMOVE'S FMOVE.L FP <--> N 4731.1Smycroft*--WILL BE TOO INEFFICIENT. THE WAY AROUND IT IS THAT 4741.1Smycroft*--(SIGN(INARG)*2**63 + FP2) - SIGN(INARG)*2**63 WILL GIVE 4751.1Smycroft*--US THE DESIRED VALUE IN FLOATING POINT. 4761.1Smycroft 4771.1Smycroft*--HIDE SIX CYCLES OF INSTRUCTION 4781.1Smycroft MOVE.L A1,D2 4791.1Smycroft SWAP D2 4801.1Smycroft ANDI.L #$80000000,D2 4811.1Smycroft ORI.L #$5F000000,D2 ...D2 IS SIGN(INARG)*2**63 IN SGL 4821.1Smycroft MOVE.L D2,TWOTO63(a6) 4831.1Smycroft 4841.1Smycroft MOVE.L D0,D2 4851.1Smycroft ADDI.L #$00003FFF,D2 ...BIASED EXPO OF 2**L * (PI/2) 4861.1Smycroft 4871.1Smycroft*--FP2 IS READY 4881.1Smycroft FADD.S TWOTO63(a6),FP2 ...THE FRACTIONAL PART OF FP1 IS ROUNDED 4891.1Smycroft 4901.1Smycroft*--HIDE 4 CYCLES OF INSTRUCTION; creating 2**(L)*Piby2_1 and 2**(L)*Piby2_2 4911.1Smycroft MOVE.W D2,FP_SCR2(a6) 4921.1Smycroft CLR.W FP_SCR2+2(a6) 4931.1Smycroft MOVE.L #$C90FDAA2,FP_SCR2+4(a6) 4941.1Smycroft CLR.L FP_SCR2+8(a6) ...FP_SCR2 is 2**(L) * Piby2_1 4951.1Smycroft 4961.1Smycroft*--FP2 IS READY 4971.1Smycroft FSUB.S TWOTO63(a6),FP2 ...FP2 is N 4981.1Smycroft 4991.1Smycroft ADDI.L #$00003FDD,D0 5001.1Smycroft MOVE.W D0,FP_SCR3(a6) 5011.1Smycroft CLR.W FP_SCR3+2(a6) 5021.1Smycroft MOVE.L #$85A308D3,FP_SCR3+4(a6) 5031.1Smycroft CLR.L FP_SCR3+8(a6) ...FP_SCR3 is 2**(L) * Piby2_2 5041.1Smycroft 5051.1Smycroft MOVE.L ENDFLAG(a6),D0 5061.1Smycroft 5071.1Smycroft*--We are now ready to perform (R+r) - N*P1 - N*P2, P1 = 2**(L) * Piby2_1 and 5081.1Smycroft*--P2 = 2**(L) * Piby2_2 5091.1Smycroft FMOVE.X FP2,FP4 5101.1Smycroft FMul.X FP_SCR2(a6),FP4 ...W = N*P1 5111.1Smycroft FMove.X FP2,FP5 5121.1Smycroft FMul.X FP_SCR3(a6),FP5 ...w = N*P2 5131.1Smycroft FMove.X FP4,FP3 5141.1Smycroft*--we want P+p = W+w but |p| <= half ulp of P 5151.1Smycroft*--Then, we need to compute A := R-P and a := r-p 5161.1Smycroft FAdd.X FP5,FP3 ...FP3 is P 5171.1Smycroft FSub.X FP3,FP4 ...W-P 5181.1Smycroft 5191.1Smycroft FSub.X FP3,FP0 ...FP0 is A := R - P 5201.1Smycroft FAdd.X FP5,FP4 ...FP4 is p = (W-P)+w 5211.1Smycroft 5221.1Smycroft FMove.X FP0,FP3 ...FP3 A 5231.1Smycroft FSub.X FP4,FP1 ...FP1 is a := r - p 5241.1Smycroft 5251.1Smycroft*--Now we need to normalize (A,a) to "new (R,r)" where R+r = A+a but 5261.1Smycroft*--|r| <= half ulp of R. 5271.1Smycroft FAdd.X FP1,FP0 ...FP0 is R := A+a 5281.1Smycroft*--No need to calculate r if this is the last loop 5291.2Smycroft TST.L D0 5301.1Smycroft BGT.W RESTORE 5311.1Smycroft 5321.1Smycroft*--Need to calculate r 5331.1Smycroft FSub.X FP0,FP3 ...A-R 5341.1Smycroft FAdd.X FP3,FP1 ...FP1 is r := (A-R)+a 5351.1Smycroft BRA.W LOOP 5361.1Smycroft 5371.1SmycroftRESTORE: 5381.1Smycroft FMOVE.L FP2,N(a6) 5391.1Smycroft MOVE.L (A7)+,D2 5401.1Smycroft FMOVEM.X (A7)+,FP2-FP5 5411.1Smycroft 5421.1Smycroft 5431.1Smycroft MOVE.L ADJN(a6),D0 5441.1Smycroft CMPI.L #4,D0 5451.1Smycroft 5461.1Smycroft BLT.W SINCONT 5471.1Smycroft BRA.B SCCONT 5481.1Smycroft 5491.1Smycroft xdef ssincosd 5501.1Smycroftssincosd: 5511.1Smycroft*--SIN AND COS OF X FOR DENORMALIZED X 5521.1Smycroft 5531.1Smycroft FMOVE.S #:3F800000,FP1 5541.1Smycroft bsr sto_cos ;store cosine result 5551.1Smycroft bra t_extdnrm 5561.1Smycroft 5571.1Smycroft xdef ssincos 5581.1Smycroftssincos: 5591.1Smycroft*--SET ADJN TO 4 5601.1Smycroft MOVE.L #4,ADJN(a6) 5611.1Smycroft 5621.1Smycroft FMOVE.X (a0),FP0 ...LOAD INPUT 5631.1Smycroft 5641.1Smycroft MOVE.L (A0),D0 5651.1Smycroft MOVE.W 4(A0),D0 5661.1Smycroft FMOVE.X FP0,X(a6) 5671.1Smycroft ANDI.L #$7FFFFFFF,D0 ...COMPACTIFY X 5681.1Smycroft 5691.1Smycroft CMPI.L #$3FD78000,D0 ...|X| >= 2**(-40)? 5701.1Smycroft BGE.B SCOK1 5711.1Smycroft BRA.W SCSM 5721.1Smycroft 5731.1SmycroftSCOK1: 5741.1Smycroft CMPI.L #$4004BC7E,D0 ...|X| < 15 PI? 5751.1Smycroft BLT.B SCMAIN 5761.1Smycroft BRA.W REDUCEX 5771.1Smycroft 5781.1Smycroft 5791.1SmycroftSCMAIN: 5801.1Smycroft*--THIS IS THE USUAL CASE, |X| <= 15 PI. 5811.1Smycroft*--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP. 5821.1Smycroft FMOVE.X FP0,FP1 5831.1Smycroft FMUL.D TWOBYPI,FP1 ...X*2/PI 5841.1Smycroft 5851.1Smycroft*--HIDE THE NEXT THREE INSTRUCTIONS 5861.1Smycroft LEA PITBL+$200,A1 ...TABLE OF N*PI/2, N = -32,...,32 5871.1Smycroft 5881.1Smycroft 5891.1Smycroft*--FP1 IS NOW READY 5901.1Smycroft FMOVE.L FP1,N(a6) ...CONVERT TO INTEGER 5911.1Smycroft 5921.1Smycroft MOVE.L N(a6),D0 5931.1Smycroft ASL.L #4,D0 5941.1Smycroft ADDA.L D0,A1 ...ADDRESS OF N*PIBY2, IN Y1, Y2 5951.1Smycroft 5961.1Smycroft FSUB.X (A1)+,FP0 ...X-Y1 5971.1Smycroft FSUB.S (A1),FP0 ...FP0 IS R = (X-Y1)-Y2 5981.1Smycroft 5991.1SmycroftSCCONT: 6001.1Smycroft*--continuation point from REDUCEX 6011.1Smycroft 6021.1Smycroft*--HIDE THE NEXT TWO 6031.1Smycroft MOVE.L N(a6),D0 6041.1Smycroft ROR.L #1,D0 6051.1Smycroft 6061.2Smycroft TST.L D0 ...D0 < 0 IFF N IS ODD 6071.1Smycroft BGE.W NEVEN 6081.1Smycroft 6091.1SmycroftNODD: 6101.1Smycroft*--REGISTERS SAVED SO FAR: D0, A0, FP2. 6111.1Smycroft 6121.1Smycroft FMOVE.X FP0,RPRIME(a6) 6131.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S = R*R 6141.1Smycroft FMOVE.D SINA7,FP1 ...A7 6151.1Smycroft FMOVE.D COSB8,FP2 ...B8 6161.1Smycroft FMUL.X FP0,FP1 ...SA7 6171.1Smycroft MOVE.L d2,-(A7) 6181.1Smycroft MOVE.L D0,d2 6191.1Smycroft FMUL.X FP0,FP2 ...SB8 6201.1Smycroft ROR.L #1,d2 6211.1Smycroft ANDI.L #$80000000,d2 6221.1Smycroft 6231.1Smycroft FADD.D SINA6,FP1 ...A6+SA7 6241.1Smycroft EOR.L D0,d2 6251.1Smycroft ANDI.L #$80000000,d2 6261.1Smycroft FADD.D COSB7,FP2 ...B7+SB8 6271.1Smycroft 6281.1Smycroft FMUL.X FP0,FP1 ...S(A6+SA7) 6291.1Smycroft EOR.L d2,RPRIME(a6) 6301.1Smycroft MOVE.L (A7)+,d2 6311.1Smycroft FMUL.X FP0,FP2 ...S(B7+SB8) 6321.1Smycroft ROR.L #1,D0 6331.1Smycroft ANDI.L #$80000000,D0 6341.1Smycroft 6351.1Smycroft FADD.D SINA5,FP1 ...A5+S(A6+SA7) 6361.1Smycroft MOVE.L #$3F800000,POSNEG1(a6) 6371.1Smycroft EOR.L D0,POSNEG1(a6) 6381.1Smycroft FADD.D COSB6,FP2 ...B6+S(B7+SB8) 6391.1Smycroft 6401.1Smycroft FMUL.X FP0,FP1 ...S(A5+S(A6+SA7)) 6411.1Smycroft FMUL.X FP0,FP2 ...S(B6+S(B7+SB8)) 6421.1Smycroft FMOVE.X FP0,SPRIME(a6) 6431.1Smycroft 6441.1Smycroft FADD.D SINA4,FP1 ...A4+S(A5+S(A6+SA7)) 6451.1Smycroft EOR.L D0,SPRIME(a6) 6461.1Smycroft FADD.D COSB5,FP2 ...B5+S(B6+S(B7+SB8)) 6471.1Smycroft 6481.1Smycroft FMUL.X FP0,FP1 ...S(A4+...) 6491.1Smycroft FMUL.X FP0,FP2 ...S(B5+...) 6501.1Smycroft 6511.1Smycroft FADD.D SINA3,FP1 ...A3+S(A4+...) 6521.1Smycroft FADD.D COSB4,FP2 ...B4+S(B5+...) 6531.1Smycroft 6541.1Smycroft FMUL.X FP0,FP1 ...S(A3+...) 6551.1Smycroft FMUL.X FP0,FP2 ...S(B4+...) 6561.1Smycroft 6571.1Smycroft FADD.X SINA2,FP1 ...A2+S(A3+...) 6581.1Smycroft FADD.X COSB3,FP2 ...B3+S(B4+...) 6591.1Smycroft 6601.1Smycroft FMUL.X FP0,FP1 ...S(A2+...) 6611.1Smycroft FMUL.X FP0,FP2 ...S(B3+...) 6621.1Smycroft 6631.1Smycroft FADD.X SINA1,FP1 ...A1+S(A2+...) 6641.1Smycroft FADD.X COSB2,FP2 ...B2+S(B3+...) 6651.1Smycroft 6661.1Smycroft FMUL.X FP0,FP1 ...S(A1+...) 6671.1Smycroft FMUL.X FP2,FP0 ...S(B2+...) 6681.1Smycroft 6691.1Smycroft 6701.1Smycroft 6711.1Smycroft FMUL.X RPRIME(a6),FP1 ...R'S(A1+...) 6721.1Smycroft FADD.S COSB1,FP0 ...B1+S(B2...) 6731.1Smycroft FMUL.X SPRIME(a6),FP0 ...S'(B1+S(B2+...)) 6741.1Smycroft 6751.1Smycroft move.l d1,-(sp) ;restore users mode & precision 6761.1Smycroft andi.l #$ff,d1 ;mask off all exceptions 6771.1Smycroft fmove.l d1,FPCR 6781.1Smycroft FADD.X RPRIME(a6),FP1 ...COS(X) 6791.1Smycroft bsr sto_cos ;store cosine result 6801.1Smycroft FMOVE.L (sp)+,FPCR ;restore users exceptions 6811.1Smycroft FADD.S POSNEG1(a6),FP0 ...SIN(X) 6821.1Smycroft 6831.1Smycroft bra t_frcinx 6841.1Smycroft 6851.1Smycroft 6861.1SmycroftNEVEN: 6871.1Smycroft*--REGISTERS SAVED SO FAR: FP2. 6881.1Smycroft 6891.1Smycroft FMOVE.X FP0,RPRIME(a6) 6901.1Smycroft FMUL.X FP0,FP0 ...FP0 IS S = R*R 6911.1Smycroft FMOVE.D COSB8,FP1 ...B8 6921.1Smycroft FMOVE.D SINA7,FP2 ...A7 6931.1Smycroft FMUL.X FP0,FP1 ...SB8 6941.1Smycroft FMOVE.X FP0,SPRIME(a6) 6951.1Smycroft FMUL.X FP0,FP2 ...SA7 6961.1Smycroft ROR.L #1,D0 6971.1Smycroft ANDI.L #$80000000,D0 6981.1Smycroft FADD.D COSB7,FP1 ...B7+SB8 6991.1Smycroft FADD.D SINA6,FP2 ...A6+SA7 7001.1Smycroft EOR.L D0,RPRIME(a6) 7011.1Smycroft EOR.L D0,SPRIME(a6) 7021.1Smycroft FMUL.X FP0,FP1 ...S(B7+SB8) 7031.1Smycroft ORI.L #$3F800000,D0 7041.1Smycroft MOVE.L D0,POSNEG1(a6) 7051.1Smycroft FMUL.X FP0,FP2 ...S(A6+SA7) 7061.1Smycroft 7071.1Smycroft FADD.D COSB6,FP1 ...B6+S(B7+SB8) 7081.1Smycroft FADD.D SINA5,FP2 ...A5+S(A6+SA7) 7091.1Smycroft 7101.1Smycroft FMUL.X FP0,FP1 ...S(B6+S(B7+SB8)) 7111.1Smycroft FMUL.X FP0,FP2 ...S(A5+S(A6+SA7)) 7121.1Smycroft 7131.1Smycroft FADD.D COSB5,FP1 ...B5+S(B6+S(B7+SB8)) 7141.1Smycroft FADD.D SINA4,FP2 ...A4+S(A5+S(A6+SA7)) 7151.1Smycroft 7161.1Smycroft FMUL.X FP0,FP1 ...S(B5+...) 7171.1Smycroft FMUL.X FP0,FP2 ...S(A4+...) 7181.1Smycroft 7191.1Smycroft FADD.D COSB4,FP1 ...B4+S(B5+...) 7201.1Smycroft FADD.D SINA3,FP2 ...A3+S(A4+...) 7211.1Smycroft 7221.1Smycroft FMUL.X FP0,FP1 ...S(B4+...) 7231.1Smycroft FMUL.X FP0,FP2 ...S(A3+...) 7241.1Smycroft 7251.1Smycroft FADD.X COSB3,FP1 ...B3+S(B4+...) 7261.1Smycroft FADD.X SINA2,FP2 ...A2+S(A3+...) 7271.1Smycroft 7281.1Smycroft FMUL.X FP0,FP1 ...S(B3+...) 7291.1Smycroft FMUL.X FP0,FP2 ...S(A2+...) 7301.1Smycroft 7311.1Smycroft FADD.X COSB2,FP1 ...B2+S(B3+...) 7321.1Smycroft FADD.X SINA1,FP2 ...A1+S(A2+...) 7331.1Smycroft 7341.1Smycroft FMUL.X FP0,FP1 ...S(B2+...) 7351.1Smycroft fmul.x fp2,fp0 ...s(a1+...) 7361.1Smycroft 7371.1Smycroft 7381.1Smycroft 7391.1Smycroft FADD.S COSB1,FP1 ...B1+S(B2...) 7401.1Smycroft FMUL.X RPRIME(a6),FP0 ...R'S(A1+...) 7411.1Smycroft FMUL.X SPRIME(a6),FP1 ...S'(B1+S(B2+...)) 7421.1Smycroft 7431.1Smycroft move.l d1,-(sp) ;save users mode & precision 7441.1Smycroft andi.l #$ff,d1 ;mask off all exceptions 7451.1Smycroft fmove.l d1,FPCR 7461.1Smycroft FADD.S POSNEG1(a6),FP1 ...COS(X) 7471.1Smycroft bsr sto_cos ;store cosine result 7481.1Smycroft FMOVE.L (sp)+,FPCR ;restore users exceptions 7491.1Smycroft FADD.X RPRIME(a6),FP0 ...SIN(X) 7501.1Smycroft 7511.1Smycroft bra t_frcinx 7521.1Smycroft 7531.1SmycroftSCBORS: 7541.1Smycroft CMPI.L #$3FFF8000,D0 7551.1Smycroft BGT.W REDUCEX 7561.1Smycroft 7571.1Smycroft 7581.1SmycroftSCSM: 7591.2Smycroft CLR.W XDCARE(a6) 7601.1Smycroft FMOVE.S #:3F800000,FP1 7611.1Smycroft 7621.1Smycroft move.l d1,-(sp) ;save users mode & precision 7631.1Smycroft andi.l #$ff,d1 ;mask off all exceptions 7641.1Smycroft fmove.l d1,FPCR 7651.1Smycroft FSUB.S #:00800000,FP1 7661.1Smycroft bsr sto_cos ;store cosine result 7671.1Smycroft FMOVE.L (sp)+,FPCR ;restore users exceptions 7681.1Smycroft FMOVE.X X(a6),FP0 7691.1Smycroft bra t_frcinx 7701.1Smycroft 7711.1Smycroft end 772