m68k/fpsp/ssin.sa

1.6Smsaitoh*	$NetBSD: ssin.sa,v 1.6 2021/12/05 07:04:30 msaitoh Exp $
1.3Scgd
1.1Smycroft*	MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP
1.1Smycroft*	M68000 Hi-Performance Microprocessor Division
1.1Smycroft*	M68040 Software Package
1.1Smycroft*
1.1Smycroft*	M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc.
1.1Smycroft*	All rights reserved.
1.1Smycroft*
1.1Smycroft*	THE SOFTWARE is provided on an "AS IS" basis and without warranty.
1.1Smycroft*	To the maximum extent permitted by applicable law,
1.1Smycroft*	MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED,
1.1Smycroft*	INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
1.1Smycroft*	PARTICULAR PURPOSE and any warranty against infringement with
1.1Smycroft*	regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF)
1.1Smycroft*	and any accompanying written materials.
1.1Smycroft*
1.1Smycroft*	To the maximum extent permitted by applicable law,
1.1Smycroft*	IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER
1.1Smycroft*	(INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
1.1Smycroft*	PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR
1.1Smycroft*	OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE
1.1Smycroft*	SOFTWARE.  Motorola assumes no responsibility for the maintenance
1.1Smycroft*	and support of the SOFTWARE.
1.1Smycroft*
1.1Smycroft*	You are hereby granted a copyright license to use, modify, and
1.1Smycroft*	distribute the SOFTWARE so long as this entire notice is retained
1.1Smycroft*	without alteration in any modified and/or redistributed versions,
1.1Smycroft*	and that such modified versions are clearly identified as such.
1.1Smycroft*	No licenses are granted by implication, estoppel or otherwise
1.1Smycroft*	under any patents or trademarks of Motorola, Inc.
1.1Smycroft
1.1Smycroft*
1.1Smycroft*	ssin.sa 3.3 7/29/91
1.1Smycroft*
1.1Smycroft*	The entry point sSIN computes the sine of an input argument,
1.1Smycroft*	sCOS computes the cosine, and sSINCOS computes both. The
1.1Smycroft*	corresponding entry points with a "d" suffix compute the same
1.1Smycroft*	function values for denormalized inputs.
1.1Smycroft*
1.1Smycroft*	Input: Double-extended number X in location pointed to
1.1Smycroft*		by address register a0.
1.1Smycroft*
1.5Sandvar*	Output: The function value sin(X) or cos(X) returned in Fp0 if SIN or
1.1Smycroft*		COS is requested. Otherwise, for SINCOS, sin(X) is returned
1.1Smycroft*		in Fp0, and cos(X) is returned in Fp1.
1.1Smycroft*
1.1Smycroft*	Modifies: Fp0 for SIN or COS; both Fp0 and Fp1 for SINCOS.
1.1Smycroft*
1.1Smycroft*	Accuracy and Monotonicity: The returned result is within 1 ulp in
1.1Smycroft*		64 significant bits, i.e. within 0.5001 ulp to 53 bits if the
1.1Smycroft*		result is subsequently rounded to double precision. The
1.1Smycroft*		result is provably monotonic in double precision.
1.1Smycroft*
1.1Smycroft*	Speed: The programs sSIN and sCOS take approximately 150 cycles for
1.4Ssoren*		input argument X such that |X| < 15Pi, which is the usual
1.1Smycroft*		situation. The speed for sSINCOS is approximately 190 cycles.
1.1Smycroft*
1.1Smycroft*	Algorithm:
1.1Smycroft*
1.1Smycroft*	SIN and COS:
1.1Smycroft*	1. If SIN is invoked, set AdjN := 0; otherwise, set AdjN := 1.
1.1Smycroft*
1.1Smycroft*	2. If |X| >= 15Pi or |X| < 2**(-40), go to 7.
1.1Smycroft*
1.1Smycroft*	3. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let
1.6Smsaitoh*		k = N mod 4, so in particular, k = 0,1,2,or 3. Overwrite
1.1Smycroft*		k by k := k + AdjN.
1.1Smycroft*
1.1Smycroft*	4. If k is even, go to 6.
1.1Smycroft*
1.1Smycroft*	5. (k is odd) Set j := (k-1)/2, sgn := (-1)**j. Return sgn*cos(r)
1.1Smycroft*		where cos(r) is approximated by an even polynomial in r,
1.1Smycroft*		1 + r*r*(B1+s*(B2+ ... + s*B8)),	s = r*r.
1.1Smycroft*		Exit.
1.1Smycroft*
1.1Smycroft*	6. (k is even) Set j := k/2, sgn := (-1)**j. Return sgn*sin(r)
1.1Smycroft*		where sin(r) is approximated by an odd polynomial in r,
1.1Smycroft*		r + r*s*(A1+s*(A2+ ... + s*A7)),	s = r*r.
1.1Smycroft*		Exit.
1.1Smycroft*
1.1Smycroft*	7. If |X| > 1, go to 9.
1.1Smycroft*
1.1Smycroft*	8. (|X|<2**(-40)) If SIN is invoked, return X; otherwise return 1.
1.1Smycroft*
1.1Smycroft*	9. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 3.
1.1Smycroft*
1.1Smycroft*	SINCOS:
1.1Smycroft*	1. If |X| >= 15Pi or |X| < 2**(-40), go to 6.
1.1Smycroft*
1.1Smycroft*	2. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let
1.1Smycroft*		k = N mod 4, so in particular, k = 0,1,2,or 3.
1.1Smycroft*
1.1Smycroft*	3. If k is even, go to 5.
1.1Smycroft*
1.1Smycroft*	4. (k is odd) Set j1 := (k-1)/2, j2 := j1 (EOR) (k mod 2), i.e.
1.1Smycroft*		j1 exclusive or with the l.s.b. of k.
1.1Smycroft*		sgn1 := (-1)**j1, sgn2 := (-1)**j2.
1.1Smycroft*		SIN(X) = sgn1 * cos(r) and COS(X) = sgn2*sin(r) where
1.1Smycroft*		sin(r) and cos(r) are computed as odd and even polynomials
1.1Smycroft*		in r, respectively. Exit.
1.1Smycroft*
1.1Smycroft*	5. (k is even) Set j1 := k/2, sgn1 := (-1)**j1.
1.1Smycroft*		SIN(X) = sgn1 * sin(r) and COS(X) = sgn1*cos(r) where
1.1Smycroft*		sin(r) and cos(r) are computed as odd and even polynomials
1.1Smycroft*		in r, respectively. Exit.
1.1Smycroft*
1.1Smycroft*	6. If |X| > 1, go to 8.
1.1Smycroft*
1.1Smycroft*	7. (|X|<2**(-40)) SIN(X) = X and COS(X) = 1. Exit.
1.1Smycroft*
1.1Smycroft*	8. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 2.
1.1Smycroft*
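*	Worked example (an illustrative sketch, not part of the original
*	Motorola text; values rounded to four or five digits):
*	Take X = 2.0. Then X*(2/Pi) = 1.2732, so N = 1 and
*	r = X - N(Pi/2) = 0.42920, which satisfies |r| <= Pi/4 = 0.78540.
*	SIN:  AdjN = 0, k = 1 (odd), j = 0, sgn = +1, so
*		sin(2.0) = +cos(0.42920) = 0.90930.
*	COS:  AdjN = 1, k = 2 (even), j = 1, sgn = -1, so
*		cos(2.0) = -sin(0.42920) = -0.41615.
*	SINCOS: k = 1 (odd), j1 = 0, j2 = 0 EOR 1 = 1, so
*		SIN(X) = +cos(r) and COS(X) = -sin(r), matching the above.
*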
1.1Smycroft
1.1SmycroftSSIN	IDNT	2,1 Motorola 040 Floating Point Software Package
1.1Smycroft
1.1Smycroft	section	8
1.1Smycroft
1.1Smycroft	include	fpsp.h
1.1Smycroft
1.1SmycroftBOUNDS1	DC.L $3FD78000,$4004BC7E
1.1SmycroftTWOBYPI	DC.L $3FE45F30,$6DC9C883
1.1Smycroft
1.1SmycroftSINA7	DC.L $BD6AAA77,$CCC994F5
1.1SmycroftSINA6	DC.L $3DE61209,$7AAE8DA1
1.1Smycroft
1.1SmycroftSINA5	DC.L $BE5AE645,$2A118AE4
1.1SmycroftSINA4	DC.L $3EC71DE3,$A5341531
1.1Smycroft
1.1SmycroftSINA3	DC.L $BF2A01A0,$1A018B59,$00000000,$00000000
1.1Smycroft
1.1SmycroftSINA2	DC.L $3FF80000,$88888888,$888859AF,$00000000
1.1Smycroft
1.1SmycroftSINA1	DC.L $BFFC0000,$AAAAAAAA,$AAAAAA99,$00000000
1.1Smycroft
1.1SmycroftCOSB8	DC.L $3D2AC4D0,$D6011EE3
1.1SmycroftCOSB7	DC.L $BDA9396F,$9F45AC19
1.1Smycroft
1.1SmycroftCOSB6	DC.L $3E21EED9,$0612C972
1.1SmycroftCOSB5	DC.L $BE927E4F,$B79D9FCF
1.1Smycroft
1.1SmycroftCOSB4	DC.L $3EFA01A0,$1A01D423,$00000000,$00000000
1.1Smycroft
1.1SmycroftCOSB3	DC.L $BFF50000,$B60B60B6,$0B61D438,$00000000
1.1Smycroft
1.1SmycroftCOSB2	DC.L $3FFA0000,$AAAAAAAA,$AAAAAB5E
1.1SmycroftCOSB1	DC.L $BF000000
1.1Smycroft
1.1SmycroftINVTWOPI DC.L $3FFC0000,$A2F9836E,$4E44152A
1.1Smycroft
1.1SmycroftTWOPI1	DC.L $40010000,$C90FDAA2,$00000000,$00000000
1.1SmycroftTWOPI2	DC.L $3FDF0000,$85A308D4,$00000000,$00000000
1.1Smycroft
1.1Smycroft	xref	PITBL
1.1Smycroft
1.1SmycroftINARG	equ	FP_SCR4
1.1Smycroft
1.1SmycroftX	equ	FP_SCR5
1.1SmycroftXDCARE	equ	X+2
1.1SmycroftXFRAC	equ	X+4
1.1Smycroft
1.1SmycroftRPRIME	equ	FP_SCR1
1.1SmycroftSPRIME	equ	FP_SCR2
1.1Smycroft
1.1SmycroftPOSNEG1	equ	L_SCR1
1.1SmycroftTWOTO63	equ	L_SCR1
1.1Smycroft
1.1SmycroftENDFLAG	equ	L_SCR2
1.1SmycroftN	equ	L_SCR2
1.1Smycroft
1.1SmycroftADJN	equ	L_SCR3
1.1Smycroft
1.1Smycroft	xref	t_frcinx
1.1Smycroft	xref	t_extdnrm
1.1Smycroft	xref	sto_cos
1.1Smycroft
1.1Smycroft	xdef	ssind
1.1Smycroftssind:
1.1Smycroft*--SIN(X) = X FOR DENORMALIZED X
1.1Smycroft	bra		t_extdnrm
1.1Smycroft
1.1Smycroft	xdef	scosd
1.1Smycroftscosd:
1.1Smycroft*--COS(X) = 1 FOR DENORMALIZED X
1.1Smycroft
1.1Smycroft	FMOVE.S		#:3F800000,FP0
1.1Smycroft*
1.1Smycroft*	9D25B Fix: Sometimes the previous fmove.s sets fpsr bits
1.1Smycroft*
1.1Smycroft	fmove.l		#0,fpsr
1.1Smycroft*
1.1Smycroft	bra		t_frcinx
1.1Smycroft
1.1Smycroft	xdef	ssin
1.1Smycroftssin:
1.1Smycroft*--SET ADJN TO 0
1.2Smycroft	CLR.L		ADJN(a6)
1.1Smycroft	BRA.B		SINBGN
1.1Smycroft
1.1Smycroft	xdef	scos
1.1Smycroftscos:
1.1Smycroft*--SET ADJN TO 1
1.1Smycroft	MOVE.L		#1,ADJN(a6)
1.1Smycroft
1.1SmycroftSINBGN:
1.1Smycroft*--SAVE FPCR, FP1. CHECK IF |X| IS TOO SMALL OR LARGE
1.1Smycroft
1.1Smycroft	FMOVE.X		(a0),FP0	...LOAD INPUT
1.1Smycroft
1.1Smycroft	MOVE.L		(A0),D0
1.1Smycroft	MOVE.W		4(A0),D0
1.1Smycroft	FMOVE.X		FP0,X(a6)
1.1Smycroft	ANDI.L		#$7FFFFFFF,D0		...COMPACTIFY X
1.1Smycroft
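*--Note (illustrative, inferred from the moves above): the "compact"
*--form in D0 packs the 15-bit biased exponent into the upper word and
*--the top 16 bits of the mantissa into the lower word, with the sign
*--bit cleared by the ANDI. Ordinary signed longword compares on D0
*--then order values the same way as |X| (to 16 mantissa bits).
*--	$3FD78000: exponent $3FD7 = $3FFF-40, mantissa $8000 -> 2**(-40)
*--	$4004BC7E: exponent $4004 = $3FFF+5, mantissa $BC7E = 1.4726
*--		   -> 1.4726 * 2**5 = 47.123, just under 15Pi = 47.124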
1.1Smycroft	CMPI.L		#$3FD78000,D0		...|X| >= 2**(-40)?
1.1Smycroft	BGE.B		SOK1
1.1Smycroft	BRA.W		SINSM
1.1Smycroft
1.1SmycroftSOK1:
1.1Smycroft	CMPI.L		#$4004BC7E,D0		...|X| < 15 PI?
1.1Smycroft	BLT.B		SINMAIN
1.1Smycroft	BRA.W		REDUCEX
1.1Smycroft
1.1SmycroftSINMAIN:
1.1Smycroft*--THIS IS THE USUAL CASE, |X| < 15 PI.
1.1Smycroft*--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP.
1.1Smycroft	FMOVE.X		FP0,FP1
1.1Smycroft	FMUL.D		TWOBYPI,FP1	...X*2/PI
1.1Smycroft
1.1Smycroft*--HIDE THE NEXT THREE INSTRUCTIONS
1.1Smycroft	LEA		PITBL+$200,A1 ...TABLE OF N*PI/2, N = -32,...,32
1.1Smycroft
1.1Smycroft
1.1Smycroft*--FP1 IS NOW READY
1.1Smycroft	FMOVE.L		FP1,N(a6)		...CONVERT TO INTEGER
1.1Smycroft
1.1Smycroft	MOVE.L		N(a6),D0
1.1Smycroft	ASL.L		#4,D0
1.1Smycroft	ADDA.L		D0,A1	...A1 IS THE ADDRESS OF N*PIBY2
1.1Smycroft*				...WHICH IS IN TWO PIECES Y1 & Y2
1.1Smycroft
1.1Smycroft	FSUB.X		(A1)+,FP0	...X-Y1
1.1Smycroft*--HIDE THE NEXT ONE
1.1Smycroft	FSUB.S		(A1),FP0	...FP0 IS R = (X-Y1)-Y2
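*--Note (illustrative, inferred from the addressing above; PITBL itself
*--is defined elsewhere): each table entry appears to occupy 16 bytes,
*--a 12-byte extended Y1 followed by a 4-byte single Y2, with
*--Y1+Y2 approximating N*PI/2. PITBL+$200 is then the N = 0 entry,
*--since 32 entries * 16 bytes = $200, and ASL.L #4 scales N by 16.
*--FSUB.X (A1)+ advances A1 by 12, leaving it pointing at Y2.
*--Example: N = 3 gives A1 = PITBL+$200+$30; N = -2 gives PITBL+$1E0.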
1.1Smycroft
1.1SmycroftSINCONT:
1.1Smycroft*--continuation from REDUCEX
1.1Smycroft
1.1Smycroft*--GET N+ADJN AND SEE IF SIN(R) OR COS(R) IS NEEDED
1.1Smycroft	MOVE.L		N(a6),D0
1.1Smycroft	ADD.L		ADJN(a6),D0	...SEE IF D0 IS ODD OR EVEN
1.1Smycroft	ROR.L		#1,D0	...D0 WAS ODD IFF D0 IS NEGATIVE
1.2Smycroft	TST.L		D0
1.1Smycroft	BLT.W		COSPOLY
1.1Smycroft
1.1SmycroftSINPOLY:
1.1Smycroft*--LET J BE THE LEAST SIG. BIT OF D0, LET SGN := (-1)**J.
1.1Smycroft*--THEN WE RETURN	SGN*SIN(R). SGN*SIN(R) IS COMPUTED BY
1.1Smycroft*--R' + R'*S*(A1 + S(A2 + S(A3 + S(A4 + ... + SA7)))), WHERE
1.1Smycroft*--R' = SGN*R, S=R*R. THIS CAN BE REWRITTEN AS
1.1Smycroft*--R' + R'*S*( [A1+T(A3+T(A5+TA7))] + [S(A2+T(A4+TA6))])
1.1Smycroft*--WHERE T=S*S.
1.1Smycroft*--NOTE THAT A3 THROUGH A7 ARE STORED IN DOUBLE PRECISION
1.1Smycroft*--WHILE A1 AND A2 ARE IN DOUBLE-EXTENDED FORMAT.
1.1Smycroft	FMOVE.X		FP0,X(a6)	...X IS R
1.1Smycroft	FMUL.X		FP0,FP0	...FP0 IS S
1.1Smycroft*---HIDE THE NEXT TWO WHILE WAITING FOR FP0
1.1Smycroft	FMOVE.D		SINA7,FP3
1.1Smycroft	FMOVE.D		SINA6,FP2
1.1Smycroft*--FP0 IS NOW READY
1.1Smycroft	FMOVE.X		FP0,FP1
1.1Smycroft	FMUL.X		FP1,FP1	...FP1 IS T
1.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR FP1
1.1Smycroft
1.1Smycroft	ROR.L		#1,D0
1.1Smycroft	ANDI.L		#$80000000,D0
1.1Smycroft*				...LEAST SIG. BIT OF D0 IN SIGN POSITION
1.1Smycroft	EOR.L		D0,X(a6)	...X IS NOW R'= SGN*R
1.1Smycroft
1.1Smycroft	FMUL.X		FP1,FP3	...TA7
1.1Smycroft	FMUL.X		FP1,FP2	...TA6
1.1Smycroft
1.1Smycroft	FADD.D		SINA5,FP3 ...A5+TA7
1.1Smycroft	FADD.D		SINA4,FP2 ...A4+TA6
1.1Smycroft
1.1Smycroft	FMUL.X		FP1,FP3	...T(A5+TA7)
1.1Smycroft	FMUL.X		FP1,FP2	...T(A4+TA6)
1.1Smycroft
1.1Smycroft	FADD.D		SINA3,FP3 ...A3+T(A5+TA7)
1.1Smycroft	FADD.X		SINA2,FP2 ...A2+T(A4+TA6)
1.1Smycroft
1.1Smycroft	FMUL.X		FP3,FP1	...T(A3+T(A5+TA7))
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP2	...S(A2+T(A4+TA6))
1.1Smycroft	FADD.X		SINA1,FP1 ...A1+T(A3+T(A5+TA7))
1.1Smycroft	FMUL.X		X(a6),FP0	...R'*S
1.1Smycroft
1.1Smycroft	FADD.X		FP2,FP1	...[A1+T(A3+T(A5+TA7))]+[S(A2+T(A4+TA6))]
1.1Smycroft*--FP3 RELEASED, RESTORE NOW AND TAKE SOME ADVANTAGE OF HIDING
1.1Smycroft*--FP2 RELEASED, RESTORE NOW AND TAKE FULL ADVANTAGE OF HIDING
1.1Smycroft
1.1Smycroft
1.1Smycroft	FMUL.X		FP1,FP0		...SIN(R')-R'
1.1Smycroft*--FP1 RELEASED.
1.1Smycroft
1.1Smycroft	FMOVE.L		d1,FPCR		;restore users exceptions
1.1Smycroft	FADD.X		X(a6),FP0		;last inst - possible exception set
1.1Smycroft	bra		t_frcinx
1.1Smycroft
1.1Smycroft
1.1SmycroftCOSPOLY:
1.1Smycroft*--LET J BE THE LEAST SIG. BIT OF D0, LET SGN := (-1)**J.
1.1Smycroft*--THEN WE RETURN	SGN*COS(R). SGN*COS(R) IS COMPUTED BY
1.1Smycroft*--SGN + S'*(B1 + S(B2 + S(B3 + S(B4 + ... + SB8)))), WHERE
1.1Smycroft*--S=R*R AND S'=SGN*S. THIS CAN BE REWRITTEN AS
1.1Smycroft*--SGN + S'*([B1+T(B3+T(B5+TB7))] + [S(B2+T(B4+T(B6+TB8)))])
1.1Smycroft*--WHERE T=S*S.
1.1Smycroft*--NOTE THAT B4 THROUGH B8 ARE STORED IN DOUBLE PRECISION
1.1Smycroft*--WHILE B2 AND B3 ARE IN DOUBLE-EXTENDED FORMAT, B1 IS -1/2
1.1Smycroft*--AND IS THEREFORE STORED AS SINGLE PRECISION.
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP0	...FP0 IS S
1.1Smycroft*---HIDE THE NEXT TWO WHILE WAITING FOR FP0
1.1Smycroft	FMOVE.D		COSB8,FP2
1.1Smycroft	FMOVE.D		COSB7,FP3
1.1Smycroft*--FP0 IS NOW READY
1.1Smycroft	FMOVE.X		FP0,FP1
1.1Smycroft	FMUL.X		FP1,FP1	...FP1 IS T
1.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR FP1
1.1Smycroft	FMOVE.X		FP0,X(a6)	...X IS S
1.1Smycroft	ROR.L		#1,D0
1.1Smycroft	ANDI.L		#$80000000,D0
1.1Smycroft*			...LEAST SIG. BIT OF D0 IN SIGN POSITION
1.1Smycroft
1.1Smycroft	FMUL.X		FP1,FP2	...TB8
1.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR THE XU
1.1Smycroft	EOR.L		D0,X(a6)	...X IS NOW S'= SGN*S
1.1Smycroft	ANDI.L		#$80000000,D0
1.1Smycroft
1.1Smycroft	FMUL.X		FP1,FP3	...TB7
1.1Smycroft*--HIDE THE NEXT TWO WHILE WAITING FOR THE XU
1.1Smycroft	ORI.L		#$3F800000,D0	...D0 IS SGN IN SINGLE
1.1Smycroft	MOVE.L		D0,POSNEG1(a6)
1.1Smycroft
1.1Smycroft	FADD.D		COSB6,FP2 ...B6+TB8
1.1Smycroft	FADD.D		COSB5,FP3 ...B5+TB7
1.1Smycroft
1.1Smycroft	FMUL.X		FP1,FP2	...T(B6+TB8)
1.1Smycroft	FMUL.X		FP1,FP3	...T(B5+TB7)
1.1Smycroft
1.1Smycroft	FADD.D		COSB4,FP2 ...B4+T(B6+TB8)
1.1Smycroft	FADD.X		COSB3,FP3 ...B3+T(B5+TB7)
1.1Smycroft
1.1Smycroft	FMUL.X		FP1,FP2	...T(B4+T(B6+TB8))
1.1Smycroft	FMUL.X		FP3,FP1	...T(B3+T(B5+TB7))
1.1Smycroft
1.1Smycroft	FADD.X		COSB2,FP2 ...B2+T(B4+T(B6+TB8))
1.1Smycroft	FADD.S		COSB1,FP1 ...B1+T(B3+T(B5+TB7))
1.1Smycroft
1.1Smycroft	FMUL.X		FP2,FP0	...S(B2+T(B4+T(B6+TB8)))
1.1Smycroft*--FP3 RELEASED, RESTORE NOW AND TAKE SOME ADVANTAGE OF HIDING
1.1Smycroft*--FP2 RELEASED.
1.1Smycroft
1.1Smycroft
1.1Smycroft	FADD.X		FP1,FP0
1.1Smycroft*--FP1 RELEASED
1.1Smycroft
1.1Smycroft	FMUL.X		X(a6),FP0
1.1Smycroft
1.1Smycroft	FMOVE.L		d1,FPCR		;restore users exceptions
1.1Smycroft	FADD.S		POSNEG1(a6),FP0	;last inst - possible exception set
1.1Smycroft	bra		t_frcinx
1.1Smycroft
1.1Smycroft
1.1SmycroftSINBORS:
1.1Smycroft*--IF |X| >= 15PI, WE USE THE GENERAL ARGUMENT REDUCTION.
1.1Smycroft*--IF |X| < 2**(-40), RETURN X OR 1.
1.1Smycroft	CMPI.L		#$3FFF8000,D0
1.1Smycroft	BGT.B		REDUCEX
1.1Smycroft
1.1Smycroft
1.1SmycroftSINSM:
1.1Smycroft	MOVE.L		ADJN(a6),D0
1.2Smycroft	TST.L		D0
1.1Smycroft	BGT.B		COSTINY
1.1Smycroft
1.1SmycroftSINTINY:
1.2Smycroft	CLR.W		XDCARE(a6)	...JUST IN CASE
1.1Smycroft	FMOVE.L		d1,FPCR		;restore users exceptions
1.1Smycroft	FMOVE.X		X(a6),FP0		;last inst - possible exception set
1.1Smycroft	bra		t_frcinx
1.1Smycroft
1.1Smycroft
1.1SmycroftCOSTINY:
1.1Smycroft	FMOVE.S		#:3F800000,FP0
1.1Smycroft
1.1Smycroft	FMOVE.L		d1,FPCR		;restore users exceptions
1.1Smycroft	FSUB.S		#:00800000,FP0	;last inst - possible exception set
1.1Smycroft	bra		t_frcinx
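*--Note (illustrative): #:00800000 is the smallest normalized single,
*--2**(-126). Subtracting it from 1 cannot be represented exactly, so
*--the result is rounded under the user's FPCR (restored just above):
*--it comes back as 1 in round-to-nearest, or one ulp below 1 when
*--rounding toward zero or minus infinity, and the inexact (INEX2)
*--status is set either way. This is presumably why COS of a tiny X
*--is not simply returned as an exact 1.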
1.1Smycroft
1.1Smycroft
1.1SmycroftREDUCEX:
1.1Smycroft*--WHEN REDUCEX IS USED, THE CODE WILL INEVITABLY BE SLOW.
1.1Smycroft*--THIS REDUCTION METHOD, HOWEVER, IS MUCH FASTER THAN USING
1.1Smycroft*--THE REMAINDER INSTRUCTION WHICH IS NOW IN SOFTWARE.
1.1Smycroft
1.1Smycroft	FMOVEM.X	FP2-FP5,-(A7)	...save FP2 through FP5
1.1Smycroft	MOVE.L		D2,-(A7)
1.1Smycroft        FMOVE.S         #:00000000,FP1
1.1Smycroft*--If compact form of abs(arg) in d0=$7ffeffff, argument is so large that
1.1Smycroft*--there is a danger of unwanted overflow in first LOOP iteration.  In this
1.1Smycroft*--case, reduce argument by one remainder step to make subsequent reduction
1.1Smycroft*--safe.
1.1Smycroft	cmpi.l	#$7ffeffff,d0		;is argument dangerously large?
1.1Smycroft	bne.b	LOOP
1.1Smycroft	move.l	#$7ffe0000,FP_SCR2(a6)	;yes
1.1Smycroft*					;create 2**16383*PI/2
1.1Smycroft	move.l	#$c90fdaa2,FP_SCR2+4(a6)
1.1Smycroft	clr.l	FP_SCR2+8(a6)
1.1Smycroft	ftst.x	fp0			;test sign of argument
1.1Smycroft	move.l	#$7fdc0000,FP_SCR3(a6)	;create low half of 2**16383*
1.1Smycroft*					;PI/2 at FP_SCR3
1.1Smycroft	move.l	#$85a308d3,FP_SCR3+4(a6)
1.1Smycroft	clr.l   FP_SCR3+8(a6)
1.1Smycroft	fblt.w	red_neg
1.1Smycroft	or.w	#$8000,FP_SCR2(a6)	;positive arg
1.1Smycroft	or.w	#$8000,FP_SCR3(a6)
1.1Smycroftred_neg:
1.1Smycroft	fadd.x  FP_SCR2(a6),fp0		;high part of reduction is exact
1.1Smycroft	fmove.x  fp0,fp1		;save high result in fp1
1.1Smycroft	fadd.x  FP_SCR3(a6),fp0		;low part of reduction
1.1Smycroft	fsub.x  fp0,fp1			;determine low component of result
1.1Smycroft	fadd.x  FP_SCR3(a6),fp1		;fp0/fp1 are reduced argument.
1.1Smycroft
1.1Smycroft*--ON ENTRY, FP0 IS X, ON RETURN, FP0 IS X REM PI/2, |X| <= PI/4.
1.1Smycroft*--integer quotient will be stored in N
1.1Smycroft*--Intermediate remainder is 66 bits long; (R,r) in (FP0,FP1)
1.1Smycroft
1.1SmycroftLOOP:
1.1Smycroft	FMOVE.X		FP0,INARG(a6)	...+-2**K * F, 1 <= F < 2
1.1Smycroft	MOVE.W		INARG(a6),D0
1.1Smycroft        MOVE.L          D0,A1		...save a copy of D0
1.1Smycroft	ANDI.L		#$00007FFF,D0
1.1Smycroft	SUBI.L		#$00003FFF,D0	...D0 IS K
1.1Smycroft	CMPI.L		#28,D0
1.1Smycroft	BLE.B		LASTLOOP
1.1SmycroftCONTLOOP:
1.1Smycroft	SUBI.L		#27,D0	 ...D0 IS L := K-27
1.2Smycroft	CLR.L		ENDFLAG(a6)
1.1Smycroft	BRA.B		WORK
1.1SmycroftLASTLOOP:
1.1Smycroft	CLR.L		D0		...D0 IS L := 0
1.1Smycroft	MOVE.L		#1,ENDFLAG(a6)
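*--Example (illustrative): each pass removes a multiple of
*--2**L * (PI/2), so the new |R| is at most 2**L * (PI/4), whose
*--exponent is at most L-1 = K-28. For X = 2**100 the passes use at most
*--	K = 100, L = 73;  K <= 72, L <= 45;  K <= 44, L <= 17;
*--then K <= 16, which takes the LASTLOOP path with L = 0 (ENDFLAG set).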
1.1Smycroft
1.1SmycroftWORK:
1.1Smycroft*--FIND THE REMAINDER OF (R,r) W.R.T.	2**L * (PI/2). L IS SO CHOSEN
1.1Smycroft*--THAT	INT( X * (2/PI) / 2**(L) ) < 2**29.
1.1Smycroft
1.1Smycroft*--CREATE 2**(-L) * (2/PI), SIGN(INARG)*2**(63),
1.1Smycroft*--2**L * (PIby2_1), 2**L * (PIby2_2)
1.1Smycroft
1.1Smycroft	MOVE.L		#$00003FFE,D2	...BIASED EXPO OF 2/PI
1.1Smycroft	SUB.L		D0,D2		...BIASED EXPO OF 2**(-L)*(2/PI)
1.1Smycroft
1.1Smycroft	MOVE.L		#$A2F9836E,FP_SCR1+4(a6)
1.1Smycroft	MOVE.L		#$4E44152A,FP_SCR1+8(a6)
1.1Smycroft	MOVE.W		D2,FP_SCR1(a6)	...FP_SCR1 is 2**(-L)*(2/PI)
1.1Smycroft
1.1Smycroft	FMOVE.X		FP0,FP2
1.1Smycroft	FMUL.X		FP_SCR1(a6),FP2
1.1Smycroft*--WE MUST NOW FIND INT(FP2). SINCE WE NEED THIS VALUE IN
1.1Smycroft*--FLOATING POINT FORMAT, THE TWO FMOVE'S	FMOVE.L FP <--> N
1.1Smycroft*--WILL BE TOO INEFFICIENT. THE WAY AROUND IT IS THAT
1.1Smycroft*--(SIGN(INARG)*2**63	+	FP2) - SIGN(INARG)*2**63 WILL GIVE
1.1Smycroft*--US THE DESIRED VALUE IN FLOATING POINT.
1.1Smycroft
1.1Smycroft*--HIDE SIX CYCLES OF INSTRUCTION
1.1Smycroft        MOVE.L		A1,D2
1.1Smycroft        SWAP		D2
1.1Smycroft	ANDI.L		#$80000000,D2
1.1Smycroft	ORI.L		#$5F000000,D2	...D2 IS SIGN(INARG)*2**63 IN SGL
1.1Smycroft	MOVE.L		D2,TWOTO63(a6)
1.1Smycroft
1.1Smycroft	MOVE.L		D0,D2
1.1Smycroft	ADDI.L		#$00003FFF,D2	...BIASED EXPO OF 2**L * (PI/2)
1.1Smycroft
1.1Smycroft*--FP2 IS READY
1.1Smycroft	FADD.S		TWOTO63(a6),FP2	...THE FRACTIONAL PART OF FP2 IS ROUNDED
1.1Smycroft
1.1Smycroft*--HIDE 4 CYCLES OF INSTRUCTION; creating 2**(L)*Piby2_1  and  2**(L)*Piby2_2
1.1Smycroft        MOVE.W		D2,FP_SCR2(a6)
1.1Smycroft	CLR.W           FP_SCR2+2(a6)
1.1Smycroft	MOVE.L		#$C90FDAA2,FP_SCR2+4(a6)
1.1Smycroft	CLR.L		FP_SCR2+8(a6)		...FP_SCR2 is  2**(L) * Piby2_1
1.1Smycroft
1.1Smycroft*--FP2 IS READY
1.1Smycroft	FSUB.S		TWOTO63(a6),FP2		...FP2 is N
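*--Example (illustrative, assuming round-to-nearest and extended
*--rounding precision at this point): with a 64-bit mantissa, values in
*--[2**63, 2**64) are spaced exactly 1 apart. If FP2 = 5.7, then
*--2**63 + 5.7 rounds to 2**63 + 6, and subtracting 2**63 leaves
*--FP2 = 6.0. For negative INARG the constant is -2**63, so -5.7
*--becomes -6.0 the same way. Since |FP2| < 2**29, the sum always
*--stays in that unit-spaced range.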
1.1Smycroft
1.1Smycroft	ADDI.L		#$00003FDD,D0
1.1Smycroft        MOVE.W		D0,FP_SCR3(a6)
1.1Smycroft	CLR.W           FP_SCR3+2(a6)
1.1Smycroft	MOVE.L		#$85A308D3,FP_SCR3+4(a6)
1.1Smycroft	CLR.L		FP_SCR3+8(a6)		...FP_SCR3 is 2**(L) * Piby2_2
1.1Smycroft
1.1Smycroft	MOVE.L		ENDFLAG(a6),D0
1.1Smycroft
1.1Smycroft*--We are now ready to perform (R+r) - N*P1 - N*P2, P1 = 2**(L) * Piby2_1 and
1.1Smycroft*--P2 = 2**(L) * Piby2_2
1.1Smycroft	FMOVE.X		FP2,FP4
1.1Smycroft	FMul.X		FP_SCR2(a6),FP4		...W = N*P1
1.1Smycroft	FMove.X		FP2,FP5
1.1Smycroft	FMul.X		FP_SCR3(a6),FP5		...w = N*P2
1.1Smycroft	FMove.X		FP4,FP3
1.1Smycroft*--we want P+p = W+w  but  |p| <= half ulp of P
1.1Smycroft*--Then, we need to compute  A := R-P   and  a := r-p
1.1Smycroft	FAdd.X		FP5,FP3			...FP3 is P
1.1Smycroft	FSub.X		FP3,FP4			...W-P
1.1Smycroft
1.1Smycroft	FSub.X		FP3,FP0			...FP0 is A := R - P
1.1Smycroft        FAdd.X		FP5,FP4			...FP4 is p = (W-P)+w
1.1Smycroft
1.1Smycroft	FMove.X		FP0,FP3			...FP3 A
1.1Smycroft	FSub.X		FP4,FP1			...FP1 is a := r - p
1.1Smycroft
1.1Smycroft*--Now we need to normalize (A,a) to  "new (R,r)" where R+r = A+a but
1.1Smycroft*--|r| <= half ulp of R.
1.1Smycroft	FAdd.X		FP1,FP0			...FP0 is R := A+a
1.1Smycroft*--No need to calculate r if this is the last loop
1.2Smycroft	TST.L		D0
1.1Smycroft	BGT.W		RESTORE
1.1Smycroft
1.1Smycroft*--Need to calculate r
1.1Smycroft	FSub.X		FP0,FP3			...A-R
1.1Smycroft	FAdd.X		FP3,FP1			...FP1 is r := (A-R)+a
1.1Smycroft	BRA.W		LOOP
1.1Smycroft
1.1SmycroftRESTORE:
1.1Smycroft        FMOVE.L		FP2,N(a6)
1.1Smycroft	MOVE.L		(A7)+,D2
1.1Smycroft	FMOVEM.X	(A7)+,FP2-FP5
1.1Smycroft
1.1Smycroft
1.1Smycroft	MOVE.L		ADJN(a6),D0
1.1Smycroft	CMPI.L		#4,D0
1.1Smycroft
1.1Smycroft	BLT.W		SINCONT
1.1Smycroft	BRA.B		SCCONT
1.1Smycroft
1.1Smycroft	xdef	ssincosd
1.1Smycroftssincosd:
1.1Smycroft*--SIN AND COS OF X FOR DENORMALIZED X
1.1Smycroft
1.1Smycroft	FMOVE.S		#:3F800000,FP1
1.1Smycroft	bsr		sto_cos		;store cosine result
1.1Smycroft	bra		t_extdnrm
1.1Smycroft
1.1Smycroft	xdef	ssincos
1.1Smycroftssincos:
1.1Smycroft*--SET ADJN TO 4
1.1Smycroft	MOVE.L		#4,ADJN(a6)
1.1Smycroft
1.1Smycroft	FMOVE.X		(a0),FP0	...LOAD INPUT
1.1Smycroft
1.1Smycroft	MOVE.L		(A0),D0
1.1Smycroft	MOVE.W		4(A0),D0
1.1Smycroft	FMOVE.X		FP0,X(a6)
1.1Smycroft	ANDI.L		#$7FFFFFFF,D0		...COMPACTIFY X
1.1Smycroft
1.1Smycroft	CMPI.L		#$3FD78000,D0		...|X| >= 2**(-40)?
1.1Smycroft	BGE.B		SCOK1
1.1Smycroft	BRA.W		SCSM
1.1Smycroft
1.1SmycroftSCOK1:
1.1Smycroft	CMPI.L		#$4004BC7E,D0		...|X| < 15 PI?
1.1Smycroft	BLT.B		SCMAIN
1.1Smycroft	BRA.W		REDUCEX
1.1Smycroft
1.1Smycroft
1.1SmycroftSCMAIN:
1.1Smycroft*--THIS IS THE USUAL CASE, |X| < 15 PI.
1.1Smycroft*--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP.
1.1Smycroft	FMOVE.X		FP0,FP1
1.1Smycroft	FMUL.D		TWOBYPI,FP1	...X*2/PI
1.1Smycroft
1.1Smycroft*--HIDE THE NEXT THREE INSTRUCTIONS
1.1Smycroft	LEA		PITBL+$200,A1 ...TABLE OF N*PI/2, N = -32,...,32
1.1Smycroft
1.1Smycroft
1.1Smycroft*--FP1 IS NOW READY
1.1Smycroft	FMOVE.L		FP1,N(a6)		...CONVERT TO INTEGER
1.1Smycroft
1.1Smycroft	MOVE.L		N(a6),D0
1.1Smycroft	ASL.L		#4,D0
1.1Smycroft	ADDA.L		D0,A1		...ADDRESS OF N*PIBY2, IN Y1, Y2
1.1Smycroft
1.1Smycroft	FSUB.X		(A1)+,FP0	...X-Y1
1.1Smycroft        FSUB.S		(A1),FP0	...FP0 IS R = (X-Y1)-Y2
1.1Smycroft
1.1SmycroftSCCONT:
1.1Smycroft*--continuation point from REDUCEX
1.1Smycroft
1.1Smycroft*--HIDE THE NEXT TWO
1.1Smycroft	MOVE.L		N(a6),D0
1.1Smycroft	ROR.L		#1,D0
1.1Smycroft
1.2Smycroft	TST.L		D0		...D0 < 0 IFF N IS ODD
1.1Smycroft	BGE.W		NEVEN
1.1Smycroft
1.1SmycroftNODD:
1.1Smycroft*--REGISTERS SAVED SO FAR: D0, A0, FP2.
1.1Smycroft
1.1Smycroft	FMOVE.X		FP0,RPRIME(a6)
1.1Smycroft	FMUL.X		FP0,FP0	 ...FP0 IS S = R*R
1.1Smycroft	FMOVE.D		SINA7,FP1	...A7
1.1Smycroft	FMOVE.D		COSB8,FP2	...B8
1.1Smycroft	FMUL.X		FP0,FP1	 ...SA7
1.1Smycroft	MOVE.L		d2,-(A7)
1.1Smycroft	MOVE.L		D0,d2
1.1Smycroft	FMUL.X		FP0,FP2	 ...SB8
1.1Smycroft	ROR.L		#1,d2
1.1Smycroft	ANDI.L		#$80000000,d2
1.1Smycroft
1.1Smycroft	FADD.D		SINA6,FP1	...A6+SA7
1.1Smycroft	EOR.L		D0,d2
1.1Smycroft	ANDI.L		#$80000000,d2
1.1Smycroft	FADD.D		COSB7,FP2	...B7+SB8
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(A6+SA7)
1.1Smycroft	EOR.L		d2,RPRIME(a6)
1.1Smycroft	MOVE.L		(A7)+,d2
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(B7+SB8)
1.1Smycroft	ROR.L		#1,D0
1.1Smycroft	ANDI.L		#$80000000,D0
1.1Smycroft
1.1Smycroft	FADD.D		SINA5,FP1	...A5+S(A6+SA7)
1.1Smycroft	MOVE.L		#$3F800000,POSNEG1(a6)
1.1Smycroft	EOR.L		D0,POSNEG1(a6)
1.1Smycroft	FADD.D		COSB6,FP2	...B6+S(B7+SB8)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(A5+S(A6+SA7))
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(B6+S(B7+SB8))
1.1Smycroft	FMOVE.X		FP0,SPRIME(a6)
1.1Smycroft
1.1Smycroft	FADD.D		SINA4,FP1	...A4+S(A5+S(A6+SA7))
1.1Smycroft	EOR.L		D0,SPRIME(a6)
1.1Smycroft	FADD.D		COSB5,FP2	...B5+S(B6+S(B7+SB8))
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(A4+...)
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(B5+...)
1.1Smycroft
1.1Smycroft	FADD.D		SINA3,FP1	...A3+S(A4+...)
1.1Smycroft	FADD.D		COSB4,FP2	...B4+S(B5+...)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(A3+...)
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(B4+...)
1.1Smycroft
1.1Smycroft	FADD.X		SINA2,FP1	...A2+S(A3+...)
1.1Smycroft	FADD.X		COSB3,FP2	...B3+S(B4+...)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(A2+...)
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(B3+...)
1.1Smycroft
1.1Smycroft	FADD.X		SINA1,FP1	...A1+S(A2+...)
1.1Smycroft	FADD.X		COSB2,FP2	...B2+S(B3+...)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(A1+...)
1.1Smycroft	FMUL.X		FP2,FP0	 ...S(B2+...)
1.1Smycroft
1.1Smycroft
1.1Smycroft
1.1Smycroft	FMUL.X		RPRIME(a6),FP1	...R'S(A1+...)
1.1Smycroft	FADD.S		COSB1,FP0	...B1+S(B2...)
1.1Smycroft	FMUL.X		SPRIME(a6),FP0	...S'(B1+S(B2+...))
1.1Smycroft
1.1Smycroft	move.l		d1,-(sp)	;save users mode & precision
1.1Smycroft	andi.l		#$ff,d1		;mask off all exceptions
1.1Smycroft	fmove.l		d1,FPCR
1.1Smycroft	FADD.X		RPRIME(a6),FP1	...COS(X)
1.1Smycroft	bsr		sto_cos		;store cosine result
1.1Smycroft	FMOVE.L		(sp)+,FPCR	;restore users exceptions
1.1Smycroft	FADD.S		POSNEG1(a6),FP0	...SIN(X)
1.1Smycroft
1.1Smycroft	bra		t_frcinx
1.1Smycroft
1.1Smycroft
1.1SmycroftNEVEN:
1.1Smycroft*--REGISTERS SAVED SO FAR: FP2.
1.1Smycroft
1.1Smycroft	FMOVE.X		FP0,RPRIME(a6)
1.1Smycroft	FMUL.X		FP0,FP0	 ...FP0 IS S = R*R
1.1Smycroft	FMOVE.D		COSB8,FP1			...B8
1.1Smycroft	FMOVE.D		SINA7,FP2			...A7
1.1Smycroft	FMUL.X		FP0,FP1	 ...SB8
1.1Smycroft	FMOVE.X		FP0,SPRIME(a6)
1.1Smycroft	FMUL.X		FP0,FP2	 ...SA7
1.1Smycroft	ROR.L		#1,D0
1.1Smycroft	ANDI.L		#$80000000,D0
1.1Smycroft	FADD.D		COSB7,FP1	...B7+SB8
1.1Smycroft	FADD.D		SINA6,FP2	...A6+SA7
1.1Smycroft	EOR.L		D0,RPRIME(a6)
1.1Smycroft	EOR.L		D0,SPRIME(a6)
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(B7+SB8)
1.1Smycroft	ORI.L		#$3F800000,D0
1.1Smycroft	MOVE.L		D0,POSNEG1(a6)
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(A6+SA7)
1.1Smycroft
1.1Smycroft	FADD.D		COSB6,FP1	...B6+S(B7+SB8)
1.1Smycroft	FADD.D		SINA5,FP2	...A5+S(A6+SA7)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(B6+S(B7+SB8))
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(A5+S(A6+SA7))
1.1Smycroft
1.1Smycroft	FADD.D		COSB5,FP1	...B5+S(B6+S(B7+SB8))
1.1Smycroft	FADD.D		SINA4,FP2	...A4+S(A5+S(A6+SA7))
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(B5+...)
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(A4+...)
1.1Smycroft
1.1Smycroft	FADD.D		COSB4,FP1	...B4+S(B5+...)
1.1Smycroft	FADD.D		SINA3,FP2	...A3+S(A4+...)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(B4+...)
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(A3+...)
1.1Smycroft
1.1Smycroft	FADD.X		COSB3,FP1	...B3+S(B4+...)
1.1Smycroft	FADD.X		SINA2,FP2	...A2+S(A3+...)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(B3+...)
1.1Smycroft	FMUL.X		FP0,FP2	 ...S(A2+...)
1.1Smycroft
1.1Smycroft	FADD.X		COSB2,FP1	...B2+S(B3+...)
1.1Smycroft	FADD.X		SINA1,FP2	...A1+S(A2+...)
1.1Smycroft
1.1Smycroft	FMUL.X		FP0,FP1	 ...S(B2+...)
1.1Smycroft	FMUL.X		FP2,FP0	 ...S(A1+...)
1.1Smycroft
1.1Smycroft
1.1Smycroft
1.1Smycroft	FADD.S		COSB1,FP1	...B1+S(B2...)
1.1Smycroft	FMUL.X		RPRIME(a6),FP0	...R'S(A1+...)
1.1Smycroft	FMUL.X		SPRIME(a6),FP1	...S'(B1+S(B2+...))
1.1Smycroft
1.1Smycroft	move.l		d1,-(sp)	;save users mode & precision
1.1Smycroft	andi.l		#$ff,d1		;mask off all exceptions
1.1Smycroft	fmove.l		d1,FPCR
1.1Smycroft	FADD.S		POSNEG1(a6),FP1	...COS(X)
1.1Smycroft	bsr		sto_cos		;store cosine result
1.1Smycroft	FMOVE.L		(sp)+,FPCR	;restore users exceptions
1.1Smycroft	FADD.X		RPRIME(a6),FP0	...SIN(X)
1.1Smycroft
1.1Smycroft	bra		t_frcinx
1.1Smycroft
1.1SmycroftSCBORS:
1.1Smycroft	CMPI.L		#$3FFF8000,D0
1.1Smycroft	BGT.W		REDUCEX
1.1Smycroft
1.1Smycroft
1.1SmycroftSCSM:
1.2Smycroft	CLR.W		XDCARE(a6)
1.1Smycroft	FMOVE.S		#:3F800000,FP1
1.1Smycroft
1.1Smycroft	move.l		d1,-(sp)	;save users mode & precision
1.1Smycroft	andi.l		#$ff,d1		;mask off all exceptions
1.1Smycroft	fmove.l		d1,FPCR
1.1Smycroft	FSUB.S		#:00800000,FP1
1.1Smycroft	bsr		sto_cos		;store cosine result
1.1Smycroft	FMOVE.L		(sp)+,FPCR	;restore users exceptions
1.1Smycroft	FMOVE.X		X(a6),FP0
1.1Smycroft	bra		t_frcinx
1.1Smycroft
1.1Smycroft	end