x_unfl.sa revision 1.1
1* MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP 2* M68000 Hi-Performance Microprocessor Division 3* M68040 Software Package 4* 5* M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc. 6* All rights reserved. 7* 8* THE SOFTWARE is provided on an "AS IS" basis and without warranty. 9* To the maximum extent permitted by applicable law, 10* MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED, 11* INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A 12* PARTICULAR PURPOSE and any warranty against infringement with 13* regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF) 14* and any accompanying written materials. 15* 16* To the maximum extent permitted by applicable law, 17* IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER 18* (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS 19* PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR 20* OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE 21* SOFTWARE. Motorola assumes no responsibility for the maintenance 22* and support of the SOFTWARE. 23* 24* You are hereby granted a copyright license to use, modify, and 25* distribute the SOFTWARE so long as this entire notice is retained 26* without alteration in any modified and/or redistributed versions, 27* and that such modified versions are clearly identified as such. 28* No licenses are granted by implication, estoppel or otherwise 29* under any patents or trademarks of Motorola, Inc. 30 31* 32* x_unfl.sa 3.4 7/1/91 33* 34* fpsp_unfl --- FPSP handler for underflow exception 35* 36* Trap disabled results 37* For 881/2 compatibility, sw must denormalize the intermediate 38* result, then store the result. Denormalization is accomplished 39* by taking the intermediate result (which is always normalized) and 40* shifting the mantissa right while incrementing the exponent until 41* it is equal to the denormalized exponent for the destination 42* format. After denormalizatoin, the result is rounded to the 43* destination format. 44* 45* Trap enabled results 46* All trap disabled code applies. In addition the exceptional 47* operand needs to made available to the user with a bias of $6000 48* added to the exponent. 49* 50 51X_UNFL IDNT 2,1 Motorola 040 Floating Point Software Package 52 53 section 8 54 55 include fpsp.h 56 57 xref denorm 58 xref round 59 xref store 60 xref g_rndpr 61 xref g_opcls 62 xref g_dfmtou 63 xref real_unfl 64 xref real_inex 65 xref fpsp_done 66 xref b1238_fix 67 68 xdef fpsp_unfl 69fpsp_unfl: 70 link a6,#-LOCAL_SIZE 71 fsave -(a7) 72 movem.l d0-d1/a0-a1,USER_DA(a6) 73 fmovem.x fp0-fp3,USER_FP0(a6) 74 fmovem.l fpcr/fpsr/fpiar,USER_FPCR(a6) 75 76* 77 bsr.l unf_res ;denormalize, round & store interm op 78* 79* If underflow exceptions are not enabled, check for inexact 80* exception 81* 82 btst.b #unfl_bit,FPCR_ENABLE(a6) 83 beq.b ck_inex 84 85 btst.b #E3,E_BYTE(a6) 86 beq.b no_e3_1 87* 88* Clear dirty bit on dest resister in the frame before branching 89* to b1238_fix. 90* 91 bfextu CMDREG3B(a6){6:3},d0 ;get dest reg no 92 bclr.b d0,FPR_DIRTY_BITS(a6) ;clr dest dirty bit 93 bsr.l b1238_fix ;test for bug1238 case 94 move.l USER_FPSR(a6),FPSR_SHADOW(a6) 95 or.l #sx_mask,E_BYTE(a6) 96no_e3_1: 97 movem.l USER_DA(a6),d0-d1/a0-a1 98 fmovem.x USER_FP0(a6),fp0-fp3 99 fmovem.l USER_FPCR(a6),fpcr/fpsr/fpiar 100 frestore (a7)+ 101 unlk a6 102 bra.l real_unfl 103* 104* It is possible to have either inex2 or inex1 exceptions with the 105* unfl. If the inex enable bit is set in the FPCR, and either 106* inex2 or inex1 occured, we must clean up and branch to the 107* real inex handler. 108* 109ck_inex: 110 move.b FPCR_ENABLE(a6),d0 111 and.b FPSR_EXCEPT(a6),d0 112 andi.b #$3,d0 113 beq.b unfl_done 114 115* 116* Inexact enabled and reported, and we must take an inexact exception 117* 118take_inex: 119 btst.b #E3,E_BYTE(a6) 120 beq.b no_e3_2 121* 122* Clear dirty bit on dest resister in the frame before branching 123* to b1238_fix. 124* 125 bfextu CMDREG3B(a6){6:3},d0 ;get dest reg no 126 bclr.b d0,FPR_DIRTY_BITS(a6) ;clr dest dirty bit 127 bsr.l b1238_fix ;test for bug1238 case 128 move.l USER_FPSR(a6),FPSR_SHADOW(a6) 129 or.l #sx_mask,E_BYTE(a6) 130no_e3_2: 131 move.b #INEX_VEC,EXC_VEC+1(a6) 132 movem.l USER_DA(a6),d0-d1/a0-a1 133 fmovem.x USER_FP0(a6),fp0-fp3 134 fmovem.l USER_FPCR(a6),fpcr/fpsr/fpiar 135 frestore (a7)+ 136 unlk a6 137 bra.l real_inex 138 139unfl_done: 140 bclr.b #E3,E_BYTE(a6) 141 beq.b e1_set ;if set then branch 142* 143* Clear dirty bit on dest resister in the frame before branching 144* to b1238_fix. 145* 146 bfextu CMDREG3B(a6){6:3},d0 ;get dest reg no 147 bclr.b d0,FPR_DIRTY_BITS(a6) ;clr dest dirty bit 148 bsr.l b1238_fix ;test for bug1238 case 149 move.l USER_FPSR(a6),FPSR_SHADOW(a6) 150 or.l #sx_mask,E_BYTE(a6) 151 movem.l USER_DA(a6),d0-d1/a0-a1 152 fmovem.x USER_FP0(a6),fp0-fp3 153 fmovem.l USER_FPCR(a6),fpcr/fpsr/fpiar 154 frestore (a7)+ 155 unlk a6 156 bra.l fpsp_done 157e1_set: 158 movem.l USER_DA(a6),d0-d1/a0-a1 159 fmovem.x USER_FP0(a6),fp0-fp3 160 fmovem.l USER_FPCR(a6),fpcr/fpsr/fpiar 161 unlk a6 162 bra.l fpsp_done 163* 164* unf_res --- underflow result calculation 165* 166unf_res: 167 bsr.l g_rndpr ;returns RND_PREC in d0 0=ext, 168* ;1=sgl, 2=dbl 169* ;we need the RND_PREC in the 170* ;upper word for round 171 move.w #0,-(a7) 172 move.w d0,-(a7) ;copy RND_PREC to stack 173* 174* 175* If the exception bit set is E3, the exceptional operand from the 176* fpu is in WBTEMP; else it is in FPTEMP. 177* 178 btst.b #E3,E_BYTE(a6) 179 beq.b unf_E1 180unf_E3: 181 lea WBTEMP(a6),a0 ;a0 now points to operand 182* 183* Test for fsgldiv and fsglmul. If the inst was one of these, then 184* force the precision to extended for the denorm routine. Use 185* the user's precision for the round routine. 186* 187 move.w CMDREG3B(a6),d1 ;check for fsgldiv or fsglmul 188 andi.w #$7f,d1 189 cmpi.w #$30,d1 ;check for sgldiv 190 beq.b unf_sgl 191 cmpi.w #$33,d1 ;check for sglmul 192 bne.b unf_cont ;if not, use fpcr prec in round 193unf_sgl: 194 clr.l d0 195 move.w #$1,(a7) ;override g_rndpr precision 196* ;force single 197 bra.b unf_cont 198unf_E1: 199 lea FPTEMP(a6),a0 ;a0 now points to operand 200unf_cont: 201 bclr.b #sign_bit,LOCAL_EX(a0) ;clear sign bit 202 sne LOCAL_SGN(a0) ;store sign 203 204 bsr.l denorm ;returns denorm, a0 points to it 205* 206* WARNING: 207* ;d0 has guard,round sticky bit 208* ;make sure that it is not corrupted 209* ;before it reaches the round subroutine 210* ;also ensure that a0 isn't corrupted 211 212* 213* Set up d1 for round subroutine d1 contains the PREC/MODE 214* information respectively on upper/lower register halves. 215* 216 bfextu FPCR_MODE(a6){2:2},d1 ;get mode from FPCR 217* ;mode in lower d1 218 add.l (a7)+,d1 ;merge PREC/MODE 219* 220* WARNING: a0 and d0 are assumed to be intact between the denorm and 221* round subroutines. All code between these two subroutines 222* must not corrupt a0 and d0. 223* 224* 225* Perform Round 226* Input: a0 points to input operand 227* d0{31:29} has guard, round, sticky 228* d1{01:00} has rounding mode 229* d1{17:16} has rounding precision 230* Output: a0 points to rounded operand 231* 232 233 bsr.l round ;returns rounded denorm at (a0) 234* 235* Differentiate between store to memory vs. store to register 236* 237unf_store: 238 bsr.l g_opcls ;returns opclass in d0{2:0} 239 cmpi.b #$3,d0 240 bne.b not_opc011 241* 242* At this point, a store to memory is pending 243* 244opc011: 245 bsr.l g_dfmtou 246 tst.b d0 247 beq.b ext_opc011 ;If extended, do not subtract 248* ;If destination format is sgl/dbl, 249 tst.b LOCAL_HI(a0) ;If rounded result is normal,don't 250* ;subtract 251 bmi.b ext_opc011 252 subq.w #1,LOCAL_EX(a0) ;account for denorm bias vs. 253* ;normalized bias 254* ; normalized denormalized 255* ;single $7f $7e 256* ;double $3ff $3fe 257* 258ext_opc011: 259 bsr.l store ;stores to memory 260 bra.b unf_done ;finish up 261 262* 263* At this point, a store to a float register is pending 264* 265not_opc011: 266 bsr.l store ;stores to float register 267* ;a0 is not corrupted on a store to a 268* ;float register. 269* 270* Set the condition codes according to result 271* 272 tst.l LOCAL_HI(a0) ;check upper mantissa 273 bne.b ck_sgn 274 tst.l LOCAL_LO(a0) ;check lower mantissa 275 bne.b ck_sgn 276 bset.b #z_bit,FPSR_CC(a6) ;set condition codes if zero 277ck_sgn: 278 btst.b #sign_bit,LOCAL_EX(a0) ;check the sign bit 279 beq.b unf_done 280 bset.b #neg_bit,FPSR_CC(a6) 281 282* 283* Finish. 284* 285unf_done: 286 btst.b #inex2_bit,FPSR_EXCEPT(a6) 287 beq.b no_aunfl 288 bset.b #aunfl_bit,FPSR_AEXCEPT(a6) 289no_aunfl: 290 rts 291 292 end 293