SPARC V8 Appendix N: troff source again

Tue Jul 31 12:05:23 PDT 1990

The principal content changes are: rs2 NaNs have precedence over rs1 NaNs
instead of vice versa, in order to be uniform with monadic operators whose
operand is rs2; and the NS bit of FSR always reads 0 if you don't implement
a different nonstandard mode.
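
For anyone checking test vectors against the new precedence rule, here is
a minimal C sketch of the untrapped NaN result for a two-operand,
single-precision operation, following the table in the "NaN operand and
result definitions" section of the enclosed file. It works on raw bit
patterns so that the host FPU's own NaN handling does not interfere. The
names nan_result, is_nan, and is_snan are illustrative only and appear in
no SPARC document.

```c
#include <assert.h>
#include <stdint.h>

#define EXP_MASK   0x7F800000u  /* single-precision exponent field */
#define FRAC_MASK  0x007FFFFFu  /* single-precision fraction field */
#define QUIET_BIT  0x00400000u  /* most significant fraction bit   */

static int is_nan(uint32_t x)
{
    return (x & EXP_MASK) == EXP_MASK && (x & FRAC_MASK) != 0;
}

static int is_snan(uint32_t x)
{
    return is_nan(x) && (x & QUIET_BIT) == 0;
}

/*
 * Hypothetical helper: untrapped result when at least one operand is a
 * NaN.  *invalid is set to 1 if the invalid exception is signaled.
 */
static uint32_t nan_result(uint32_t rs1, uint32_t rs2, int *invalid)
{
    assert(is_nan(rs1) || is_nan(rs2));
    *invalid = is_snan(rs1) || is_snan(rs2);

    if (is_snan(rs2)) return rs2 | QUIET_BIT;  /* QSNaN2 */
    if (is_snan(rs1)) return rs1 | QUIET_BIT;  /* QSNaN1 */
    if (is_nan(rs2))  return rs2;              /* QNaN2  */
    return rs1;                                /* QNaN1  */
}
```

A signaling NaN takes precedence over a quiet one regardless of register;
between two NaNs of the same kind, rs2 wins.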

# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by validgh!dgh on Tue Jul 31 12:04:27 PDT 1990
# Contents:  app.n.ieee_fp.t

echo x - app.n.ieee_fp.t
sed 's/^a//' > "app.n.ieee_fp.t" <<'a//E*O*F app.n.ieee_fp.t//'
a.\"	a(#)	90/07/31 	app.n.ieee_fp.t 	1.2	(c) 1990 Sun Microsystems, Inc
a.\"============================================================================
a.\"   tbl app.n.ieee_fp.t  | eqn | troff -ms
a.\" --------------------------------------------------------
a.if !\nX=0  \{\
a.	nr aP 1
a.	nr H1 13\"...N...
a.	af H1 A
a.\}
a.\" --------------------------------------------------------
a.de p
a.LP
a..
a.\" --------------------------------------------------------
a.H A "SPARC IEEE 754 Implementation Recommendations"
a.LP
a.PL RIGHT
a.if n \{\
a.ll 7.8i
a.pl 99i
a.nh
a.na
a.nr LL 7.8i
a.de PT
a.\"
a..
a.de BT
a.\"
a..
a.\}
a.LP
A number of details in IEEE 754 are left to be defined by implementations,
and so are undefined in this document.
In order to promote better portability among new SPARC
implementations of programs such as instruction set test vectors,
the following recommendations eliminate many uncertainties,
especially with regard to exceptional situations.
These recommendations,
perhaps modified slightly in the light of subsequent experience,
are intended to be incorporated as requirements in a future SPARC revision.
a.\" --------------------------------------------------------
a.H 2 "Unaligned floating-point data registers"
a.LP
The effect of executing an instruction that refers to an unaligned 
floating-point register operand
a.IX "f registers" "" "\fIf\fP registers"
a.IX "registers" "f registers" "" "\fIf\fP registers"
(double-precision operand in a register not 0 mod 2,
or quadruple-precision operand in a register not 0 mod 4)
is undefined in Section 4.3.
a.\""" Such instructions are generated by software errors that are
a.\""" difficult to detect.
An illegal_instruction trap occurs in this case.
a.IX "exceptions" "illegal_instruction"
a.IX "illegal_instruction exception"
a.\" --------------------------------------------------------
a.H 2 "Reading an empty FQ"
a.LP
The effect of reading an empty floating-point queue is not specified in
Chapter 4.
A trap handler which attempts to read such a queue contains a software error.
A sequence_error fp_exception trap occurs in this case.
a.\" --------------------------------------------------------
a.H 2 "Traps inhibit results"
a.LP
To summarize what happens when a floating-point trap occurs,
as described in Section 4.4 and elsewhere:
a.IP \(bu 3
the destination \fIf\fP register is unchanged
a.IX "f registers" "" "\fIf\fP registers"
a.IX "registers" "f registers" "" "\fIf\fP registers"
a.IP \(bu 3
the FSR \fIfcc\fP (floating-point condition codes) field is unchanged
a.IX "registers" "floating-point state register (FSR)"
a.IX "floating-point state register (FSR)"
a.IP \(bu 3
the FSR \fIaexc\fP (accrued exceptions) field is unchanged
a.IX "accrued exception (\fIaexc\fP) field of FSR register"
a.IP \(bu 3
the FSR \fIcexc\fP (current exceptions) field is unchanged except for
a.IX "current exception (\fIcexc\fP) field of FSR register"
IEEE_754_exception;
a.IX "IEEE_754_exception floating-point trap type"
a.IX "floating-point trap types" "IEEE_754_exception"
in that case, \fIcexc\fP contains exactly one bit which is 1,
corresponding to the exception that caused the trap
a.LP
These restrictions are designed to ensure a consistent state for user software.
Instructions causing an fp_exception trap
a.IX "exceptions" "fp_exception"
a.IX "fp_exception exception"
due to unfinished or unimplemented FPops execute as if by hardware;
a.IX "unfinished_FPop floating-point trap type"
a.IX "floating-point trap types" "unfinished_FPop"
a.IX "unimplemented_FPop floating-point trap type"
a.IX "floating-point trap types" "unimplemented_FPop"
that a hardware trap was taken to supervisor software
is undetectable by user software except possibly by timing considerations.
A user-mode trap handler invoked for an IEEE 754 exception,
whether as a direct result of a hardware IEEE_754_exception
or as an indirect result of supervisor handling of an unfinished_FPop
a.IX "unfinished_FPop floating-point trap type"
a.IX "floating-point trap types" "unfinished_FPop"
or unimplemented_FPop,
a.IX "unimplemented_FPop floating-point trap type"
a.IX "floating-point trap types" "unimplemented_FPop"
may rely on the following:
a.IP \(bu 3
supervisor software will pass it the address of the instruction which caused
the exception, 
extracted from a deferred trap queue or elsewhere
a.IP \(bu 3
the destination \fIf\fP register is unchanged from its state prior to that
instruction's execution
a.IP \(bu 3
the FSR \fIfcc\fP is unchanged
a.IX "floating-point condition codes (\fIfcc\fP) field of FSR register"
a.IP \(bu 3
the FSR \fIaexc\fP is unchanged
a.IX "accrued exception (\fIaexc\fP) field of FSR register"
a.IP \(bu 3
the FSR \fIcexc\fP contains one bit on for the exception that caused the trap
a.IX "current exception (\fIcexc\fP) field of FSR register"
a.IP \(bu 3
the FSR \fIftt\fP, \fIqne\fP, \fIu\fP, and \fIres\fP fields are zero
a.IX "floating-point trap type (\fIftt\fP) field of FSR register"
a.IX "queue not empty (\fIqne\fP) bit of FSR register"
a.LP
Supervisor software is responsible for enforcing these requirements
if the hardware trap mechanism does not.
a.\" --------------------------------------------------------
a.H 2 "NaN operand and result definitions"
a.LP
An untrapped floating-point result can be in a format which is either 
the same as, or different from, the format of the source operands.
These two cases are described separately, below.
a.H 3 "Untrapped floating-point result in different format from operands"
a.LP
a.IX "convert between floating-point formats instructions" ""
a.IX "FsTOd instruction"
a.IX "FsTOq instruction"
a.IX "FdTOs instruction"
a.IX "FdTOq instruction"
a.IX "FqTOs instruction"
a.IX "FqTOd instruction"
a.IX "instructions" "convert between floating-point formats"
a.IX "instructions" "FsTOd"
a.IX "instructions" "FsTOq"
a.IX "instructions" "FdTOs"
a.IX "instructions" "FdTOq"
a.IX "instructions" "FqTOs"
a.IX "instructions" "FqTOd"
F[sdq]TO[sdq], with quiet NaN operand:
a.IX "quiet NaN (not-a-number)"
a.IX "NaN (not-a-number)" "quiet"
no exception caused;
result is a quiet NaN.
The operand is transformed as follows:
a.br
a.KS
a.IP
a.B NaN
a.IX "NaN (not-a-number)"
transformation:
The most significant bits of the operand
fraction are copied to the most significant bits of the result fraction.
When converting to a narrower format,
excess lower order bits of the operand fraction are discarded.
When converting to a wider format,
excess lower order bits of the result fraction are set to 0.
The quiet bit (most significant bit of the result fraction) is always set to 1,
so the NaN transformation always produces a quiet NaN.
a.KE
a.LP
F[sdq]TO[sdq], signaling NaN operand:
a.IX "signaling NaN (not-a-number)"
a.IX "NaN (not-a-number)" "signaling"
invalid exception, result is the signaling NaN operand processed by the 
a.B NaN
transformation above to produce a quiet NaN.
a.LP
a.IX "floating-point compare instructions" 
a.IX "FCMPEs instruction" 
a.IX "FCMPEd instruction" 
a.IX "FCMPEq instruction" 
a.IX "instructions" "floating-point compare"
a.IX "instructions" "FCMPEs"
a.IX "instructions" "FCMPEd"
a.IX "instructions" "FCMPEq"
FCMPE[sdq] with any NaN operand:
invalid exception, unordered \fIfcc\fP.
a.LP
a.IX "floating-point compare instructions" 
a.IX "FCMPs instruction" 
a.IX "FCMPd instruction" 
a.IX "FCMPq instruction" 
a.IX "instructions" "floating-point compare"
a.IX "instructions" "FCMPs"
a.IX "instructions" "FCMPd"
a.IX "instructions" "FCMPq"
FCMP[sdq] with any signaling NaN operand:
a.IX "signaling NaN (not-a-number)"
a.IX "NaN (not-a-number)" "signaling"
invalid exception, unordered \fIfcc\fP.
a.LP
FCMP[sdq] with any quiet NaN operand but no signaling NaN operand:
no exception, unordered \fIfcc\fP.
a.\" --------------------------------------------------------
a.H 3 "Untrapped floating-point result in same format as operands"
a.LP
No NaN operand:
for an invalid exception such as sqrt(\-1.0)  or  0.0\|\(di\|0.0, 
the result is the quiet NaN with
sign = 0, exponent = all 1's, and fraction = all 1's.
The sign is 0 to distinguish such results from storage initialized to
all `1' bits.
a.LP
One operand, quiet NaN:
a.IX "quiet NaN (not-a-number)"
a.IX "NaN (not-a-number)" "quiet"
no exception, result is the quiet NaN operand.
a.LP
One operand, signaling NaN: invalid exception, result is that 
a.IX "signaling NaN (not-a-number)"
a.IX "NaN (not-a-number)" "signaling"
signaling NaN with the quiet bit
(most significant bit of fraction field) set to 1.
a.LP
Two operands, both quiet:
no exception, result is the rs2 (second source) operand.
a.LP
Two operands, both signaling:
invalid exception, result is the rs2 operand with the quiet bit set to 1.
a.LP
Two operands, just one a signaling NaN:
invalid exception, result is the signaling
NaN operand with the quiet bit set to 1.
a.LP
Two operands, neither signaling NaN, just one a quiet NaN:
no exception, result is the quiet NaN operand. 
a.LP
In the following tabular representation of the untrapped results, 
NaN\fIn\fP means the NaN in rs\fIn\fP, Q means quiet, S signaling:
a.in +0.5i
a.TS
c	c     |	c	s	s
c	c     |	c	s	s
c	c     |	c	c	c
c     |	c     |	c	c	c
c     |	c     |	c	c	c.
		rs2 operand
		_
		number  	QNaN2	SNaN2
_
	none	IEEE 754	QNaN2	QSNaN2
rs1	number	IEEE 754	QNaN2	QSNaN2
operand	QNaN1	 QNaN1  	QNaN2	QSNaN2	
	SNaN1	 QSNaN1 	QSNaN1	QSNaN2
a.TE
a.in -0.5i
QSNaN\fIn\fP means a quiet NaN produced by the \fBNaN\fP transformation
on a signaling NaN from rs\fIn\fP;
the invalid exception is always signaled.
The QNaN\fIn\fP results in the table never generate an exception,
but IEEE 754 specifies a number of cases of invalid exceptions
and QNaN results from operands that are both numbers.
a.\" --------------------------------------------------------
a.H 2 "Trapped Underflow definition (UFM=1)"
a.LP
a.IX "underflow mask (\fIUFM\fP) bit of TEM field of FSR register"
Underflow occurs if the correct unrounded result has magnitude between zero
and the smallest normalized number in the destination format.
In terms of IEEE 754, this means \*Qtininess detected before rounding\*U.
a.LP
Note that the wrapped exponent results intended to be delivered on 
trapped underflows and overflows in IEEE 754 aren't relevant to SPARC
at the hardware/supervisor levels; if they are created at all, it is
by user software in a user-mode trap handler.
a.IX "user-mode trap handler"
a.IX "trap handler" "user-mode"
a.\" --------------------------------------------------------
a.H 2 "Untrapped underflow definition (UFM=0)"
a.LP
a.IX "underflow mask (\fIUFM\fP) bit of TEM field of FSR register"
Underflow occurs if the correct unrounded result has magnitude between zero
and the smallest normalized number in the destination format,
\fBand\fP the correctly rounded result in the destination format is inexact;
that result may be zero, subnormal, or the smallest normalized number.
In terms of IEEE 754, this means \*Qtininess detected before rounding\*U and
\*Qloss of accuracy detected as inexact\*U.
a.IX "inexact mask (\fINXM\fP) bit of TEM field of FSR register"
a.LP
Note that floating-point overflow is defined to be detected 
a.B after
rounding;
the foregoing underflow definition simplifies hardware implementation and test.
a.LP
The following table summarizes what happens when an exact \fBunrounded\fP value
a.I u
satisfying
a.IP "" 10
a.sp -1.5v
\fI0 \(<= \fP|\|\fIu\fP\|| \(<= \fIsmallest normalized number\fP
a.LP
a.sp -0.5v
would round, if no trap intervened, to a \fBrounded\fP value 
a.I r
which might be zero, subnormal, or smallest normalized value. 
\*QUF\*U means underflow trap (with ufc set in \fIcexc\fP),
a.IX "underflow current (\fIufc\fP) bit of \fIcexc\fP field of FSR register"
\*QNX\*U means inexact trap (with nxc set in \fIcexc\fP), 
a.IX "inexact current (nxc) bit of \fIcexc\fP field of FSR register"
\*Quf\*U means untrapped underflow exception
(ufc set in \fIcexc\fP and ufa in \fIaexc\fP), and
a.IX "underflow current (\fIufc\fP) bit of \fIcexc\fP field of FSR register"
a.IX "underflow accrued (\fIufa\fP) bit of \fIaexc\fP field of FSR register"
\*Qnx\*U means untrapped inexact exception
(nxc set in \fIcexc\fP and nxa in \fIaexc\fP).
a.IX "inexact current (nxc) bit of cexc field of FSR register"
a.IX "inexact accrued (nxa) bit of aexc field of FSR register"
a.KS
a.TS
l     |	l	r	r	r	r .
	underflow trap	UFM=1	UFM=0	UFM=0
	inexact trap	NXM=?	NXM=1	NXM=0
_
          	\fIr\fP is minimum normal	none	none	none
\fIu = r\fP	\fIr\fP is subnormal         	UF	none	none
          	\fIr\fP is zero          	none	none	none
_
              	\fIr\fP is minimum normal	UF	NX	uf nx
\fIu \(!= r\fP	\fIr\fP is subnormal    	UF	NX	uf nx
              	\fIr\fP is zero         	UF	NX	uf nx
a.TE
a.KE
a.\" --------------------------------------------------------
a.H 2 "Integer overflow definition"
a.LP
a.IX "convert floating-point to integer instructions"
a.IX "FsTOi instruction"
a.IX "FdTOi instruction"
a.IX "FqTOi instruction"
a.IX "instructions" "convert floating-point to integer"
a.IX "instructions" "FsTOi"
a.IX "instructions" "FdTOi"
a.IX "instructions" "FqTOi"
F[sdq]TOi:
when a NaN,
a.IX "NaN (not-a-number)"
infinity,
a.IX "infinity"
a.IX "negative infinity"
a.IX "positive infinity"
large positive argument \(>= 2147483648.0,
or large negative argument \(<= \-2147483649.0,
is converted to integer, the exception is invalid.
If no trap occurs and the sign bit of the operand is positive (is 0),
the numerical result is 2147483647.
If no trap occurs and the sign bit of the operand is negative (is 1),
the numerical result is \-2147483648.
a.\" --------------------------------------------------------
a.SH
a.H 2 "Nonstandard mode"
a.LP
SPARC implementations are permitted but not encouraged to deviate
from SPARC requirements when the nonstandard mode bit of the FSR is 1.
a.IX "nonstandard floating-point (NS) field of FSR register"
Some implementations use that bit to provide alternative handling of
subnormal floating-point operands and results that avoids unfinished_FPop traps
a.IX "unfinished_FPop floating-point trap type"
a.IX "floating-point trap types" "unfinished_FPop"
with consequent poor performance on programs that underflow frequently.
a.LP
Such traps could be avoided by proper system design.
Cache misses in the CPU cause holds in the FPU,
a.IX "cache" "misses and floating-point performance"
so that extra cycles are available to refill the cache;
such misses are invisible to software and do not degrade
performance in the normal cache-hit case.
Similarly \*Qsubnormal misses\*U in the FPU can be avoided by any of
several better implementation techniques that avoid causing an
unfinished_FPop trap or degrading performance in the normal case.
One way is to cause subnormal misses in the FPU to hold the CPU,
so that operand or result alignment can take a few extra cycles
without any other effect on software.
Another way to avoid extra cycles is to provide
extra normalization hardware for operands and results.
a.LP
So the best implementation of nonstandard mode is a no-op:
nonstandard mode runs identically to the standard SPARC mode.
Such implementations identify themselves in the NS bit of the FSR,
which always reads 0, even after a 1 is written.
a.LP
Second best is to implement nonstandard mode for subnormal operands
and results as outlined below so that implementations behave uniformly:
a.RS
a.IP "Subnormal operands"
in nonstandard mode are replaced by zeros with the same sign.
An inexact exception always arises if no other exception would,
and so traps if NXM=1.
a.IP "Untrapped subnormal results"
in nonstandard mode are replaced by zeros with the same sign.
Underflow and inexact exceptions always arise.
In terms of the previous table:
a.KS
a.TS
center ;
l | l r r r r .
	underflow trap	UFM=1	UFM=0	UFM=0
	inexact trap	NXM=?	NXM=1	NXM=0
_
\fIu = r\fP	\fIr\fP is minimum normal	none	none	none
\^	\fIr\fP is zero          	none	none	none
_
\fIu \(!= r\fP	\fIr\fP is minimum normal	UF	NX	uf nx
\^	\fIr\fP is zero         	UF	NX	uf nx
a.TE
a.KE
a.RE
a//E*O*F app.n.ieee_fp.t//
chmod u=r,g=r,o=r app.n.ieee_fp.t

exit 0
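
The rule in the "Integer overflow definition" section of the archived
file can be sketched in C as follows, for the double-precision case
(FdTOi) with no trap taken. This is only an illustration: the name
fdtoi_untrapped is made up, and the sketch assumes that in-range operands
are truncated toward zero, which is what a C cast does.

```c
#include <math.h>
#include <stdint.h>

/*
 * Hypothetical model of an untrapped FdTOi.  *invalid is set to 1 when
 * the invalid exception is signaled.
 */
static int32_t fdtoi_untrapped(double x, int *invalid)
{
    if (isnan(x) || isinf(x) ||
        x >= 2147483648.0 || x <= -2147483649.0) {
        *invalid = 1;
        /* The result depends only on the operand's sign bit. */
        return signbit(x) ? INT32_MIN : INT32_MAX;
    }
    *invalid = 0;
    return (int32_t)x;  /* in range: C conversion truncates toward zero */
}
```

Operands strictly between -2147483649.0 and -2147483648.0 are not
overflows: they truncate to -2147483648, which is representable.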