[Cfp-interest 2849] Re: CFP review of TS-4 and TS-5 revisions
David Hough CFP
pcfp at oakapple.net
Thu Aug 24 16:39:40 PDT 2023
Here's some text on a scaled reduction example.
It might be a reference document somewhere, but it seems excessive for
an example in a TS. It, or part of it, would go in section 6 of TS-4.
EXAMPLE
The scaled reduction functions support computing quantities of modest
magnitudes whose intermediate results might well overflow and underflow.
The 1985 754 standard defined hardware machine traps with wrapped exponents
to deal with these computations, but that facility could not be
accessed directly in portable software, and so was replaced in the 2008
revision by scaled reduction operations.
https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/background/traps-and-wraps.txt
One example is the computation of Clebsch-Gordan coefficients
https://en.wikipedia.org/wiki/Table_of_Clebsch%E2%80%93Gordan_coefficients
or Wigner 3-j symbols,
https://en.wikipedia.org/wiki/3-j_symbol
used to add angular momentum in quantum mechanics.
Expressions for these quantities involve quotients of products of factorials,
and so are prone to intermediate overflow. One could use
logb and scalbn to separate the intermediate exponent from the intermediate
magnitude, however, adding exponents separately from multiplying magnitudes,
until the end, when the final exponent is applied to the final magnitude
with scalb. And that's one possible implementation of the scaled
reduction operations.
As a simplified illustration, consider computing n1!n2!/n3! -
int n1, n2, n3 ;
int i ;
double num1, num2, den1, quot ;
# hazardous
num1 = 1 ;
for ( i := 2 ; i <= n1 ; i++ ) {
num1 *= i ; }
num2 = 1 ;
for ( i := 2 ; i <= n2 ; i++ ) {
num2 *= i ; }
den1 = 1 ;
for ( i := 2 ; i <= n3 ; i++ ) {
den1 *= i ; }
quot = num1 * num2 / den1 ;
# scaled with logb/scalb
int num1e, num2e, den1e, lin ;
num1 = 1 ; num1e = 0 ;
for ( i := 2 ; i <= n1 ; i++ ) {
lin = logb(double(i)) ; num1e += lin ; num1 *= scalb((double(i),-lin) }
num2 = 1 ; num2e = 0 ;
for ( i := 2 ; i <= n2 ; i++ ) {
lin = logb(double(i)) ; num2e += lin ; num2 *= scalb((double(i),-lin) }
den1 = 1 ; den1e = 0 ;
for ( i := 2 ; i <= n3 ; i++ ) {
lin = logb(double(i)) ; den1e += lin ; den1 *= scalb((double(i),-lin) }
quot = scalb( num1 * num2 / den1, num1e + num2e - den1e );
# scaled with scaled_prod
double num1[n1}, num2[n2}, den1[n3] ;
long long num1e, num2e, den1e ;
for ( i := 2 ; i <= n1 ; i++ ) {
num1[i-2] = i ; }
scaled_prod( n1-1, num1, &num1e ) ;
for ( i := 2 ; i <= n2 ; i++ ) {
num2[i-2] = i ; }
scaled_prod( n2-1, num2, &num2e ) ;
for ( i := 2 ; i <= n3 ; i++ ) {
den1[i-2] = i ; }
scaled_prod( n3-1, den1, &den1e ) ;
quot = scalb( num1 * num2 / den1, num1e + num2e - den1e) ;
# of course, if all the intermediates were factorials, one might
# simply cache them in scaled form as needed, or even use a precomputed table
# if the maximum size were known.
double facmag[n1+n2+n3] ;
long long facscale[n1+n2+n3] ;
quot = scalb( facmag[n1] * facmag[n2] / facmag[n3], facscale[n1] + facscale[n2] - facscale[n3] ) ;
This small part of the Clebsch-Gordan/Wigner computation - there are also
square roots and summations - is fairly easy to
understand as an example, but other computations don't lend themselves to
such easy cacheing. Thus the scaled_prodsum and scaled_proddiff functions
accommodate factors like (x[i]+y[i]) and (x[i]-y[i]) respectively.
More information about the Cfp-interest
mailing list