[Cfp-interest 2851] Re: CFP review of TS-4 and TS-5 revisions

David Hough CFP pcfp at oakapple.net
Thu Aug 24 20:29:04 PDT 2023


I regret sending a half-baked text which revealed numerous typos as soon 
as I sent it.

The following might be as much as 51%-baked -

EXAMPLE

The augmented operation are useful for extending precision, particularly
for implementing "double-double" arithmetic, which provides a faster, though
less precise and less predictable, alternative to software quadruple precision,
on systems which lack hardware quadruple precision.

Double-double represents numbers as a pair of doubles, the second much smaller
in magnitude than the first.    Ideally, in a pair (a.h,a.t), a.h would
contain the correctly rounded result of computed(a.h+a.t), 
and a.t would contain the correctly rounded result of computed(a.h+a.t)-a.h.
Performance considerations often compromise this ideal.

The augmented operation make it possible to implement + - * for
double-double format, in a way that is commutative and avoids conditional
branches, yet still yields close to the intended results, and is competitive
in performance if supported in hardware.

Let {a.h;a.t} represent a double-double value a intended to contain the value
a.h+a.t in infinite precision, and {b.t;b.t} a similar double-double value b.

SUM

Consider computing the sum {z.h;z.t}, a double-double value z
intended to contain {a.h+a.t}+{b.h+b.t} as closely as possible.

Clearly z should be a.h + a.t + b.h + b.t, but this is might be
a higher precision value that has to be rounded to fit double-double.

Using the notation
	{x.h;x.t} = v.h+w.h         # augmul
to mean something like
	struct x daug_t ;
	{x.h;x.t} = augadd(v.h, w.h) ;

Let the algorithm be

				# infinite result is a.h+a.t+b.h+b.t
	{u.h;u.t} = a.h+b.t	# augadd
				# infinite result is u.h+u.t+a.t+b.t
	{v.h;v.t} = a.t+b.t	# augadd
				# infinite result is u.h+u.t+v.h+v.t
	{w.h;w.t} = u.t+v.t	# augadd
				# infinite result is u.h+v.h+w.h+w.t
	{x.h;x.t} = v.h+w.h	# augadd
				# infinite result is u.h+x.h+x.t+w.t
	{z.h;z.t} = u.h+x.h	# augadd
				# infinite result is z.h+z.t+x.t+w.t

Now {z.h;z.t} is a pretty good approximation to the intended result, 
with error x.t+w.t, and is
commutative and has no conditional branches.    Its edge cases might be no
worse than those of any other double-double implementation fast enough to 
be useful.    Its performance might be acceptable is hardware 
augadd is no slower than two adds.

PRODUCT

Consider computing the product {z.h;z.t}, a double-double value z 
intended to contain {a.h+a.t}*{b.t;b.t} is closely as possible.

Clearly z should be a.h*b.h + a.h*b.t + a.t*b.h + a.t*b.t but this is 
likely to be a quad-double precision value that has to be rounded to 
fit double-double.

Using the notation
	{x.h;x.t} = a.h*b.h         # augmul
to mean something like
	struct x daug_t ;
	{x.h;x.t} = augmul(a.h, b.h) ;

Let the algorithm be

				# infinite result is 
				# a.h*b.h+a.h*b.t+a.t*b.h+a.t*b.t 
	u=a.h*b.t ; v=a.t*b.t	# regular multiplication
	w = u + v		# regular addition
				# infinite result is 
				# a.h*b.h+w+(a.h*b.t+a.t*b.h+a.t*b.t-w)
	{x.h;x.t} = a.h*b.h	# augmul
				# infinite result is 
				# x.h+x.t+w+(a.h*b.t+a.t*b.h+a.t*b.t-w)
	{y.h;y.t} = x.t+w	# augadd
				# infinite result is 
				# x.h+y.h+y.t+(a.h*b.t+a.t*b.h+a.t*b.t-w)
	{z.h;z.t} = x.h+y.h	# augadd
				# infinite result is 
				# z.h+z.t+y.t+(a.h*b.t+a.t*b.h+a.t*b.t-w)

Now {z.h;z.t} is a pretty good approximation to the intended result, 
with error y.t+(a.h*b.t+a.t*b.h+a.t*b.t-w),
and is
commutative and has no conditional branches.    Its edge cases might be no
worse than those of any other double-double implementation fast enough to 
be useful.    Its performance might be acceptable is hardware augmul and
augadd are no slower than two muls or two adds.



More information about the Cfp-interest mailing list