Difficulties installing LAPACK

James Demmel uunet!zil.CS.Berkeley.EDU!demmel
Wed Oct 20 12:17:22 PDT 1993


This is in regard to David Hough's response to bernd@c1.chemie.uni-bielefeld.de
about his difficulties installing LAPACK on an
HP 9000/735 workstation under HP-UX 9.01. I am one of the LAPACK developers
and have two comments.

1) Did the LAPACK user send his problems to lapack@cs.utk.edu and ask for
help, as suggested in the manual?

2) David Hough suggests that the difficulty might arise in the use of
fast but unreliable complex division, i.e. without a test and branch to
avoid overflow. In our development we only encountered this bad arithmetic
on the Cray. So we decided that just on the Cray, we would
set the over/underflow thresholds to be the square root of their normal
values, since this would cause the driver routines to scale input data
to lie in a range very unlikely to cause this trouble. We detected
``Cray-ness'' by looking at the exponent range, which is much larger
on the Cray than on any other machine. Since the HP 9000/735 uses IEEE arithmetic,
this test would not work on the HP 9000 (see subroutine SLABAD). You might
be able to fix the HP 9000 installation just by forcing SLABAD to take the
square root.
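
To make the distinction concrete, here is a minimal sketch in C (not the
Fortran actually used in LAPACK); the careful branch uses Smith's well-known
scaled division, which is one standard way of adding the missing test and
branch, not necessarily what HP's library does:

    #include <math.h>
    #include <stdio.h>

    /* Unreliable: the intermediate c*c + d*d can overflow or underflow
       even when the true quotient is perfectly representable. */
    void cdiv_fast(double a, double b, double c, double d,
                   double *e, double *f)
    {
        double den = c*c + d*d;
        *e = (a*c + b*d) / den;
        *f = (b*c - a*d) / den;
    }

    /* Careful (Smith, 1962): scale by the larger of |c| and |d| first. */
    void cdiv_smith(double a, double b, double c, double d,
                    double *e, double *f)
    {
        if (fabs(c) >= fabs(d)) {
            double r = d / c, den = c + d*r;
            *e = (a + b*r) / den;
            *f = (b - a*r) / den;
        } else {
            double r = c / d, den = c*r + d;
            *e = (a*r + b) / den;
            *f = (b*r - a) / den;
        }
    }

    int main(void)
    {
        /* The denominator's parts are near the square root of overflow,
           so c*c + d*d overflows in the fast version. */
        double e, f;
        cdiv_fast (1.0, 1.0, 1.0e200, 1.0e200, &e, &f);
        printf("fast : %g + %g*i\n", e, f);    /* prints 0 + 0*i: wrong */
        cdiv_smith(1.0, 1.0, 1.0e200, 1.0e200, &e, &f);
        printf("smith: %g + %g*i\n", e, f);    /* 1e-200 + 0*i: correct */
        return 0;
    }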

The LAPACK test code exercises matrices with entries close to overflow and
underflow, which is why you noticed this; you might never see it on "normal"
input data.

Ideally, we would have wanted to detect this bad complex arithmetic at run
time more reliably. But the current LAPACK subroutine SLAMCH for determining
machine parameters (like roundoff, underflow, ...) at run time is already
850 lines long, and we chose not to try to develop a "bad complex division
detector" too. Indeed there are enormous packages around for detecting
properties of arithmetic at runtime, which have taken clever people years
to write, because of the amazing variety of bad things the hardware and compiler
can do. Ideally I would prefer to design for a future of good arithmetic, 
not for the past, but your HP experience shows the future is not all it could
be. There are numerous other examples too, due to ``doubled double'' arithmetic,
etc. A standard way to inquire about these things would be great, but
unfortunately the most recent proposal (the proposed ISO Language Independent
Arithmetic standard) does not address these issues adequately. That is another
long story.
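
Just to illustrate what such a detector would have to do (this is a made-up
one-case probe in C, not part of SLAMCH), one could divide two numbers whose
quotient is representable but whose naive evaluation overflows, and check the
answer:

    #include <complex.h>
    #include <math.h>
    #include <stdio.h>

    /* Hypothetical probe: the true quotient is 1e-200 + 0*i, but a
       division that forms c*c + d*d naively overflows and returns
       0, Inf or NaN instead.  volatile forces the division to happen
       at run time rather than at compile time. */
    int complex_division_looks_careful(void)
    {
        volatile double complex num = 1.0     + 1.0*I;
        volatile double complex den = 1.0e200 + 1.0e200*I;
        double complex q = num / den;
        return isfinite(creal(q)) && creal(q) != 0.0;
    }

    int main(void)
    {
        printf("complex division looks %s\n",
               complex_division_looks_careful() ? "careful" : "unreliable");
        return 0;
    }

A real detector would of course need many more cases than this one.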

My personal opinion is that it is a waste of resources to have a large number
of programmers work around the bad decision of a small number of others.

Scaling will not fix all the problems unreliable complex division will cause.
But using the more careful complex division could cause a slowdown on
some pipelined architectures. On a system with IEEE exception handling,
one could do the following:

1) try the fast algorithm; usually it will work
2) if an exception arises or the result is wrong, rerun using a slower,
   more reliable algorithm.
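
Sketched in C with the C99 floating-point environment facilities (only an
illustration, not LAPACK code; cdiv_fast and cdiv_smith are the two divisions
sketched above, and a production version would also save and restore the
caller's exception flags), that idea looks like this:

    #include <fenv.h>
    #include <math.h>

    #pragma STDC FENV_ACCESS ON

    /* The fast and careful divisions sketched earlier. */
    void cdiv_fast (double, double, double, double, double *, double *);
    void cdiv_smith(double, double, double, double, double *, double *);

    void cdiv_checked(double a, double b, double c, double d,
                      double *e, double *f)
    {
        feclearexcept(FE_OVERFLOW | FE_UNDERFLOW | FE_INVALID);

        cdiv_fast(a, b, c, d, e, f);            /* 1) try the fast one        */

        if (fetestexcept(FE_OVERFLOW | FE_UNDERFLOW | FE_INVALID)
            || !isfinite(*e) || !isfinite(*f))  /* 2) exception or bad result */
            cdiv_smith(a, b, c, d, e, f);       /*    redo it carefully       */
    }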

Since the first edition of LAPACK was designed to be portable to non-IEEE
machines, this was not an option. Nor did we have the option of changing
the underlying implementation of complex arithmetic. For an example of
how to accelerate numerical algorithms by exploiting exception handling,
see

@INPROCEEDINGS{demmelli93,
      AUTHOR = {Demmel, J. and Li, X.},
      TITLE = {Faster Numerical Algorithms via Exception Handling},
      BOOKTITLE ={Proceedings of the 11th Symposium on Computer Arithmetic},
      PUBLISHER = {{IEEE} Computer Society Press},
      EDITOR = {E. Swartzlander, M. J. Irwin and G. Jullien},
      YEAR = {1993},
      ADDRESS = {Windsor, Ontario},
      MONTH = {June 29 -- July 2},
      NOTE = {available as all.ps.Z
              via anonymous ftp from toe.cs.berkeley.edu, in directory
              pub/tech-reports/cs/csd-93-728; software is csd-93-728.shar.Z} }



