Performance Performance Performance

Tom MacDonald uunet!sdiv.cray.com!tam
Mon Jan 16 10:43:45 PST 1995


David Hough writes:
> A different problem confronts us now.    Probably the bulk of
> floating-point operations are performed in PC spreadsheet programs, for
> although each such PC performs few, such PC's are many in number.   Most
> technical floating-point operations are probably performed on workstations
> and workstation-derived servers, which are slower but far more numerous
> than supercomputers.   But the PC's, workstations, and most new
> supercomputer designs all use variations of IEEE 754 binary floating-point
> arithmetic.    So the job of porting mathematical software can be
> considered done for all time.

As you point out, the answer is "well not quite."  Another thing to
consider is that although the IEEE formats are almost universally
implemented, IEEE arithmetic is not always implemented according to the
letter (or spirit) of the law.  Oftentimes only a subset of the
arithmetic is supported.  A common missing feature is denormal numbers.
These decisions are made to maximize profits, not to minimize the
headaches associated with porting mathematical software.  If we want to
put code portability on the pedestal and mandate that everything else
is of secondary importance, then we have a lot of PR work ahead of us.
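
To make the denormal concern concrete, here is a small C sketch of my
own (not from any vendor's documentation), assuming IEEE double
precision.  With gradual underflow, x != y guarantees x - y != 0; on a
machine that flushes denormals to zero, that guarantee is lost:

    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        double x = DBL_MIN * 1.5;  /* 1.5 times the smallest normal */
        double y = DBL_MIN;        /* the smallest normal number    */

        /* The exact difference, DBL_MIN / 2, is a denormal number. */
        if (x - y != 0.0)
            printf("gradual underflow: x - y = %g\n", x - y);
        else
            printf("flush to zero: x != y, yet x - y == 0\n");
        return 0;
    }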

I still question whether mandating portability stifles creativity too
much.  Should we provide some latitude for hardware designers to tweak
other areas (like performance) at the expense of portability?  I worry
about locking ourselves into a box that is too difficult to escape from
when the time is right.  Can we still color outside the lines to see
what it looks like?  Most miles are driven in production automobiles,
but they've all benefited from race car technology.

> Well not quite.   Different choices permitted within IEEE 754, different
> expression evaluation paradigms, and different libraries - not to mention
> gross and subtle bugs in optimizing compilers, not to mention the
> underlying hardware - cause identical numerical results on different
> systems to still be the exception rather than the rule, and to the
> justifiable complaint of the typical floating-point user - now more likely
> an accountant or technician than a numerical analyst - technical support
> people often respond that that's just the way floating-point arithmetic is
> - an inherently unpredictable process, like the weather.    All these
> differences don't help most floating-point users, whose performance is
> most often limited by other factors to a far lower level than that at
> which aggressive hardware and software optimizations can help; yet the
> differences induced by those optimizations affect all users, whether they
> obtain a performance benefit or not.    These differences confuse and
> delay the recognition of real significant hardware and software bugs that
> can masquerade for a time as normal roundoff variations.

Being a part of the go-fast end of the computer biz allows me to hear
the concerns of scientists who explore the current edges of science.  A
good example is the weather.  Although we say the weather is
unpredictable, we really mean "we don't know how to predict it
accurately."  Although weather forecasting has improved over the years,
no one worries that this area will run out of useful research problems
in our lifetimes.

Their main concern is always performance.  They'd be thrilled if their
codes ran 100 times faster.  The message I keep getting from them is to
increase the performance, even if the arithmetic accuracy suffers some.
They want to accurately predict next week's weather more than they want to
"easily" port to another machine.  We've never lost a sale to a major
customer because we optimize too much.  Often times it's the opposite.

Someone has to represent the interests of those who need more
performance.  Performance is important.  Too often the high-performance
interests represent a "boutique" part of the industry, but it's
important to understand that the future will someday be the present,
and we want to be prepared when it arrives.  They need a degree of
freedom to get there.

Leading edge science seems to be going more and more parallel every
day.  Since parallelism is an immature programming paradigm when
compared to sequential programming, portability is an even greater
problem.  I make a similar argument about parallel programming - we do
not want to lock ourselves into one programming paradigm (data
parallel, message passing, SPMD, control parallel, etc.) until we've
had more time to understand what box we might be locking ourselves
into.  It is apparent to me that parallel programs aggravate the
portability issue; it's only going to get tougher.  If you're concerned
about specifying an order for (a op b op c), then you're in for a rude
awakening when you try to do a global sum reduction across 1000
processors, as the sketch below illustrates.
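
Here is a small C illustration of my own (a sketch, assuming IEEE
double precision with round-to-nearest), with a left-to-right sum
standing in for one processor and a pairwise sum standing in for a tree
reduction across processors.  Same values, different grouping,
different answer:

    #include <stdio.h>

    /* Left-to-right sum: the order one sequential processor uses. */
    double seq_sum(const double *a, int n)
    {
        double s = 0.0;
        int i;

        for (i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* Pairwise sum: the grouping a tree reduction across
       processors produces. */
    double tree_sum(const double *a, int n)
    {
        int half = n / 2;

        if (n == 1)
            return a[0];
        return tree_sum(a, half) + tree_sum(a + half, n - half);
    }

    int main(void)
    {
        double a[4] = { 1.0, 1e16, -1e16, 1.0 };

        printf("sequential: %g\n", seq_sum(a, 4));   /* prints 1 */
        printf("tree:       %g\n", tree_sum(a, 4));  /* prints 0 */
        return 0;
    }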

Jim Thomas has written an excellent document on floating-point C
extensions.  He has carefully separated strict IEEE conformance from
other floating-point concerns.  I wonder whether a two-tier approach is
appropriate.  A vendor might have concerns about testing and supporting
multiple environments (though most vendors already have a strictly
Standard C mode), but it lends itself to both the "experiment your
brains out" mode and the "portability is on the pedestal" mode.  We
still need the hardware engineers to provide a reasonable
implementation of the strictly portable mode.
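
As a purely hypothetical sketch of what two-tier source code might look
like - __STRICT_IEEE__ is a macro name I just made up, not anything
from Jim Thomas's document - a vendor's strict mode could predefine a
macro so one source serves both tiers:

    #include <stdio.h>

    double dot(const double *x, const double *y, int n)
    {
        double s = 0.0;
        int i;

    #ifdef __STRICT_IEEE__
        /* Portable tier: one fixed left-to-right evaluation order. */
        for (i = 0; i < n; i++)
            s += x[i] * y[i];
    #else
        /* Go-fast tier: four interleaved partial sums, the kind of
           reassociation a vectorizing compiler wants.  Same values,
           different rounding. */
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;

        for (i = 0; i + 3 < n; i += 4) {
            s0 += x[i]   * y[i];
            s1 += x[i+1] * y[i+1];
            s2 += x[i+2] * y[i+2];
            s3 += x[i+3] * y[i+3];
        }
        for (; i < n; i++)
            s0 += x[i] * y[i];
        s = s0 + s1 + s2 + s3;
    #endif
        return s;
    }

    int main(void)
    {
        double x[5] = { 1.0, 2.0, 3.0, 4.0, 5.0 };

        printf("dot = %g\n", dot(x, x, 5));  /* 55 in either tier */
        return 0;
    }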

my-$.02-worth-ly yours,

Tom MacDonald
tam@cray.com
uunet!cray!tam


