mixed precision confusion
David G. Hough on validgh
dgh
Fri Jun 12 08:39:45 PDT 1992
Thanks to John Gilmore for forwarding this:
From: Nancy Leveson <nancyamurphy.ICS.UCI.EDU>
Subject: Endeavor bug -- more details
From Aviation Week as quoted by James Paul:
Engineers have traced the problem to the sensitivity of NASA-developed
equations to a particular set of numeric values that arose when Endeavour
was making one of the final computer-targeted rendezvous maneuvers. Tests
show the software had been properly coded by IBM and therefore passed all
preflight tests, according to Ted Keller, senior technical staff member
at the IBM Shuttle Project Coordination Office, Houston.
Here is some additional information about this event. You can evaluate it
yourselves with respect to the statements in AW.
The STS-49 failure of the flight software to converge during targeting has been
traced to the Lambert targeting routine. The associated algorithms used by the
routine converge on an independent variable called "U" which is a double
precision scalar. U is iterated (up to 10 times) via this algorithm. The
algorithm is designed to converge on a value of U between two dynamically
updated limits called U_MIN and U_MAX, which are single precision scalars. On
each iteration, either U_MIN or U_MAX is updated to decrease the interval
within which the algorithm will search for the desired value of U.
To determine which limit to update, the algorithm calculates a variable U_STEP,
the amount by which U will be updated on this iteration. If its value is
positive, U_MIN is set to U. If its value is negative, U_MAX is set to U.
Then U_STEP is added to U, and the resulting value of U is compared to the
limits U_MIN and U_MAX. If U is now outside the limits, U is recalculated as
the average of U_MIN and U_MAX, thereby keeping U within the search interval.
             U
    |--------|-----------------------|
    U_MIN                        U_MAX
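
The update step described above can be sketched in a few lines of Python. This is only an illustration, not the flight code: the helper to_single, the function name update, and the argument names are made up here; the sketch assumes IEEE 754 binary64 for U and binary32 for the limits, which is not necessarily the format the flight computer used; and it assumes that a zero step leaves both limits alone and that the limits themselves count as inside the interval, neither of which the description states.

```python
import struct


def to_single(x):
    """Round a Python float (binary64) to the nearest binary32 value.

    Illustrative helper: stands in for storing a double precision value
    into a single precision variable.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


def update(u, u_step, u_min, u_max):
    """One pass of the bracketing update as described in the text.

    u is kept in double precision; u_min and u_max are stored in single.
    """
    if u_step > 0:
        u_min = to_single(u)        # positive step: raise the lower limit
    elif u_step < 0:
        u_max = to_single(u)        # negative step: lower the upper limit
    # Assumption: a zero step changes neither limit.

    u = u + u_step
    if not (u_min <= u <= u_max):   # assumption: limits are inclusive
        u = 0.5 * (u_min + u_max)   # fall back to the interval midpoint
    return u, u_min, u_max
```

For example, update(1.0, 0.25, 0.0, 4.0) moves U_MIN up to 1.0 and leaves U at 1.25, while update(1.0, 5.0, 0.0, 4.0) overshoots U_MAX and falls back to the midpoint 2.5.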
U continues to be updated in this manner on each iteration until convergence is
attained or the maximum number of iterations has been executed. Convergence occurs if the
normalized transfer time that corresponds to the current value of U is close
enough to the desired transfer time. "Close enough" is a function of a
mission-specific data value.
For the third rendezvous of STS-49, the value of U after the first iteration
was very close to the desired value, and U_MIN was set equal to U because
U_STEP was positive. On the second iteration cycle, U_STEP was smaller than
one least significant bit (LSB) for U_MIN. Since U_STEP was positive, U_MIN
was set to U, and U_STEP was added to U. Algebraically, U should have been
greater than U_MIN. However, due to precision differences, U_MIN was greater
than U. (Loss of precision occurred when the double precision value of U was
stored into the single precision variable U_MIN.) Therefore, U was recalculated
to be the average of U_MIN and U_MAX, and the search interval no longer
contained the desired value of U.
    |<---1 single precision LSB-->|
    |                             |
    |                             |
    |      U         U            |
    |    after     after          |
    |     1st       2nd           |
    |     pass      pass*         |
    |------|---------|------------|
           |         |            |
           |<------->|            |
              U_STEP              |
                                U_MIN

    *Prior to recalculations after
     2nd pass
Note: both U and U_MIN had negative values.
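
The second pass can be replayed with made-up numbers to show how the stored limit ends up on the wrong side of U. Everything numeric here is an assumption chosen for illustration (the values of U, U_STEP and U_MAX are not the flight values), as is the use of IEEE 754 binary32/binary64 with round-to-nearest; to_single is a hypothetical helper that models the store into a single precision variable.

```python
import struct


def to_single(x):
    """Round a binary64 value to the nearest binary32 value (illustrative)."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Made-up values: U is negative and sits between two single precision values.
u_before = -(1.0 + 2.0**-25)   # double precision U entering the second pass
u_step = 2.0**-26              # positive, below one single LSB near 1.0 (2**-23)
u_max = -0.5                   # hypothetical upper limit

# U_STEP is positive, so U_MIN is set to U; the store rounds it up to -1.0.
u_min = to_single(u_before)

# Adding the step in double precision is exact here: -(1 + 2**-26).
u_after = u_before + u_step

# Algebraically u_after > u_before = U_MIN, but the stored U_MIN is larger.
outside = not (u_min <= u_after <= u_max)

# Because U is "outside", it is replaced by the midpoint of the limits,
# and the interval [u_min, u_max] no longer contains values near u_after.
u_recalculated = 0.5 * (u_min + u_max) if outside else u_after
```

With these numbers U_MIN becomes -1.0 while U is slightly below -1.0, so the limit check fails and U jumps to -0.75, far from the value it had nearly converged on.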
On subsequent iterations, U was updated in the direction of the desired value,
but never reached it before the iteration limit was hit, because it was outside
the search interval.
To fix the problem and allow the mission to resume, they had to uplink a new
state vector from the ground, by-passing the onboard routine. The permanent
fix involves changing U_MIN and U_MAX to double precision.
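
The effect of the permanent fix can be sketched by replaying the same kind of pass with the limit stored in either precision. As before, this is a sketch under stated assumptions rather than the actual change: the numbers are made up, IEEE 754 binary32/binary64 is assumed, and to_single, second_pass and store_limit are hypothetical names.

```python
import struct


def to_single(x):
    """Round a binary64 value to the nearest binary32 value (illustrative)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def second_pass(store_limit):
    """Replay a made-up second pass; store_limit models the limits' precision."""
    u = -(1.0 + 2.0**-25)          # hypothetical U entering the pass
    u_step = 2.0**-26              # positive, below one single precision LSB
    u_max = -0.5                   # hypothetical upper limit

    u_min = store_limit(u)         # U_STEP > 0, so U_MIN is set to U
    u = u + u_step
    if not (u_min <= u <= u_max):  # assumption: limits are inclusive
        u = 0.5 * (u_min + u_max)
    return u, u_min


u_single, u_min_single = second_pass(to_single)  # limits in single precision
u_double, u_min_double = second_pass(float)      # limits in double precision
```

With single precision limits U is thrown to the midpoint -0.75; with double precision limits U_MIN equals the old U exactly, so the updated U stays inside the interval and keeps its small step.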