(from comp.arch) hopefully my last posting on Linpack condition
David G. Hough on validgh
dgh
Wed Jun 19 17:10:15 PDT 1991
A while back I wrote in comp.arch:
> IEEE 754 floating-point arithmetic was intended to increase the domain
> of problems for which good results could be obtained, and to increase the
> likelihood of getting error indications if good results could not be obtained,
> all without significantly degrading performance on "common" cases and
> without significantly increasing total system cost relative to sloppy
> arithmetic.
> Since none of these goals are very quantitative,
> people will argue about how well they've been achieved.
> Part of the problem is that the benchmark programs in use measure only
> common cases.
[for instance]
> Various versions of the Linpack benchmark all have in common
> that the data is taken from a uniform random distribution, producing problems
> of very good condition. So the worst possible linear
> equation solver algorithm running on the dirtiest possible floating-point
> hardware should be able to produce a reasonably small residual, even for
> input problems of very large dimension.
[I should have said "the worst likely linear equation solver algorithm -
with no pivoting..." since Shearer pointed out that an aggressively bad
algorithm that sought the smallest available pivot would soon fail.]
Although I stand by my point - programs that run acceptably well on
all the common platforms can't prove much about superior robustness of one
arithmetic system over another - some of the Linpack condition
remarks merited further examination.
The test matrices A for the linpack benchmark are generated by
a nominally uniform random congruential linear method
init = 1325
do 30 j = 1,n
do 20 i = 1,n
init = mod(3125*init,65536)
a(i,j) = (init - 32768.0)/16384.0
20 continue
30 continue
of period 2**14. To attribute extremely good condition to such
matrices in general may be an exaggeration,
depending on how much the "random" numbers resemble a
truly random sample from a truly continuous interval.
In any event for the Linpack benchmark matrices, condition numbers are mostly in
the range from 300 to 1000, which is somewhat higher than I'd expected,
although not disastrous, and acceptable for some single-precision applications.
For certain dimensions n, the matrices generated are singular in exact
arithmetic and very ill conditioned in floating-point arithmetic.
These include obvious ones 256, 512, 1024, as well as others like
320 and 800, all of which have rank substantially
smaller than their dimension, as well as 128 itself, which is
more interesting. A 128x128 matrix contains the entire period of the
random number generator exactly once. I don't know if there is a theorem
that such matrices should be singular, but this particular one seems to have
rank 127.
Aside from those exactly singular cases, the worst turned up in my survey
was n=266 which has condition number 129540.
The condition number figures in a bound for the magnification of error in x
from errors in b or A, for general b. I suspect that it overstates the
susceptibility of x to changes in A or b for b of the special form generated
in the Linpack benchmark, where bi is computed as the sum of aji, corresponding
to an exact solution x consisting of a vector of 1's.
In general terms, this method
of choosing b tends to immunize x against the ill condition of A by insuring
that ||b|| = ||Ax|| is not very much less than ||A|| ||x||, compared to choosing
an x like the last right singular vector of A, for which ||b|| would
be much smaller than ||A|| ||x||. Greater insight than mine would be
required to definitely refute or confirm the foregoing.
But this method of generating data does seem to maximize the chances of
getting acceptable numerical results even with short sloppy arithmetic,
and perhaps even with unstable algorithms: how much faster does a Linpack
1000x1000 benchmark run if no pivoting is employed? Not much fortunately.
When I investigated gaussian elimination without pivoting on the Linpack
benchmark data, I found the effect on the residual to be smaller than I
might have guessed:
dimension method minimum normalized error
precision pivot residual x-1
100 single normal 2 1.6 1e-5
100 single nopivot .13 70 3e-4
1000 single normal 2 11 2e-4
1000 single nopivot 7e-2 2600 8e-2
1000 double normal 2 10 5e-13
1000 double nopivot 6e-2 3900 6e-11
The information tabulated
below was computed with SSVDC from netlib, and indicates
dimension, computation time, the "info" error parameter value,
the largest singular value s1, the smallest singular value sn, the
condition number s1/sn, and the corresponding ratios for the next smaller
singular values, s1/s(n-1) and s1/s(n-2).
Following the SVD data is an anecdote I uncovered from the first time
I noticed the periodicity of matgen.
info
n time s1 sn s1/sn s1/sn-1 s1/sn-2
3 0.0 0 2.6 0.3E+00 10 2 1
4 0.0 0 3.8 0.5E+00 7 3 2
5 0.0 0 4.3 0.2E+00 18 3 2
6 0.0 0 5.3 0.6E+00 9 4 3
7 0.0 0 5.4 0.5E-01 120 15 3
8 0.0 0 5.5 0.2E+00 26 5 4
9 0.0 0 6.8 0.2E+00 45 9 9
10 0.0 0 6.2 0.5E+00 13 11 5
11 0.0 0 6.7 0.2E+00 30 13 7
12 0.0 0 6.2 0.3E+00 20 10 6
13 0.0 0 7.0 0.3E+00 27 7 5
14 0.0 0 7.9 0.3E+00 26 13 7
15 0.0 0 7.5 0.4E-01 182 20 6
16 0.0 0 8.4 0.1E+00 57 8 8
17 0.0 0 8.2 0.6E-01 147 13 8
18 0.0 0 8.4 0.7E-01 127 13 12
19 0.0 0 9.0 0.1E-01 916 64 12
20 0.0 0 9.0 0.2E+00 46 9 7
21 0.0 0 9.4 0.1E+00 64 23 8
22 0.0 0 9.6 0.6E-01 164 23 7
23 0.0 0 10.2 0.2E+00 49 24 13
24 0.0 0 10.6 0.1E+00 89 38 14
25 0.0 0 11.4 0.7E+00 16 11 9
26 0.0 0 10.8 0.2E-01 518 24 10
27 0.0 0 11.3 0.4E-01 313 35 22
28 0.0 0 10.8 0.5E-01 203 36 22
29 0.0 0 12.0 0.7E-02 1825 22 11
30 0.0 0 11.9 0.8E-01 150 22 16
31 0.0 0 12.6 0.2E+00 50 28 23
32 0.0 0 12.5 0.2E+00 82 46 15
33 0.1 0 13.0 0.4E-02 3370 17 16
34 0.1 0 12.6 0.3E+00 49 24 15
35 0.1 0 12.5 0.9E-01 144 35 27
36 0.1 0 13.0 0.3E-01 406 39 24
37 0.1 0 13.6 0.7E-01 196 39 28
38 0.1 0 12.9 0.3E+00 39 26 15
39 0.1 0 14.0 0.2E+00 56 32 25
40 0.1 0 14.4 0.7E-01 193 30 23
41 0.1 0 13.9 0.6E-01 241 42 34
42 0.1 0 14.1 0.5E-01 280 31 20
43 0.1 0 14.8 0.5E-01 300 61 35
44 0.1 0 14.4 0.5E-01 291 59 36
45 0.1 0 14.3 0.3E-01 562 43 22
46 0.1 0 14.7 0.7E-02 2049 51 30
47 0.1 0 15.1 0.5E-01 305 52 25
48 0.1 0 15.0 0.2E+00 91 40 20
49 0.1 0 15.0 0.8E-01 184 91 30
50 0.1 0 15.8 0.3E-01 475 82 41
51 0.1 0 15.4 0.6E-01 250 39 22
52 0.1 0 15.4 0.8E-01 203 43 29
53 0.2 0 16.0 0.3E-01 488 31 28
54 0.2 0 15.5 0.5E-01 285 40 26
55 0.2 0 15.6 0.1E+00 115 61 27
56 0.2 0 16.1 0.2E+00 92 29 27
57 0.2 0 16.5 0.2E-01 815 124 27
58 0.2 0 17.0 0.7E-01 241 49 34
59 0.2 0 18.0 0.1E+00 130 64 22
60 0.2 0 17.6 0.9E-01 192 38 33
61 0.2 0 17.6 0.5E-01 339 59 33
62 0.2 0 18.2 0.6E-02 3142 75 51
63 0.2 0 18.2 0.3E-01 695 101 69
64 0.2 0 17.1 0.6E-01 304 34 21
65 0.2 0 17.8 0.5E-01 389 51 44
66 0.3 0 18.3 0.3E+00 62 54 30
67 0.3 0 17.6 0.4E-01 402 52 37
68 0.3 0 18.6 0.1E+00 155 45 37
69 0.3 0 18.1 0.1E+00 142 68 46
70 0.3 0 19.0 0.1E+00 159 67 24
71 0.3 0 18.7 0.5E-01 372 87 47
72 0.3 0 18.7 0.7E-01 267 47 27
73 0.3 0 19.0 0.6E-01 307 64 41
74 0.4 0 19.3 0.5E-01 375 55 30
75 0.4 0 19.8 0.6E-01 331 44 30
76 0.4 0 19.3 0.2E-01 969 67 43
77 0.4 0 19.0 0.1E-01 1437 73 44
78 0.4 0 20.1 0.6E-01 314 76 42
79 0.4 0 20.2 0.2E+00 90 64 25
80 0.4 0 18.9 0.1E+00 138 122 47
81 0.4 0 21.1 0.6E-03 36617 77 48
82 0.5 0 20.3 0.8E-01 259 60 46
83 0.4 0 20.1 0.1E-01 1367 136 48
84 0.5 0 20.6 0.5E-01 457 41 31
85 0.5 0 20.4 0.2E-01 1273 111 44
86 0.5 0 21.2 0.6E-01 367 117 58
87 0.5 0 20.5 0.4E-01 498 167 48
88 0.5 0 20.3 0.8E-01 266 113 65
89 0.5 0 22.0 0.3E-01 662 96 51
90 0.6 0 21.2 0.9E-01 240 90 49
91 0.6 0 21.2 0.7E-01 325 97 53
92 0.6 0 21.3 0.2E+00 116 61 58
93 0.6 0 21.5 0.2E-01 1080 182 44
94 0.6 0 22.0 0.3E-01 804 231 39
95 0.7 0 21.2 0.1E+00 188 66 54
96 0.7 0 21.6 0.1E+00 172 76 47
97 0.7 0 22.7 0.4E-01 554 78 58
98 0.7 0 22.1 0.5E-01 469 142 59
99 0.7 0 22.5 0.4E-01 586 121 73
100 0.7 0 22.9 0.2E-01 1257 54 40
101 0.8 0 23.9 0.2E+00 130 91 57
102 0.8 0 22.9 0.4E-01 594 58 35
103 0.8 0 22.7 0.2E+00 139 58 38
104 0.8 0 22.2 0.3E-01 752 121 52
105 0.8 0 22.6 0.1E+00 193 112 85
106 0.9 0 23.7 0.2E-01 1140 113 83
107 0.9 0 22.9 0.8E-01 292 71 54
108 0.9 0 23.0 0.1E-01 1917 172 51
109 0.9 0 23.2 0.2E+00 121 108 77
110 1.0 0 23.2 0.1E+00 232 112 65
111 1.0 0 24.1 0.1E+00 184 107 64
112 1.0 0 22.6 0.4E-02 5336 55 47
113 1.0 0 24.4 0.1E-01 1715 63 52
114 1.0 0 23.9 0.7E-02 3605 147 60
115 1.1 0 24.0 0.7E-02 3365 159 59
116 1.1 0 23.9 0.5E-01 489 218 64
117 1.2 0 24.4 0.6E-01 380 213 112
118 1.3 0 24.4 0.2E-02 10670 156 69
119 1.2 0 24.4 0.8E-02 3092 110 63
120 1.3 0 23.7 0.9E-01 251 161 80
121 1.3 0 24.4 0.6E-02 3844 139 43
122 1.3 0 25.3 0.3E-02 8732 129 59
123 1.3 0 24.9 0.1E+00 184 72 52
124 1.4 0 24.6 0.7E-03 36809 147 55
125 1.4 0 25.3 0.6E-01 440 147 131
126 1.5 0 25.0 0.1E+00 211 162 79
127 1.4 0 25.1 0.1E+00 244 132 56
128 1.4 0 25.3 0.4E-06 62712320 913 260
129 1.5 0 25.8 0.1E+00 220 157 76
130 1.5 0 26.1 0.9E-01 279 121 75
131 1.5 0 25.9 0.9E-01 281 121 57
132 1.6 0 25.5 0.7E-01 342 95 66
133 1.6 0 26.0 0.9E-01 279 114 62
134 1.7 0 25.8 0.8E-01 308 145 57
135 1.7 0 26.0 0.2E-01 1053 80 59
136 1.7 0 25.7 0.4E-02 5720 709 111
137 1.9 0 26.1 0.8E-01 330 86 59
138 1.8 0 26.3 0.5E-01 480 79 50
139 1.8 0 26.6 0.2E+00 171 73 61
140 1.9 0 26.0 0.2E+00 146 79 58
141 1.9 0 26.5 0.9E-01 307 130 85
142 1.9 0 27.0 0.2E-01 1737 426 69
143 2.0 0 26.9 0.5E-01 587 134 64
144 2.0 0 26.1 0.1E+00 190 108 69
145 2.1 0 27.5 0.1E-01 2259 143 56
146 2.1 0 27.5 0.1E-01 2173 197 90
147 2.1 0 27.7 0.5E-01 609 135 99
148 2.4 0 27.4 0.2E-01 1124 138 95
149 2.4 0 27.9 0.4E-01 717 282 92
150 2.6 0 27.2 0.1E+00 267 115 99
151 2.6 0 27.6 0.7E-01 395 169 85
152 2.6 0 26.7 0.8E-01 319 143 62
153 2.5 0 28.1 0.3E-01 813 134 66
154 2.5 0 28.0 0.9E-01 310 158 143
155 2.5 0 27.8 0.9E-01 319 206 115
156 2.6 0 27.7 0.6E-01 433 90 60
157 2.6 0 28.1 0.4E-01 709 90 71
158 2.6 0 28.3 0.1E-01 2144 85 68
159 2.7 0 28.8 0.2E+00 189 111 83
160 2.8 0 29.1 0.6E-01 510 200 94
161 2.8 0 29.2 0.2E-01 1391 248 72
162 2.8 0 28.3 0.1E+00 222 167 95
163 2.9 0 28.7 0.1E+00 238 187 54
164 3.0 0 28.6 0.8E-01 375 120 55
165 3.0 0 28.7 0.1E+00 302 288 66
166 3.0 0 29.2 0.4E-01 669 185 150
167 3.1 0 28.8 0.9E-01 324 186 105
168 3.1 0 28.5 0.7E-03 38553 153 96
169 3.2 0 29.1 0.2E+00 180 99 82
170 3.3 0 29.7 0.5E-01 596 130 67
171 3.4 0 29.9 0.5E-01 552 216 134
172 3.4 0 29.1 0.6E-02 5059 118 104
173 3.5 0 29.3 0.5E-01 628 128 77
174 3.6 0 29.6 0.3E-01 1021 148 92
175 3.6 0 29.9 0.8E-01 382 263 96
176 3.8 0 28.5 0.2E+00 160 123 84
177 3.7 0 29.7 0.2E-01 1396 216 76
178 3.8 0 30.0 0.8E-01 384 128 68
179 3.9 0 30.9 0.1E+00 208 180 96
180 4.0 0 29.1 0.5E-01 563 110 94
181 4.0 0 30.9 0.3E-01 1066 272 87
182 4.0 0 30.0 0.1E-01 2543 199 100
183 4.1 0 30.0 0.1E+00 300 201 97
184 4.2 0 29.2 0.1E+00 274 124 104
185 4.2 0 30.2 0.1E-01 2859 396 154
186 4.3 0 30.6 0.4E-01 687 365 129
187 4.4 0 31.1 0.7E-01 468 299 176
188 4.4 0 30.9 0.6E-01 548 123 91
189 4.5 0 30.8 0.4E-01 746 184 89
190 4.6 0 31.1 0.1E+00 247 129 93
191 4.6 0 31.3 0.1E-01 2230 134 100
192 4.7 0 34.3 0.2E-01 1746 277 117
193 4.7 0 31.2 0.5E-01 580 190 80
194 4.8 0 31.8 0.8E-03 39441 303 162
195 4.9 0 31.1 0.8E-01 384 293 99
196 5.0 0 31.2 0.1E-01 2643 150 97
197 5.1 0 31.2 0.9E-01 355 138 107
198 5.2 0 31.8 0.9E-02 3564 216 93
199 5.2 0 31.7 0.7E-01 464 150 117
200 5.3 0 30.4 0.2E-01 1382 390 109
201 5.4 0 32.1 0.4E-01 902 213 100
202 5.5 0 32.3 0.3E-01 1233 190 98
203 5.5 0 31.7 0.1E+00 302 159 122
204 5.6 0 31.9 0.2E-01 1346 104 85
205 5.7 0 32.1 0.2E+00 191 128 99
206 5.7 0 32.2 0.6E-03 57730 155 104
207 5.9 0 32.3 0.4E-01 913 318 162
208 6.0 0 30.6 0.6E-02 5264 245 102
209 6.0 0 32.5 0.7E-01 434 214 192
210 6.1 0 32.3 0.5E-01 672 197 132
211 6.2 0 32.5 0.1E+00 250 150 79
212 6.2 0 32.6 0.4E-01 883 456 122
213 6.4 0 33.4 0.1E-01 2995 518 141
214 6.4 0 33.1 0.8E-02 4069 298 204
215 6.5 0 32.5 0.7E-01 449 374 112
216 6.6 0 31.8 0.2E-01 1812 161 73
217 6.7 0 33.4 0.4E-01 856 239 78
218 6.8 0 33.2 0.9E-01 376 215 82
219 6.9 0 33.2 0.7E-01 445 216 163
220 7.4 0 32.6 0.6E-01 557 188 114
221 7.3 0 33.6 0.5E-01 714 346 185
222 7.2 0 33.0 0.2E-01 1665 392 153
223 7.2 0 33.4 0.4E-02 7875 281 140
224 7.4 0 33.7 0.6E-01 553 285 159
225 7.6 0 33.2 0.4E-01 767 157 124
226 7.6 0 34.0 0.3E-01 1117 137 125
227 8.1 0 34.0 0.2E-01 2167 421 264
228 8.2 0 33.4 0.3E-01 1327 470 109
229 8.4 0 34.3 0.5E-01 691 337 131
230 8.5 0 33.0 0.1E+00 333 138 96
231 8.5 0 34.6 0.8E-01 455 233 143
232 8.6 0 33.2 0.2E-01 2138 557 156
233 8.6 0 34.0 0.5E-01 681 285 145
234 8.7 0 34.3 0.1E-01 2375 217 152
235 8.7 0 35.1 0.5E-01 658 243 151
236 8.8 0 34.3 0.4E-01 843 419 168
237 8.7 0 34.8 0.3E-01 1075 182 100
238 8.8 0 35.3 0.1E+00 308 185 108
239 9.0 0 34.8 0.2E-02 18905 130 113
240 9.0 0 32.4 0.7E-01 487 160 116
241 9.1 0 34.4 0.3E-01 1034 271 92
242 9.2 0 34.7 0.1E-01 3626 278 119
243 9.3 0 34.8 0.2E-01 1825 229 146
244 9.4 0 34.9 0.7E-01 493 195 150
245 9.5 0 35.5 0.1E-01 3237 134 96
246 9.8 0 35.1 0.1E-01 2454 216 154
247 9.8 0 35.4 0.8E-01 455 394 133
248 10.0 0 33.8 0.7E-01 473 193 99
249 10.0 0 36.3 0.5E-01 762 353 142
250 10.1 0 35.7 0.6E-01 592 412 195
251 10.3 0 35.8 0.2E+00 237 169 130
252 10.4 0 34.9 0.1E+00 348 237 115
253 10.5 0 36.9 0.6E-02 6626 439 172
254 10.6 0 35.5 0.1E-01 3611 281 140
255 10.7 0 36.1 0.1E+00 378 225 118
256 10.7 0 59.1 0.1E-20 2147483647 2147483647 2147483647
257 11.0 0 36.5 0.1E+00 255 208 144
258 11.1 0 35.8 0.2E-01 1544 277 96
259 11.3 0 36.0 0.2E-01 1443 288 205
260 11.3 0 35.8 0.4E-01 818 324 127
261 11.5 0 35.9 0.1E+00 261 218 115
262 11.6 0 36.2 0.1E+00 298 165 138
263 11.8 0 37.5 0.4E-01 888 352 238
264 11.8 0 34.5 0.9E-01 369 170 128
265 12.0 0 37.0 0.4E-02 9510 285 155
266 12.2 0 36.2 0.3E-03 129540 276 118
267 12.3 0 38.8 0.1E-01 3443 291 173
268 12.4 0 36.3 0.7E-01 538 197 120
269 12.5 0 36.8 0.4E-01 861 199 152
270 13.4 0 37.2 0.5E-01 748 226 181
271 13.3 0 36.9 0.3E-01 1353 570 134
272 13.9 0 35.4 0.4E-01 995 160 112
273 13.8 0 37.1 0.2E-01 1872 258 111
274 13.8 0 37.6 0.2E-01 1609 338 115
275 13.4 0 37.7 0.1E-02 26836 330 226
276 13.5 0 37.1 0.6E-01 634 221 146
277 13.6 0 38.7 0.5E-01 764 180 124
278 14.4 0 38.1 0.1E+00 369 322 163
279 14.1 0 37.1 0.4E-01 929 287 149
280 14.6 0 35.8 0.8E-01 476 127 121
281 14.3 0 37.8 0.9E-03 42590 173 153
282 14.2 0 37.0 0.5E-01 720 236 167
283 14.4 0 37.7 0.2E-01 1686 266 152
284 14.6 0 38.2 0.4E-01 955 172 107
285 15.0 0 37.9 0.2E-02 24267 246 127
286 14.8 0 37.7 0.2E-01 2501 288 168
287 15.0 0 38.4 0.5E-01 700 287 189
288 15.1 0 41.6 0.5E-01 832 206 177
289 15.3 0 38.4 0.4E-01 917 384 159
290 15.5 0 38.6 0.5E-01 705 459 179
291 15.6 0 38.5 0.2E-01 2315 214 135
292 15.8 0 38.5 0.3E-01 1155 201 141
293 15.9 0 38.2 0.4E-01 1086 474 92
294 16.1 0 38.5 0.3E-01 1380 378 255
295 16.2 0 38.8 0.6E-02 6553 447 178
296 16.4 0 36.8 0.1E-01 3100 343 196
297 16.6 0 38.8 0.3E-01 1211 174 136
298 16.7 0 38.4 0.2E-01 1981 437 219
299 16.9 0 39.4 0.7E-01 528 248 219
300 17.1 0 38.4 0.4E-01 899 256 154
310 18.7 0 39.1 0.6E-01 633 315 155
320 20.0 0 54.1 0.5E-07 995625728 314481184 136633984
330 22.4 0 40.5 0.4E-01 1062 481 210
340 24.5 0 41.2 0.1E+00 322 237 151
350 26.6 0 41.7 0.8E-01 497 250 208
360 28.9 0 40.1 0.2E-01 1727 654 182
370 31.2 0 43.5 0.2E-01 2518 263 210
380 33.8 0 43.0 0.2E-01 1906 1525 270
390 36.4 0 43.7 0.8E-01 565 431 221
400 39.2 0 44.2 0.1E-01 3794 443 193
410 42.1 0 44.6 0.9E-02 4826 370 269
420 45.1 0 45.0 0.1E-01 3303 557 254
430 48.3 0 45.7 0.1E-01 3997 412 268
440 51.7 0 44.2 0.5E-01 886 359 290
450 55.1 0 46.9 0.3E-01 1589 305 218
460 59.9 0 46.9 0.1E+00 318 239 175
470 63.8 0 47.7 0.1E+00 449 315 189
480 67.3 0 61.8 0.5E-01 1277 777 445
490 71.4 0 48.4 0.2E-01 2164 292 249
500 75.3 0 49.4 0.1E+00 462 299 190
550 100.2 0 52.7 0.2E-01 2637 498 208
600 129.1 0 53.4 0.6E-01 839 521 401
650 162.1 0 56.3 0.2E-01 2920 1342 284
700 200.7 0 56.7 0.4E-01 1430 470 322
750 245.9 0 60.2 0.3E-01 2399 591 368
800 292.5 0 97.5 0.1E-07 2147483647 1187167360 666092416
850 358.1 0 63.3 0.5E-01 1300 645 265
900 421.6 0 63.2 0.1E+00 629 541 330
950 495.3 0 67.1 0.5E-02 12392 1252 469
1000 575.9 0 70.5 0.2E-01 3624 2427 664
From 6 January 1989:
It's not often that I apply what I learned at Berkeley to my daily work,
which primarily involves finding very low-level bugs in hardware and software,
mostly under development by other people.
Since I have to test hardware with a wide performance range, benchmarks
have to be adjustable in size so that they don't run too quickly on fast
hardware to be timed accurately, nor too slowly on slow hardware to finish
before that hardware is obsolete.
So I have added adjustable parameters to a number of benchmarks. To
calibrate them I run a little measurement program derived from the infamous
Linpack benchmark. (When I first came to Sun I was so ignorant that I thought
Linpack was a library of linear algebra subroutines rather than a benchmark
program.)
This little Linpack starts off factoring a 32x32 matrix. Even a Sun-2
can do that in acceptable time. If the time is too fast, it then automatically
tries a larger matrix, up to 512x512. Then it computes what the execution
time would have been for a 512x512 on the particular system, scaling the
time for whatever matrix it settled on by (512/n)**3.
Some Sun hardware currently under development has been getting fast
enough that the calibration program tries 512x512. On both projects in ques-
tion, however, I noticed that the program was getting hung up and taking an
inordinate amount of time to finish the calibration program. This of course
indicated a hardware bug, of which I informed the hardware guys and left them
to find it. Oddly enough, the bug only affected IEEE single precision; double
precision was fine; most of our bugs tend to occur in double precision for
various reasons. Furthermore the bug didn't seem to show up in the 100x100,
300x300, or 1000x1000 single precision Linpack benchmarks that I also run.
In the first project the microcoder dutifully tracked things down with a
logic analyzer to where an underflow was occurring. Now I knew that the Lin-
pack benchmark generates uniform random data over an innocuous interval, and
we all know that matrices composed of random data from a uniform distribution
are remarkably well conditioned. I remember this particularly well because
one of my Berkeley qualifying examination questions - put to me a week ahead
of time to ponder fruitlessly - was to come up with a plausible explanation of
how an eminent mathematician had conducted an empirical investigation of
iterative improvement, studying published test matrices and matrices of
uniformly-distributed random data, and had come to the conclusion that itera-
tive improvement never improved the answer by more than one or two digits,
which seemed to argue against what's routinely taught in elementary numerical
analysis.
Anyway I told the microcoder that there was still a hardware bug, but
shortly thereafter something else changed and the calibration benchmark went
back to 256x256 and there never was any other evidence of a single precision
problem, so I quit worrying about it. Underflows, if they could have
occurred, would account for the program apparently hanging up, because the way
Sun obtains IEEE conformance on underflow in systems built upon hardware like
the Weitek 1164/5, is to trap on underflow and recompute the correct result
very slowly in software. Such underflows are supposed to be rare.
Then a totally unrelated but even more critical project ran into a simi-
lar problem. This afternoon I went to a high level crisis meeting attended by
at least two vice presidents, at which I gloomily reported that a new, previ-
ously unknown hardware bug had appeared that affected single precision.
Upon returning from that meeting one of my colleagues, who'd been helping
at the logic analyzer until he got suspicious, informed me that he thought
there was no hardware or compiler bug, rather that the program had started to
underflow because the pivots had gotten too small and fallen off the end of
single precision. What about the well-known wonderful condition of matrices
of uniform random data, I asked? He suspected that the high dimensionality
(512x512) was just too much for single precision. I wondered why the
1000x1000 benchmark worked in single precision.
Since no progress was being made on the hardware bug, I started printing
out the pivots in the program. They started out as normal numbers like 1 or
-10, then suddenly dropped to about 1e-7, then later to 1e-14, and then:
k 82 pivot -1.8666e-20
k 83 pivot -2.96595e-14
k 84 pivot 2.46156e-14
k 85 pivot 2.40541e-14
k 86 pivot -4.99053e-14
k 87 pivot 1.7579e-14
k 88 pivot 1.69295e-14
k 89 pivot -1.56396e-14
k 90 pivot 1.37869e-14
k 91 pivot -3.10221e-14
k 92 pivot 2.35206e-14
k 93 pivot 1.32175e-14
k 94 pivot -7.77593e-15
k 95 pivot 1.34815e-14
k 96 pivot -1.02589e-21
k 97 pivot 4.27131e-22
k 98 pivot 1.22101e-21
k 99 pivot -7.12407e-22
k 100 pivot -1.75579e-21
k 101 pivot 3.13343e-21
k 102 pivot -6.99946e-22
k 103 pivot 3.82048e-22
k 104 pivot 8.05538e-22
k 105 pivot -1.18164e-21
k 106 pivot -6.349e-22
k 107 pivot -2.48245e-21
k 108 pivot -8.89452e-22
k 109 pivot -8.23235e-22
k 110 pivot 4.40549e-21
k 111 pivot 1.12387e-21
k 112 pivot -4.78853e-22
k 113 pivot 4.38739e-22
k 114 pivot 7.3868e-28
SIGFPE 8: numerical exception, CHK, or TRAP
stopped at daxpy+0x18c: movl a4a(0xe10),a3a
-
Those sudden drops were certainly perplexing - almost full word width, as if
the matrix were actually exactly singular. Could there be a bug in the matrix
generator routine (matgen) causing some wild data to be thrown into the pot?
I looked and found a perfectly conventional portable linear congruential ran-
dom number generator...
More information about the Numeric-interest
mailing list