Notes from meeting 11 Jan 1991

Sun Jan 20 08:34:37 PST 1991

Dave Hough writes:

	Three operand addition:
	Most people know by now that IBM RISC System/6000 has a multiply-add
	that incurs only one rounding error.  Perhaps inspired by that, some other
	people are looking at three-operand add with just one rounding error.
	This would have application to complex multiply-add
	and to fabricating doubled-precision arithmetic based on the paradigm

		z = x + y
		w = (z - x) - y
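The paradigm above can be tried out in software. What follows is only a
minimal Python sketch, not anything presented at the meeting. The helper
add3 is hypothetical: it emulates a three-operand add with one rounding
error by forming the sum exactly with fractions.Fraction and rounding once
on conversion back to a double. The second function uses ordinary adds, so
w picks up two roundings and can lose the low-order part entirely.

```python
from fractions import Fraction

def add3(a, b, c):
    """Hypothetical single-rounding three-operand add: a + b + c is formed
    exactly as a rational number and rounded to a double only once."""
    return float(Fraction(a) + Fraction(b) + Fraction(c))

def split_sum(x, y):
    """Apply z = x + y, w = (z - x) - y, computing w with one rounding."""
    z = x + y              # ordinary add, one rounding error
    w = add3(z, -x, -y)    # (z - x) - y with a single rounding
    return z, w

def split_sum_two_roundings(x, y):
    """Same paradigm with ordinary adds, so w incurs two roundings."""
    z = x + y
    w = (z - x) - y
    return z, w

if __name__ == "__main__":
    x, y = 1e-17, 1.0
    print(split_sum(x, y))                # (1.0, -1e-17)
    print(split_sum_two_roundings(x, y))  # (1.0, 0.0)
```

With the single-rounding version, w is the negated rounding error of
x + y, so z - w recovers the exact sum (barring overflow) regardless of
the relative magnitudes of x and y. With ordinary adds, that only holds
when the operands are suitably ordered.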

	Thinking about it for an architecture like SPARC with a maximum of three
	source/destination register specifiers, 
	an orthogonal set of such operations might include

		z = z + (x + y)
		z = z + (x - y)
		z = z - (x + y)
		z = (z + x) + y
		z = (z + x) - y
		z = (z - x) + y
		z = (z - x) - y

	That's beginning to add up to a lot of op codes, in addition to the
	interesting hardware question of how to efficiently implement a three-
	operand add.

+++++++++++++++++++++++++++++++++++++++++

Well, it may be a lot of opcodes if you are trying to add them to an
existing architecture. (The following is a little sketchy, since it has
been about 2 years since I worked on the RS/6000 and I don't have any
detailed documentation at hand.) The RS/6000, however, has only one
floating-point operation, and it's "multiply-add". All the other
operations are simply special cases of multiply-add. (Divide is the
exception: its implementation is akin to microcode using multiply-adds.)
So it may take only 1 or 2 bits in the opcode to identify an fp operation.
Then there are opcode bits that allow any of the 3 input operands to be
negated. This gives an orthogonal set of multiply-add operations,
analogous to the add forms above. Subtract is just an add with the second
operand negated.
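To make the idea concrete, here is a rough Python sketch of a single
multiply-add primitive with one negate flag per input operand. The flag
names are made up for illustration and do not reproduce the actual RS/6000
encoding. Python rounds the product and the sum separately, so this is an
unfused stand-in and does not model the single rounding of the hardware.

```python
def multiply_add(a, b, c, neg_a=False, neg_b=False, neg_c=False):
    """One primitive, a * b + c, with a hypothetical negate flag per input.
    Unfused stand-in: the product and the sum are rounded separately."""
    if neg_a:
        a = -a
    if neg_b:
        b = -b
    if neg_c:
        c = -c
    return a * b + c

if __name__ == "__main__":
    print(multiply_add(2.0, 3.0, 1.0))              # 7.0
    print(multiply_add(2.0, 3.0, 1.0, neg_c=True))  # 5.0
    print(multiply_add(2.0, 3.0, 1.0, neg_a=True))  # -5.0
```

The signed variants (multiply-subtract, negated multiply-add, and so on)
fall out of the flag settings, with no separate opcode for each.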

Also, I believe some instruction bits indicate when to replace one of the
input operands with a "canned value". Thus an add a + b becomes a
multiply-add of a * 1.0 + b, and a * b becomes a * b + 0.0. Negate x
becomes -x * 1.0 + 0.0. Now I suppose you could call some of these bits
the opcode bits for add, subtract, negate, etc., but they are really more
general and are easy to decode and implement in hardware. So with this
sort of scheme everything is a multiply-add, and the instruction decode
bits are used to select negation of input operands and dummy input
operands. (If I remember correctly, there are some minor problems that
have to do with getting correctly signed zero results with this scheme. I
think these were solved with extra logic of some sort.)
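The canned-value scheme and the signed-zero wrinkle can be illustrated
with a short Python sketch. All the function names here are invented for
the example, and fma_unfused is an ordinary multiply followed by an add
(two roundings), standing in for the hardware multiply-add. The optional
zero parameter shows one conceivable remedy, a canned -0.0 addend; that is
an assumption for illustration and not a claim about what the RS/6000's
extra logic actually did.

```python
import math

def fma_unfused(a, b, c):
    """Unfused stand-in for multiply-add (two roundings, not one)."""
    return a * b + c

def add(a, b):
    return fma_unfused(a, 1.0, b)       # a * 1.0 + b

def mul(a, b, zero=0.0):
    return fma_unfused(a, b, zero)      # a * b + 0.0

def neg(x, zero=0.0):
    return fma_unfused(-x, 1.0, zero)   # -x * 1.0 + 0.0

def sign_of(v):
    """Return 1.0 or -1.0 according to the sign bit, including for zeros."""
    return math.copysign(1.0, v)

if __name__ == "__main__":
    # -1.0 * 0.0 should be -0.0, but (-0.0) + (+0.0) is +0.0.
    print(sign_of(mul(-1.0, 0.0)))             # 1.0
    # Negating +0.0 should give -0.0, but the canned +0.0 addend wins.
    print(sign_of(neg(0.0)))                   # 1.0
    # With a canned -0.0 addend both cases come out with the right sign.
    print(sign_of(mul(-1.0, 0.0, zero=-0.0)))  # -1.0
    print(sign_of(neg(0.0, zero=-0.0)))        # -1.0
```

The -0.0 trick relies on the default round-to-nearest mode, where
(+0.0) + (-0.0) is +0.0 and (-0.0) + (-0.0) is -0.0. Nonzero results are
unaffected by the sign of the zero addend.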

Therefore, I don't believe the RS/6000 uses any more bits for instruction
decode of floating-point operations than do other existing architectures.

						-- joel boney