[Cfp-interest] exceptions and flags

Fri Jun 6 20:25:00 PDT 2014

1.  There is a FEX (Floating-Point Enabled Exception Summary) status bit in
the PowerPC FPSCR (Floating-Point Status and Control Register) that is the
OR of all enabled exception status bits.
2.  Every floating-point instruction has a control bit that slightly
changes its name and semantics.  For example, the "fadd" (Floating-Point
Add Double Precision) instruction has a corresponding "fadd." form (with a
period on the end) that does the same thing as fadd and also copies a
4-bit portion of the FPSCR to Condition Register Field 1 (cr1).  The FEX
bit is the second bit of that field.
3.  A standard branch on true or branch on false instruction can test any
bit of the 32-bit Condition Register and branch accordingly.
4.  When precise detection of floating-point exceptions is enabled, the
compiler generates the "." version of all floating-point instructions, and
also generates the "." version of a floating-point register copy
instruction after each call.  That ensures that FEX is copied to cr1 after
every operation and every call.
5.  When imprecise detection is enabled for smaller, faster code, the normal
floating-point instructions are generated, but the "." version of a
floating-point register copy instruction is generated before the function
returns.  That ensures that FEX is copied to cr1 once before the return,
instead of after every floating-point operation.
6.  In both precise (#4) and imprecise (#5) cases, the compiler generates a
branch on false around a trap instruction after each of the "."
instructions.
     If there was no enabled exception, the cr1 copy of the FEX bit is
false, the branch on false around the trap is taken, and the program
continues.
     If there was an enabled exception, the cr1 copy of the FEX bit is
true, the branch on false around the trap is not taken, and the trap is
executed.
     The conditional branch should be predicted as taken, so the next
instruction after the trap can be fetched and dispatched without waiting.
7.  If it is executed, the trap causes a trap interrupt, and the operating
system activates the trap signal handler if there is one.
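
As a rough illustration of items 1 and 2, the following C sketch models how
FEX is derived and what ends up in cr1.  It is a toy model only: it works on
an ordinary integer, not the real FPSCR, and the names FPSCR_BIT,
fpscr_with_fex, cr1_from_fpscr and cr1_fex are invented for this example.
The bit positions assumed are those of the 32-bit FPSCR in the Power ISA,
where bit 0 is the most significant bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: an ordinary integer stands in for the FPSCR.
   Bit 0 is the most significant bit, as in the Power ISA numbering. */
#define FPSCR_BIT(n) (UINT32_C(1) << (31 - (n)))

enum {
    BIT_FX = 0, BIT_FEX = 1, BIT_VX = 2, BIT_OX = 3,  /* copied to cr1 */
    BIT_UX = 4, BIT_ZX = 5, BIT_XX = 6,               /* other status bits */
    BIT_VE = 24, BIT_OE = 25, BIT_UE = 26,            /* enable bits */
    BIT_ZE = 27, BIT_XE = 28
};

/* Item 1: FEX is the OR of every (status AND enable) pair. */
uint32_t fpscr_with_fex(uint32_t fpscr)
{
    static const int pairs[5][2] = {
        {BIT_VX, BIT_VE}, {BIT_OX, BIT_OE}, {BIT_UX, BIT_UE},
        {BIT_ZX, BIT_ZE}, {BIT_XX, BIT_XE}
    };
    bool fex = false;
    for (int i = 0; i < 5; i++) {
        if ((fpscr & FPSCR_BIT(pairs[i][0])) &&
            (fpscr & FPSCR_BIT(pairs[i][1])))
            fex = true;
    }
    fpscr &= ~FPSCR_BIT(BIT_FEX);
    if (fex)
        fpscr |= FPSCR_BIT(BIT_FEX);
    return fpscr;
}

/* Item 2: a "." instruction copies FPSCR bits 0..3 to cr1. */
unsigned cr1_from_fpscr(uint32_t fpscr)
{
    return (unsigned)(fpscr >> 28);
}

/* FEX is the second of the four cr1 bits. */
bool cr1_fex(unsigned cr1)
{
    return (cr1 >> 2) & 1u;
}
```

In this model an overflow status bit without its enable bit leaves FEX clear,
while the same status bit with the enable bit set makes cr1_fex() true.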

The code pattern for precise detection is
	add.   fprR=fprA,fprB
	bf        cr1,FEXbit,*+8
	trap
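
For readers who think in C rather than assembler, the branch around the trap
and the hand-off to a signal handler (items 6 and 7) can be mimicked roughly
as follows.  This is a sketch under assumptions: it presumes a POSIX system
where SIGTRAP exists, it uses raise() as a stand-in for the trap instruction,
and the names on_trap, check_fex_or_trap and run_checked are hypothetical.

```c
#include <signal.h>
#include <stdbool.h>

/* Written by the handler; volatile sig_atomic_t is the object type the C
   standard guarantees a signal handler may safely write. */
static volatile sig_atomic_t trap_seen = 0;

static void on_trap(int signo)
{
    (void)signo;
    trap_seen = 1;
}

/* Stand-in for the "bf ... ; trap" pair: skip the trap when FEX is false. */
void check_fex_or_trap(bool fex)
{
    if (!fex)
        return;         /* branch taken: no enabled exception occurred */
    raise(SIGTRAP);     /* assumed stand-in for the trap instruction */
}

/* Install the handler, run one check, and report whether the handler ran. */
bool run_checked(bool fex)
{
    trap_seen = 0;
    signal(SIGTRAP, on_trap);
    check_fex_or_trap(fex);
    return trap_seen != 0;
}
```

run_checked(false) returns without the handler running; run_checked(true)
raises the signal and the handler records it before raise() returns.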

For catching an exception in an exception try block (or whatever equivalent
we choose), I expect #6 and #7 of the mechanism would be changed:
6.  The compiler would generate a branch on false around a block of
exception decoding code.
      That code is reached only if at least one enabled exception has
occurred.  It would copy exception status bits into one or more condition
register fields, then execute conditional branch on true instructions, each
testing one status bit of interest and, if that bit is set, branching to
the catch block corresponding to it.  For example, if the overflow and
invalid operation exceptions were to be caught, one branch would go to the
overflow catch handler if the overflow status bit was set, and a second
branch would go to the invalid operation catch handler.
      The catch handler reached would handle that exception as requested.
If, for example, overflow occurred, and the overflow catch handler said to
substitute a value (e.g., the maximum finite value), the catch code would
have to put the new value into the register or variable the expression
result should be in, then continue execution.
7.  No trap would occur so the trap signal handler would not be involved.

The code pattern for precise detection would be:
	add.   fprR=fprA,fprB
	bt        cr1,FEXbit,catch_decode
	. . .
	sub.   fprX=fprY,fprZ
	bt        cr1,FEXbit,catch_decode
	. . . other operations . . .

catch_decode:
	/* decode exceptions */
	copy fpscr overflow bit to crM
	bt       crM,OVERFLOWbit,catch_overflow
	copy fpscr invalid bit to crN
	bt       crN,INVALIDbit,catch_invalid
	. . .
catch_overflow:
	. . .
catch_invalid:
	. . .

Only the enabled exceptions need tests, and the last test can be omitted:
control can simply fall through into that catch handler, because by
elimination it must be the cause.
Instead of a branch on true to a catch handler, a branch on false around
it could be used.
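
The precise pattern above can be sketched in C roughly as follows.  This is
an illustration under assumptions, not what a compiler would emit: exceptions
are detected by inspecting operands and results instead of reading hardware
flags, and the names ST_OVERFLOW, ST_INVALID, outcome, caught, mul_status and
checked_mul are hypothetical.  The substitute value for the invalid case
(0.0) is likewise just a placeholder.

```c
#include <float.h>
#include <math.h>

/* Hypothetical status bits and result type for this sketch. */
enum { ST_OVERFLOW = 1, ST_INVALID = 2 };
typedef enum { NO_EXCEPTION, CAUGHT_OVERFLOW, CAUGHT_INVALID } outcome;
typedef struct { double value; outcome what; } caught;

/* Multiply and report which exceptions the operation raised. */
static unsigned mul_status(double a, double b, double *r)
{
    unsigned status = 0;
    *r = a * b;
    if (isfinite(a) && isfinite(b) && isinf(*r))
        status |= ST_OVERFLOW;
    if (isnan(*r) && !isnan(a) && !isnan(b))
        status |= ST_INVALID;
    return status;
}

/* Precise detection: test immediately after the one operation. */
caught checked_mul(double a, double b)
{
    caught c = { 0.0, NO_EXCEPTION };
    unsigned status = mul_status(a, b, &c.value);

    if (status == 0)              /* common path: nothing was raised */
        return c;

    /* catch_decode: at least one exception occurred; find out which. */
    if (status & ST_OVERFLOW) {   /* catch_overflow */
        c.what = CAUGHT_OVERFLOW;
        c.value = c.value > 0 ? DBL_MAX : -DBL_MAX;  /* max finite value */
        return c;
    }
    /* The last case needs no test: it must be the invalid operation. */
    c.what = CAUGHT_INVALID;      /* catch_invalid */
    c.value = 0.0;                /* placeholder substitute value */
    return c;
}
```

For example, checked_mul(DBL_MAX, 2.0) takes the overflow path and yields
DBL_MAX, while checked_mul(0.0, INFINITY) falls through to the invalid case.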

The code pattern for imprecise detection would be slightly simpler:
	add    fprR=fprA,fprB
	sub    fprX=fprY,fprZ
	. . . other operations . . .

	/* At end of try block: */
	copy fpscr FEX bit to cr1
	bf       cr1,FEXbit,noexceptions

	/* decode exceptions */
	copy fpscr overflow bit to crM
	bt       crM,OVERFLOWbit,catch_overflow
	copy fpscr invalid bit to crN
	bt       crN,INVALIDbit,catch_invalid
	. . .
catch_overflow:
	. . .
catch_invalid:
	. . .
noexceptions:
	. . . code after the try block . . .
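
A C sketch of the imprecise pattern, for comparison.  Again this is only a
model under assumptions: a plain unsigned word stands in for the sticky
status bits, exceptions are detected by inspecting operands and results, and
the names record, try_block and try_result are hypothetical.  It shows the
trade-off: the flags are tested once at the end, so when several are set only
the first one tested selects a handler, and the handler cannot tell which
operation raised it.

```c
#include <float.h>   /* DBL_MAX, used in the usage note below */
#include <math.h>

enum { ST_OVERFLOW = 1, ST_INVALID = 2 };
typedef enum { NO_EXCEPTION, CAUGHT_OVERFLOW, CAUGHT_INVALID } outcome;
typedef struct { unsigned sticky; outcome what; } try_result;

/* OR the exceptions raised by one operation into the sticky status word. */
static double record(unsigned *sticky, double a, double b, double r)
{
    if (isfinite(a) && isfinite(b) && isinf(r))
        *sticky |= ST_OVERFLOW;
    if (isnan(r) && !isnan(a) && !isnan(b))
        *sticky |= ST_INVALID;
    return r;
}

try_result try_block(double a, double b, double c)
{
    try_result t = { 0, NO_EXCEPTION };

    /* Body of the try block: no checks between operations. */
    double r = record(&t.sticky, a, b, a * b);
    double x = record(&t.sticky, r, c, r - c);
    (void)x;

    /* At end of try block: one test, then decode in a fixed order. */
    if (t.sticky == 0)
        return t;                     /* noexceptions */
    if (t.sticky & ST_OVERFLOW)
        t.what = CAUGHT_OVERFLOW;     /* catch_overflow */
    else
        t.what = CAUGHT_INVALID;      /* catch_invalid */
    return t;
}
```

With try_block(DBL_MAX, 2.0, INFINITY) the multiply overflows and the
subtract is invalid, so both sticky bits are set, but only the overflow
handler is selected because it is tested first.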

Advantages include that the system call to set up the trap handler is
avoided, and if an exception occurs the cost of activating the trap handler
and having it decode the cause is avoided.
Disadvantages include that with precise detection the program code would
run slower due to the branches.

- Ian McIntosh          IBM Canada Lab         Compiler Back End Support and Development

From:    David Hough CFP <pcfp at oakapple.net>
To:      Ian McIntosh/Toronto/IBM at IBMCA
Date:    2014-06-06 06:00 PM
Subject: Re: [Cfp-interest] exceptions and flags

> the PowerPC architecture allows compilers to
> detect any exceptions including underflow by using a slightly different
> opcode then a conditional branch

So this is a branch on exception rather than branch on flag?
Does it refer to the most recent fp op code or is it some kind
of cumulative register like the flags, so the branch is taken if the
exception has arisen since the register was last reset?

I would suppose that, in analogy to comparison condition codes, it
refers to the result of the last fp op code to execute.  If execution is
out of order, does the conditional wait for all pending instructions to
complete?
