[ale] gcc optimization problem

Tue Mar 5 14:59:16 EST 2002

For what it's worth, your code worked for me when compiled with -O2, 
but I used gcc version 2.95.3.  The disassembled object code (included 
below) appears to do what it should.  I presume gcc-2.96.98 is 
generating different code for you.
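
One guess at the cause, which nothing in this thread confirms: the function reads and writes a Float64 through Uint16 pointers, and C's aliasing rules do not allow that. A compiler that optimizes on those rules (gcc's -fstrict-aliasing) may decide the stores through `out` cannot affect `output` and drop them. Below is a minimal sketch that does the same bit manipulation but moves the words with memcpy instead of cast pointers. The typedefs and the name readVaxFloatCopy are my assumptions (the post does not show its typedefs), and the word order assumes a little-endian host such as i386.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed typedefs; the original post does not show them. */
typedef double   Float64;
typedef uint16_t Uint16;

/* Same arithmetic as the posted readVaxFloat, but the 16-bit words are
   copied in and out with memcpy, so no object is accessed through a
   pointer of the wrong type.  Assumes a little-endian host. */
Float64 readVaxFloatCopy(const Float64 *input)
{
  Uint16 in[4], out[4];
  Float64 output;

  /* The word shuffling below only makes sense for an 8-byte double. */
  assert(sizeof(Float64) == sizeof in);

  memcpy(in, input, sizeof in);

  out[3]  = in[0] & 0x8000;                          /* sign bit        */
  out[3] |= (((in[0] & 0x7F80) >> 7) + 894) << 4;    /* rebias exponent */
  out[3] |= (in[0] & 0x007F) >> 3;                   /* top 4 fraction bits */
  /* Operands promote to int before shifting; the cast keeps the low 16 bits. */
  out[2]  = (Uint16)((in[0] << 13) | (in[1] >> 3));
  out[1]  = (Uint16)((in[1] << 13) | (in[2] >> 3));
  out[0]  = (Uint16)((in[2] << 13) | (in[3] >> 3));

  memcpy(&output, out, sizeof output);
  return output;
}
```

A quick way to test the guess without touching the code is to rebuild the original with `gcc -O2 -fno-strict-aliasing` and see whether the function starts returning sensible values.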

Incidentally, shouldn't your function handle the special case of 0.0 
on input?  I believe it presently translates 0.0 to 1.5e-39 
(= 2^-129), because a zero exponent field still gets 894 added to it 
and 894 - 1023 = -129.
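
To make that concrete, here is a small sketch of the high-word computation on its own, plus a guard that could be checked before converting. The helper names ieeeHighWord and isVaxZero are hypothetical, and the Uint16 typedef is an assumption. The guard relies on the VAX convention that a sign bit of 0 with an exponent field of 0 means zero, whatever the fraction bits hold.

```c
#include <stdint.h>

typedef uint16_t Uint16;  /* assumed typedef, not shown in the post */

/* High 16 bits of the IEEE double, computed from VAX word 0 exactly
   as the posted function does it. */
Uint16 ieeeHighWord(Uint16 w0)
{
  Uint16 hi = w0 & 0x8000;                    /* sign bit */
  hi |= (((w0 & 0x7F80) >> 7) + 894) << 4;    /* exponent field + 894 */
  hi |= (w0 & 0x007F) >> 3;                   /* top 4 fraction bits */
  return hi;
}

/* Hypothetical guard: nonzero when VAX word 0 encodes zero
   (sign bit 0 and exponent field 0). */
int isVaxZero(Uint16 w0)
{
  return (w0 & 0xFF80) == 0;
}
```

For an all-zero input, ieeeHighWord(0) gives 0x37E0, which is the same constant that shows up in the `add $0x37e0,%eax` instruction in the disassembly. That is an IEEE exponent field of 894, hence 2^-129 rather than 0.0. Returning 0.0 early when isVaxZero is true would avoid that.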

--Joe

readFaxFloat.o:     file format elf32-i386

Disassembly of section .text:

00000000 <readVaxFloat>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 10                sub    $0x10,%esp
   6:   56                      push   %esi
   7:   53                      push   %ebx
   8:   8b 5d 08                mov    0x8(%ebp),%ebx
   b:   0f b7 0b                movzwl (%ebx),%ecx
   e:   89 c8                   mov    %ecx,%eax
  10:   25 80 7f 00 00          and    $0x7f80,%eax
  15:   89 ca                   mov    %ecx,%edx
  17:   66 c1 e8 03             shr    $0x3,%ax
  1b:   05 e0 37 00 00          add    $0x37e0,%eax
  20:   81 e2 00 80 ff ff       and    $0xffff8000,%edx
  26:   09 c2                   or     %eax,%edx
  28:   89 c8                   mov    %ecx,%eax
  2a:   83 e0 7f                and    $0x7f,%eax
  2d:   8d 75 f8                lea    0xfffffff8(%ebp),%esi
  30:   66 c1 e8 03             shr    $0x3,%ax
  34:   09 c2                   or     %eax,%edx
  36:   66 89 56 06             mov    %dx,0x6(%esi)
  3a:   0f b7 53 02             movzwl 0x2(%ebx),%edx
  3e:   89 d0                   mov    %edx,%eax
  40:   66 c1 e8 03             shr    $0x3,%ax
  44:   c1 e1 0d                shl    $0xd,%ecx
  47:   09 c1                   or     %eax,%ecx
  49:   66 89 4e 04             mov    %cx,0x4(%esi)
  4d:   0f b7 4b 04             movzwl 0x4(%ebx),%ecx
  51:   89 c8                   mov    %ecx,%eax
  53:   66 c1 e8 03             shr    $0x3,%ax
  57:   c1 e2 0d                shl    $0xd,%edx
  5a:   09 c2                   or     %eax,%edx
  5c:   66 89 56 02             mov    %dx,0x2(%esi)
  60:   0f b7 43 06             movzwl 0x6(%ebx),%eax
  64:   c1 e1 0d                shl    $0xd,%ecx
  67:   66 c1 e8 03             shr    $0x3,%ax
  6b:   09 c1                   or     %eax,%ecx
  6d:   66 89 4d f8             mov    %cx,0xfffffff8(%ebp)
  71:   5b                      pop    %ebx
  72:   dd 45 f8                fldl   0xfffffff8(%ebp)
  75:   5e                      pop    %esi
  76:   c9                      leave  
  77:   c3                      ret    

-----Original Message-----
From:	D. Alan Stewart [SMTP:astewart at layton-graphics.com]
Sent:	Monday, March 04, 2002 5:57 PM
To:	ale at ale.org
Subject:	[ale] gcc optimization problem

For some reason when I compile with the -O2 option, this function always 
returns 0.0, unless I insert some printf's, in which case it behaves normally. 
Does anyone have enough experience with gcc optimizations to guess as to 
why? (The function bit twiddles a VAX D floating point number into Intel IEEE 
double precision format.)

Float64 readVaxFloat(Float64 *input)
{
  Float64 output;

  Uint16 *in = (Uint16*) input;
  Uint16 *out = (Uint16*) &output;

  out[3] = in[0] & 0x8000;
  out[3] |= (((in[0] & 0x7F80) >> 7) + 894) << 4;
  out[3] |= (in[0] & 0x007F) >> 3;
  out[2] = (in[0] << 13) | (in[1] >> 3);
  out[1] = (in[1] << 13) | (in[2] >> 3);
  out[0] = (in[2] << 13) | (in[3] >> 3);

  return output;
}
