[ale] gcc optimization problem
Joe Steele
joe at madewell.com
Tue Mar 5 14:59:16 EST 2002
For what it's worth, compiling your code with -O2 worked for me, but
I used gcc version 2.95.3. The disassembled object code (included
below) appears to do what it should. I presume gcc-2.96.98 is
generating different code for you.
Incidentally, shouldn't your function handle the special case of 0.0
on input? I believe it presently translates 0.0 to 1.5e-39
(= 2^-129).
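
To illustrate the arithmetic behind that figure, here is a minimal sketch. It assumes the layout the posted function implies (exponent in bits 14..7 of the first 16-bit word, rebias constant 894). The function names are hypothetical and not part of the original code.

```c
#include <assert.h>
#include <stdint.h>

/* IEEE exponent field the posted formula produces from the first VAX word. */
unsigned ieee_exponent_field(uint16_t first_word)
{
    return ((first_word & 0x7F80u) >> 7) + 894u;
}

/* Magnitude produced when every input word is zero: 2^(894 - 1023) = 2^-129. */
double magnitude_for_zero_input(void)
{
    int shift = 1023 - (int)ieee_exponent_field(0);  /* 129 halvings */
    double value = 1.0;
    for (int i = 0; i < shift; ++i)
        value *= 0.5;
    assert(value > 0.0);  /* 2^-129 is still a normal double, not underflow */
    return value;
}

/* Hypothetical guard: sign and exponent bits all clear means 0.0 in VAX D. */
int vax_d_is_zero(uint16_t first_word)
{
    return (first_word & 0xFF80u) == 0;
}
```

A conversion routine could call something like vax_d_is_zero() on the first word and return 0.0 early, before rebiasing the exponent.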
--Joe

readFaxFloat.o:     file format elf32-i386

Disassembly of section .text:

00000000 <readVaxFloat>:
   0:  55                    push   %ebp
   1:  89 e5                 mov    %esp,%ebp
   3:  83 ec 10              sub    $0x10,%esp
   6:  56                    push   %esi
   7:  53                    push   %ebx
   8:  8b 5d 08              mov    0x8(%ebp),%ebx
   b:  0f b7 0b              movzwl (%ebx),%ecx
   e:  89 c8                 mov    %ecx,%eax
  10:  25 80 7f 00 00        and    $0x7f80,%eax
  15:  89 ca                 mov    %ecx,%edx
  17:  66 c1 e8 03           shr    $0x3,%ax
  1b:  05 e0 37 00 00        add    $0x37e0,%eax
  20:  81 e2 00 80 ff ff     and    $0xffff8000,%edx
  26:  09 c2                 or     %eax,%edx
  28:  89 c8                 mov    %ecx,%eax
  2a:  83 e0 7f              and    $0x7f,%eax
  2d:  8d 75 f8              lea    0xfffffff8(%ebp),%esi
  30:  66 c1 e8 03           shr    $0x3,%ax
  34:  09 c2                 or     %eax,%edx
  36:  66 89 56 06           mov    %dx,0x6(%esi)
  3a:  0f b7 53 02           movzwl 0x2(%ebx),%edx
  3e:  89 d0                 mov    %edx,%eax
  40:  66 c1 e8 03           shr    $0x3,%ax
  44:  c1 e1 0d              shl    $0xd,%ecx
  47:  09 c1                 or     %eax,%ecx
  49:  66 89 4e 04           mov    %cx,0x4(%esi)
  4d:  0f b7 4b 04           movzwl 0x4(%ebx),%ecx
  51:  89 c8                 mov    %ecx,%eax
  53:  66 c1 e8 03           shr    $0x3,%ax
  57:  c1 e2 0d              shl    $0xd,%edx
  5a:  09 c2                 or     %eax,%edx
  5c:  66 89 56 02           mov    %dx,0x2(%esi)
  60:  0f b7 43 06           movzwl 0x6(%ebx),%eax
  64:  c1 e1 0d              shl    $0xd,%ecx
  67:  66 c1 e8 03           shr    $0x3,%ax
  6b:  09 c1                 or     %eax,%ecx
  6d:  66 89 4d f8           mov    %cx,0xfffffff8(%ebp)
  71:  5b                    pop    %ebx
  72:  dd 45 f8              fldl   0xfffffff8(%ebp)
  75:  5e                    pop    %esi
  76:  c9                    leave
  77:  c3                    ret

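
One detail in the listing that may look unfamiliar is the exponent handling at offsets 10 through 1b: the source shifts right by 7, adds 894, then shifts left by 4, while the object code shifts right by 3 and adds 0x37e0 (894 << 4). The sketch below, with hypothetical function names, checks that the two forms agree for every possible first word. It is only an illustration of why the folded form is equivalent, not something taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Exponent term as written in the C source. */
unsigned exponent_as_written(uint16_t w)
{
    return (((w & 0x7F80u) >> 7) + 894u) << 4;
}

/* Exponent term as it appears in the disassembly: shr $0x3, add $0x37e0. */
unsigned exponent_as_compiled(uint16_t w)
{
    return ((w & 0x7F80u) >> 3) + 0x37E0u;
}

/* Returns 1 if both forms match for all 65536 possible input words. */
int exponent_forms_agree(void)
{
    for (unsigned w = 0; w <= 0xFFFFu; ++w) {
        if (exponent_as_written((uint16_t)w) != exponent_as_compiled((uint16_t)w))
            return 0;
    }
    return 1;
}
```

The forms match because the masked value has its low seven bits clear, so shifting right by 7 and then left by 4 loses nothing compared with a single right shift by 3.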
-----Original Message-----
From: D. Alan Stewart [SMTP:astewart at layton-graphics.com]
Sent: Monday, March 04, 2002 5:57 PM
To: ale at ale.org
Subject: [ale] gcc optimization problem
For some reason when I compile with the -O2 option, this function always
returns 0.0, unless I insert some printf's, in which case it behaves normally.
Does anyone have enough experience with gcc optimizations to guess as to
why? (The function bit twiddles a VAX D floating point number into Intel IEEE
double precision format.)

Float64 readVaxFloat(Float64 *input)
{
    Float64 output;
    Uint16 *in = (Uint16*) input;
    Uint16 *out = (Uint16*) &output;

    out[3] = in[0] & 0x8000;
    out[3] |= (((in[0] & 0x7F80) >> 7) + 894) << 4;
    out[3] |= (in[0] & 0x007F) >> 3;
    out[2] = (in[0] << 13) | (in[1] >> 3);
    out[1] = (in[1] << 13) | (in[2] >> 3);
    out[0] = (in[2] << 13) | (in[3] >> 3);
    return output;
}
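
The thread does not establish why the optimized build returns 0.0. One plausible but unconfirmed explanation is that the function reads and writes a Float64 through Uint16 pointers, which the C aliasing rules do not allow, and GCC's -fstrict-aliasing (enabled at -O2 in many GCC versions) lets the optimizer assume those stores never touch `output`. The sketch below keeps the same bit manipulation but moves the words with memcpy. It assumes a little-endian host with 64-bit IEEE doubles, substitutes `double` and `uint16_t` for the poster's typedefs, and uses a hypothetical function name. Building the original with -fno-strict-aliasing would be one way to test the theory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same conversion as the posted function, without pointer-cast type punning.
   Assumes a little-endian host with 64-bit IEEE doubles (as on i386). */
double read_vax_float_memcpy(const void *input)
{
    uint16_t in[4], out[4];
    double output;

    assert(sizeof output == sizeof out);
    memcpy(in, input, sizeof in);

    /* Sign and exponent both zero encodes 0.0 in VAX D format. */
    if ((in[0] & 0xFF80) == 0)
        return 0.0;

    out[3]  = in[0] & 0x8000;                          /* sign bit */
    out[3] |= (((in[0] & 0x7F80) >> 7) + 894) << 4;    /* rebias exponent */
    out[3] |= (in[0] & 0x007F) >> 3;                   /* top fraction bits */
    out[2]  = (uint16_t)((in[0] << 13) | (in[1] >> 3));
    out[1]  = (uint16_t)((in[1] << 13) | (in[2] >> 3));
    out[0]  = (uint16_t)((in[2] << 13) | (in[3] >> 3));

    memcpy(&output, out, sizeof output);
    return output;
}
```

As a quick check, the VAX D encoding of 1.0 has first word 0x4080 and the remaining words zero, which this sketch maps to the IEEE pattern 0x3FF0000000000000.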