May 12, 2014 - Problems Encountered with the Optimizer
My current project is nearing completion. I have been testing it both with no
optimization and with a high level of optimization in GCC. I have also tried a
few other optimization flags at one time or another, but some of them still do
not work, because I don't care about those levels enough to fix the problems
they exposed. I've been learning a lot about what the optimizer does, since
each level of optimization creates its own set of problems that I have to
figure out. It does a lot of things I didn't expect.
First of all, a lot of weird things happen in the math functions. I originally
implemented only sin, cos, and atan2 because those are the only functions I
call in my C code. This works fine with no optimization, but GCC's optimizer
decides to change these calls. At -O2 and -O3, GCC combines sin and cos calls
on the same argument into one sincos call. This is great and all, except when
I haven't implemented sincos. However, implementing sincos is about as easy as
implementing sin and cos separately was (actually, easier). Then at -Ofast, GCC
changes atan2 to __atan2_finite, which I am not going to bother with right now.
It seems a little funny to me that GCC makes assumptions about your math
library. sincos is not so bad, because that's kind of a standard function, but
__atan2_finite not so much. I didn't do the research, but that looks like a
glibc-specific function, so maybe -Ofast could break builds with non-glibc math
libraries.
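As a rough illustration of why sincos is cheap to add once sin and cos exist, here is a minimal sketch. It is not the code from this project: the name my_sincos, the crude range reduction, and the fixed-length Taylor series are all assumptions made for the example. Only the signature is taken from the GNU sincos(double, double *, double *) extension. The point is that both results share one range reduction and one x*x.

```c
/* Hypothetical freestanding sincos sketch; my_sincos and MY_PI are
 * illustrative names, not taken from the project described above. */
#define MY_PI 3.14159265358979323846

void my_sincos(double x, double *s, double *c)
{
    /* Crude range reduction to [-pi, pi]. Fine for small game-sized
     * angles, but it loops forever on infinite input and loses
     * precision on very large ones. */
    while (x > MY_PI)
        x -= 2.0 * MY_PI;
    while (x < -MY_PI)
        x += 2.0 * MY_PI;

    double x2 = x * x;      /* shared by both series */
    double sin_term = x;    /* current term x^(2k+1)/(2k+1)! with sign */
    double cos_term = 1.0;  /* current term x^(2k)/(2k)! with sign */
    double sin_sum = 0.0;
    double cos_sum = 0.0;

    /* 14 Taylor terms each; one loop produces both results. */
    for (int k = 0; k < 14; k++) {
        sin_sum += sin_term;
        cos_sum += cos_term;
        cos_term *= -x2 / ((2 * k + 1) * (2 * k + 2));
        sin_term *= -x2 / ((2 * k + 2) * (2 * k + 3));
    }

    *s = sin_sum;
    *c = cos_sum;
}
```

In a freestanding math library the function would be exported under the name sincos so that the call GCC emits resolves at link time. Writing it as a plain call to sin followed by a call to cos is risky at -O2, since GCC may fold that pair right back into a call to sincos.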
However, the much more troubling issue occurred this afternoon. I was just
polishing up "everything mode" which gives you waves of different enemies each
time, when I noticed that asteroids mode was now causing a segfault within a
few seconds of starting. A weird regression since I hadn't changed that code
much for a long while.
So I ran it through a debugger, and this is what the debugger told me:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000406ed1 in draw_circle (pixel=16777215, radius=20, cy=200, cx=500) at fb.c:264
264 if (cx-radius<=1 || cx+radius>=vinfo.xres)
Which, of course, made me really confused. I spent hours trying to figure it
out, including some futile programming by permutation. It was crazy that the
circle drawing code worked perfectly everywhere except in the asteroids drawing
code. At one point I was getting ready to report the problem to the GCC
maintainers as a bug (but I didn't do it). Eventually I decided to take a look
at the assembly code in the debugger. This is where the segfault was occurring:
8206 movapd %xmm9, 32(%rsp)
It took some time for the answer to occur to me, but I did eventually figure it
out, and then I realized how deep this whole thing went. Turns out "movapd" is
one of them SSE2 instructions, one that faults if its memory operand is not
16-byte aligned. Somehow, every other call to that function happened to run
with a properly aligned stack, except when an asteroid wanted to be drawn.
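The alignment rule can be checked with a little arithmetic. The sketch below is only an illustration: the helper names and the stack pointer values in the usage note are made up, not read from the actual crash. It tests whether the address behind an operand like 32(%rsp) is a multiple of 16, which is what movapd requires.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when addr is a multiple of 16,
 * 0 otherwise. A multiple of 16 has its low four bits clear. */
int is_aligned16(uint64_t addr)
{
    return (addr & 0xF) == 0;
}

/* Hypothetical helper: the address touched by an operand written as
 * disp(%rsp), for a given value of the stack pointer. */
uint64_t stack_operand(uint64_t rsp, uint64_t disp)
{
    return rsp + disp;
}
```

With an assumed stack pointer of 0x7fffffffe0d0, the operand 32(%rsp) lands on 0x7fffffffe0f0 and the check passes. With the stack pointer off by 8, at 0x7fffffffe0d8, the same operand lands on 0x7fffffffe0f8 and the check fails. Because 32 is itself a multiple of 16, the operand is aligned exactly when %rsp is.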
Now, of course this is my own fault for not aligning the stack properly, but it
shows that the GCC optimizer makes even more assumptions. I guess that's not
really a problem, since the optimizer does have to make some assumptions, but
it sure took me a lot longer to fix than it needed to, especially considering
that the fix is one instruction in _start in crt.s. Given the GDB output, it's
really not obvious that the actual problem is in the startup code and nowhere
near the actual segfault.
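The post does not say which instruction went into _start. A common choice on x86-64 is "andq $-16, %rsp", so that is the assumption in this sketch, which models the masking in C rather than showing the real crt.s. The function name and the addresses in the usage note are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of "andq $-16, %rsp": clear the low four bits so
 * the result is the nearest multiple of 16 at or below sp. As a 64-bit
 * two's complement value, -16 is the same bit pattern as ~15. */
uint64_t align_down16(uint64_t sp)
{
    return sp & ~UINT64_C(15);
}
```

For example, an assumed stack pointer of 0x7fffffffe0d8 becomes 0x7fffffffe0d0, and one that is already aligned is left alone. Rounding down is the safe direction because the stack grows toward lower addresses, so the only cost is a few unused bytes.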
Oh well, gotta learn somehow. That's all for now. Align your stack!