Go to the first, previous, next, last section, table of contents.
This chapter describes some tips and tricks for debugging numerical programs which use GSL.
Any errors reported by the library are passed to the function
gsl_error
. By running your programs under gdb and setting a
breakpoint in this function you can automatically catch any library
errors. You can add a breakpoint for every session by putting
break gsl_error
into your '.gdbinit' file in the directory where your program is started.
If the breakpoint catches an error then you can use a backtrace
(bt
) to see the call-tree, and the arguments which possibly
caused the error. By moving up into the calling function you can
investigate the values of variables at that point. Here is an example
from the program fft/test_trap
, which contains the following
line,
status = gsl_fft_complex_wavetable_alloc (0, &complex_wavetable);
The function gsl_fft_complex_wavetable_alloc
takes the length of
an FFT as its first argument. When this line is executed an error will
be generated because the length of an FFT is not allowed to be zero.
To debug this problem we start gdb
, using the file
'.gdbinit' to define a breakpoint in gsl_error
,
bash$ gdb test_trap GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.16 (i586-debian-linux), Copyright 1996 Free Software Foundation, Inc. Breakpoint 1 at 0x8050b1e: file error.c, line 14.
When we run the program this breakpoint catches the error and shows the reason for it.
(gdb) run Starting program: test_trap Breakpoint 1, gsl_error (reason=0x8052b0d "length n must be positive integer", file=0x8052b04 "c_init.c", line=108, gsl_errno=1) at error.c:14 14 if (gsl_error_handler)
The first argument of gsl_error
is always a string describing the
error. Now we can look at the backtrace to see what caused the problem,
(gdb) bt #0 gsl_error (reason=0x8052b0d "length n must be positive integer", file=0x8052b04 "c_init.c", line=108, gsl_errno=1) at error.c:14 #1 0x8049376 in gsl_fft_complex_wavetable_alloc (n=0, wavetable=0xbffff778) at c_init.c:108 #2 0x8048a00 in main (argc=1, argv=0xbffff9bc) at test_trap.c:94 #3 0x80488be in ___crt_dummy__ ()
We can see that the error was generated in the function
gsl_fft_complex_wavetable_alloc
when it was called with an
argument of n=0. The original call came from line 94 in the
file 'test_trap.c'.
By moving up to the level of the original call we can find the line that caused the error,
(gdb) up #1 0x8049376 in gsl_fft_complex_wavetable_alloc (n=0, wavetable=0xbffff778) at c_init.c:108 108 GSL_ERROR ("length n must be positive integer", GSL_EDOM); (gdb) up #2 0x8048a00 in main (argc=1, argv=0xbffff9bc) at test_trap.c:94 94 status = gsl_fft_complex_wavetable_alloc (0, &complex_wavetable);
Thus we have found the line that caused the problem. From this point we
could also print out the values of other variables such as
complex_wavetable
.
The contents of floating point registers can be examined using the
command info float
(on supported platforms).
(gdb) info float st0: 0xc4018b895aa17a945000 Valid Normal -7.838871e+308 st1: 0x3ff9ea3f50e4d7275000 Valid Normal 0.0285946 st2: 0x3fe790c64ce27dad4800 Valid Normal 6.7415931e-08 st3: 0x3ffaa3ef0df6607d7800 Spec Normal 0.0400229 st4: 0x3c028000000000000000 Valid Normal 4.4501477e-308 st5: 0x3ffef5412c22219d9000 Zero Normal 0.9580257 st6: 0x3fff8000000000000000 Valid Normal 1 st7: 0xc4028b65a1f6d243c800 Valid Normal -1.566206e+309 fctrl: 0x0272 53 bit; NEAR; mask DENOR UNDER LOS; fstat: 0xb9ba flags 0001; top 7; excep DENOR OVERF UNDER LOS ftag: 0x3fff fip: 0x08048b5c fcs: 0x051a0023 fopoff: 0x08086820 fopsel: 0x002b
Individual registers can be examined using the variables $reg, where reg is the register name.
(gdb) p $st1 $1 = 0.02859464454261210347719
It is possible to stop the program whenever a SIGFPE
floating
point exception occurs. This can be useful for finding the cause of an
unexpected infinity or NaN
. The current handler settings can be
shown with the command info signal SIGFPE
.
(gdb) info signal SIGFPE Signal Stop Print Pass to program Description SIGFPE Yes Yes Yes Arithmetic exception
Unless the program uses a signal handler the default setting should be
changed so that SIGFPE is not passed to the program, as this would cause
it to exit. The command handle SIGFPE stop nopass
prevents this.
(gdb) handle SIGFPE stop nopass Signal Stop Print Pass to program Description SIGFPE Yes Yes No Arithmetic exception
Depending on the platform it may be necessary to instruct the kernel to
generate signals for floating point exceptions. For programs using GSL
this can be achieved using the GSL_IEEE_MODE
environment variable
in conjunction with the function gsl_ieee_env_setup()
as described
in see section IEEE floating-point arithmetic.
(gdb) set env GSL_IEEE_MODE=double-precision
Writing reliable numerical programs in C requires great care. The following GCC warning options are recommended when compiling numerical programs:
gcc -ansi -pedantic -Werror -Wall -W -Wmissing-prototypes -Wstrict-prototypes -Wtraditional -Wconversion -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings -Wnested-externs -fshort-enums -fno-common -Dinline= -g -O4
For details of each option consult the manual Using and Porting GCC. The following table gives a brief explanation of what types of errors these options catch.
-ansi -pedantic
-Werror
-Wall
-Wall
, but it is not enough on its own.
-O4
-Wall
rely on the optimizer to analyze the code. If there is no
optimization then these warnings aren't generated.
-W
-Wall
, such as
missing return values and comparisons between signed and unsigned
integers.
-Wmissing-prototypes -Wstrict-prototypes
-Wtraditional
-Wconversion
unsigned int x = -1
. If you need
to perform such a conversion you can use an explicit cast.
-Wshadow
-Wpointer-arith -Wcast-qual -Wcast-align
void
, if you remove a const
cast from a pointer, or if you cast a pointer to a type which has a
different size, causing an invalid alignment.
-Wwrite-strings
const
qualifier so that it
will be a compile-time error to attempt to overwrite them.
-fshort-enums
enum
as short as possible. Normally
this makes an enum
different from an int
. Consequently any
attempts to assign a pointer-to-int to a pointer-to-enum will generate a
cast-alignment warning.
-fno-common
extern
declaration.
-Wnested-externs
extern
declaration is encountered within a
function.
-Dinline=
inline
keyword is not part of ANSI C. Thus if you want to use
-ansi
with a program which uses inline functions you can use this
preprocessor definition to remove the inline
keywords.
-g
gdb
. The only effect of debugging symbols
is to increase the size of the file, and you can use the strip
command to remove them later if necessary.
The following books are essential reading for anyone writing and debugging numerical programs with GCC and GDB.
Go to the first, previous, next, last section, table of contents.