Data aborts

Here is a small example of how to debug a program that fails with a data abort (invalid memory reference). It's based on a question and example by John in this thread.

John posted the following data abort he got from the dropbear program:

 Data abort: pc=0x08246ce4
             lr=0x08243dfc
             sp=0x0237fc80
 Possible stack trace:
 01: 08243dfc
 02: 08249ce4
 03: 0824c0fc
 04: 0824bf00
 05: 0825a668
 06: 0825a640
 07: 082589b8
 08: 08262680
 09: 08262680
 10: 082561d8
 11: 08262688
 12: 0824008c
 13: 08240088
 14: 08262688

His question was:

 "Is there a dummies guide to getting something useful out of these numbers?"

There wasn't at the time of writing, so I made one:

Location of the crash

Let's consider this section first:

 Data abort: pc=0x08246ce4
             lr=0x08243dfc
             sp=0x0237fc80

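The symbols mean: pc is the program counter, the address of the instruction that was executing when the abort occurred; lr is the link register, which holds the return address of the most recent function call; sp is the stack pointer, the current top of the stack.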

How to make sense of the numbers

Since binaries run under DSLinux are relocated at run time, you need to know the address at which the binary was loaded in memory to make sense of any of these numbers.

Run this command on the cross-compilation host to enable flat binary format verbosity (a.k.a "kernel-traced load") for the given binary (in this case, dropbear):

 arm-linux-elf-flthdr -k dropbear

When you run that dropbear binary on the DS, it should now print something like this on the console:

 BINFMT_FLAT: Loading file /bin/dropbear
 Mapping is 21bee44, Entry point is 50, data_start is 2c300
 Load /bin/dropbear: TEXT=21bee84-21eb144 DATA=810004-8185534 BSS=8185534-8188e4

The numbers are taken from a traced load of busybox in this example, so they're not accurate for our dropbear example. Anyway, what you want to know is the first number after TEXT= (21bee84 here), which is the start of the program's text segment in memory. Subtract that number from an address printed in the trace to map it to a numeric symbol in the binary.
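As a minimal sketch of that subtraction, the following Python snippet maps a run-time address to an offset in the binary. The function name to_binary_offset and the run-time address used are made up for illustration; only the text segment start (21bee84) comes from the busybox load shown above.

```python
def to_binary_offset(runtime_addr: int, text_start: int) -> int:
    """Map a run-time address to an offset within the binary's text segment."""
    if runtime_addr < text_start:
        raise ValueError("address lies below the start of the text segment")
    return runtime_addr - text_start


# Start of the text segment, taken from the traced busybox load above.
TEXT_START = 0x021BEE84

# Hypothetical run-time address, chosen only to illustrate the arithmetic.
RUNTIME_ADDR = 0x021D8B74

print(hex(to_binary_offset(RUNTIME_ADDR, TEXT_START)))
```

The result is the number you would then look up in the symbol map of the binary.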

The mapping between symbol names in code, such as function and variable names, and numeric symbols in the binary is referred to as a symbol map. It can be obtained by running the symbolmap.sh script, which is part of the DSLinux toolchain, on the .gdb file of the binary. So for dropbear, we'd run:

 symbolmap.sh user/dropbear/dropbear.gdb

to obtain a symbol map for dropbear.

Interpreting the stack trace

This is the beginning of the stack trace in the example above:

 Possible stack trace:
 01: 08243dfc
 02: 08249ce4
 03: 0824c0fc
 ...

Note that this is a possible trace: it is very likely inaccurate. Lacking a better method to produce a trace, we simply walk up the stack and print every value that looks like an address within the text segment of the faulting binary. So the trace may contain garbage.
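To illustrate why garbage can end up in the trace, here is a small Python model of that heuristic. It is not the kernel's actual code (which is not shown here); the function name possible_trace and the stack contents are invented, and the text segment range is the one from the busybox load above.

```python
def possible_trace(stack_words, text_start, text_end):
    """Keep every stack word that falls inside [text_start, text_end).

    Any word in that range is reported, whether it is a real return
    address or just data that happens to look like one.
    """
    return [word for word in stack_words if text_start <= word < text_end]


# Text segment range from the traced busybox load above.
TEXT_START, TEXT_END = 0x021BEE84, 0x021EB144

# Hypothetical stack contents, made up for this example.
stack = [0x021D8B74, 0x00000001, 0x0237FC90, 0x021C0010, 0xDEADBEEF]

for n, addr in enumerate(possible_trace(stack, TEXT_START, TEXT_END), start=1):
    print(f"{n:02d}: {addr:08x}")
```

A stale function pointer or a leftover value from an earlier call would pass this filter just as well as a genuine return address.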

To get any idea at all of what the trace might actually be, subtract the text segment offset obtained above from each address printed in the trace, and look up the result in the symbol map.

For example, if this was a snippet from the binary's symbol map:

 ...
 00019cc0 W strcoll
 00019ce0 T strcpy
 00019cfc T strdup
 00019d30 T strlen
 ...

and the result of the subtraction was in the range 19ce0 to 19cfb inclusive, then the function in question is strcpy. By the way, "T" in the symbol map means that the symbol is in the text segment of the binary. You can look up the meaning of the letters in the symbol map in the nm(1) man page.
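The lookup itself can be sketched in Python as follows. This assumes the symbol map consists of nm-style lines (address, type letter, name) like the snippet above; the function names parse_symbol_map and lookup are made up for this example.

```python
import bisect

# The symbol map snippet from above.
SYMBOL_MAP = """\
00019cc0 W strcoll
00019ce0 T strcpy
00019cfc T strdup
00019d30 T strlen
"""


def parse_symbol_map(text):
    """Parse nm-style lines into a sorted list of (address, type, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        addr, kind, name = parts
        try:
            symbols.append((int(addr, 16), kind, name))
        except ValueError:
            continue  # first column was not a hex address
    symbols.sort()
    return symbols


def lookup(symbols, offset):
    """Return the name of the last symbol at or below offset, or None."""
    addrs = [sym[0] for sym in symbols]
    i = bisect.bisect_right(addrs, offset) - 1
    return symbols[i][2] if i >= 0 else None


symbols = parse_symbol_map(SYMBOL_MAP)
print(lookup(symbols, 0x19CF0))
```

Note that this simply picks the nearest symbol at or below the offset, so an offset past the end of the last function would still be attributed to that function.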

It would be nice if we had a script that automated this process a little...
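A first step toward such a script might be pulling the addresses out of the console output. The Python sketch below does only that, assuming the abort message has exactly the layout shown at the top of this page; the function name parse_abort is made up, and this is not an existing DSLinux tool.

```python
import re

# Excerpt of the data abort message from the top of this page.
ABORT_TEXT = """\
 Data abort: pc=0x08246ce4
             lr=0x08243dfc
             sp=0x0237fc80
 Possible stack trace:
 01: 08243dfc
 02: 08249ce4
 03: 0824c0fc
"""


def parse_abort(text):
    """Extract the pc/lr/sp registers and the trace addresses as integers."""
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(pc|lr|sp)=0x([0-9a-fA-F]+)", text)
    }
    trace = [
        int(addr, 16)
        for addr in re.findall(
            r"^[ \t]*\d+:[ \t]*([0-9a-fA-F]{8})[ \t]*$", text, re.MULTILINE
        )
    ]
    return regs, trace


regs, trace = parse_abort(ABORT_TEXT)
print({name: hex(value) for name, value in regs.items()})
print([hex(addr) for addr in trace])
```

Each extracted address would then need the text segment start subtracted before being looked up in the symbol map.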

Remote Debugging

If you want to debug your application running on the DS, you need an up-to-date toolchain with arm-linux-elf-gdb.

Telnet into the DS and type:

 $ gdbserver :4000 /usr/bin/rtest

The port number (4000) is arbitrary. Be sure to include the full path to your executable.

On the host, cd to the user/rtest directory and type:

 $ arm-linux-elf-gdb rtest.gdb
 (gdb) target remote ds:4000
 (gdb) break main
 (gdb) continue

Here, "ds" is the host name or IP address of your DS.

If you prefer a more comfortable graphical front end, use ddd:

 $ ddd --gdb --debugger arm-linux-elf-gdb rtest.gdb

DebuggingHowto (last edited 2008-01-22 14:02:37 by localhost)