libsegfault.so

The dynamic linker [1] in a Linux system is using several environment variables to customize it's behavior. The most commonly used is probably LD_LIBRARY_PATH which is a list of directories where it search for libraries at execution time. Another variable I use quite often is LD_TRACE_LOADED_OBJECTS to let the program list its dynamic dependencies, just like ldd(1).

For example, consider the following output

$ LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffece29e000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007fc9b82d1000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fc9b80cd000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fc9b7d15000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007fc9b7add000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fc9b851f000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007fc9b78b1000)

LD_PRELOAD

LD_PRELOAD is a list of additional shared objects that should be loaded before all other dynamic dependencies. When the loader is resolving symbols, it sequentially walk through the list of dynamic shared objects and takes the first match. This makes it possible to overide functions in other shared objects and change the behavior of the application completely.

Consider the following example

$ LD_PRELOAD=/usr/lib/libSegFault.so LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffc73f61000)
    /usr/lib/libSegFault.so (0x00007f131c234000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007f131bfe6000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f131bde2000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f131ba2a000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007f131b7f2000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f131c439000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007f131b5c6000)

Here we have preloaded libSegFault and it's listed in second place. In the first place we have linux-vdso.so.1 which is a Virtual Dynamic Shared Object provided by the Linux kernel. The VDSO deserves it's own separate blog post, it's a cool feature that maps kernel code into the a process's context as a .text segment in a virtual library.

libSegFault.so

libSegFault.so is part of glibc [2] and comes with your toolchain. The library is for debugging purpose and is activated by preload it at runtime. It does not actually overrides functions but register signal handlers in a constructor (yes, you can execute code before main) for specified signals. By default only SIGEGV (see signal(7)) is registered. These registered handlers print a backtrace for the applicaton when the signal is delivered. See its implementation in [3].

Set the environment variable SEGFAULT_SIGNALS to explicit select signals you want to register a handler for.

/media/libsegfault.png

This is an useful feature for debug purpose. The best part is that you don't have to recompile your code.

libSegFault in action

Our application

Consider the following in real life application taken directly from the local nuclear power plant:

void handle_uranium(char *rod)
{
    *rod = 0xAB;
}

void start_reactor()
{
    char *rod = 0x00;
    handle_uranium(rod);
}

int main()
{
    start_reactor();
}

The symptom

We are seeing a segmentation fault when operate on a particular uranium rod, but we don't know why.

Use libSegFault

Start the application with libSegFault preloaded and examine the dump:

$ LD_PRELOAD=/usr/lib/libSegFault.so ./powerplant
*** Segmentation fault
Register dump:

 RAX: 0000000000000000   RBX: 0000000000000000   RCX: 0000000000000000
 RDX: 00007ffdf6aba5a8   RSI: 00007ffdf6aba598   RDI: 0000000000000000
 RBP: 00007ffdf6aba480   R8 : 000055d2ad5e16b0   R9 : 00007f98534729d0
 R10: 0000000000000008   R11: 0000000000000246   R12: 000055d2ad5e14f0
 R13: 00007ffdf6aba590   R14: 0000000000000000   R15: 0000000000000000
 RSP: 00007ffdf6aba480

 RIP: 000055d2ad5e1606   EFLAGS: 00010206

 CS: 0033   FS: 0000   GS: 0000

 Trap: 0000000e   Error: 00000006   OldMask: 00000000   CR2: 00000000

 FPUCW: 0000037f   FPUSW: 00000000   TAG: 00000000
 RIP: 00000000   RDP: 00000000

 ST(0) 0000 0000000000000000   ST(1) 0000 0000000000000000
 ST(2) 0000 0000000000000000   ST(3) 0000 0000000000000000
 ST(4) 0000 0000000000000000   ST(5) 0000 0000000000000000
 ST(6) 0000 0000000000000000   ST(7) 0000 0000000000000000
 mxcsr: 1f80
 XMM0:  00000000000000000000000000000000 XMM1:  00000000000000000000000000000000
 XMM2:  00000000000000000000000000000000 XMM3:  00000000000000000000000000000000
 XMM4:  00000000000000000000000000000000 XMM5:  00000000000000000000000000000000
 XMM6:  00000000000000000000000000000000 XMM7:  00000000000000000000000000000000
 XMM8:  00000000000000000000000000000000 XMM9:  00000000000000000000000000000000
 XMM10: 00000000000000000000000000000000 XMM11: 00000000000000000000000000000000
 XMM12: 00000000000000000000000000000000 XMM13: 00000000000000000000000000000000
 XMM14: 00000000000000000000000000000000 XMM15: 00000000000000000000000000000000

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Memory map:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ada9c000-55d2adabd000 rw-p 00000000 00:00 0                          [heap]
7f9852c8f000-7f9852ca5000 r-xp 00000000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ca5000-7f9852ea4000 ---p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea4000-7f9852ea5000 r--p 00015000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea5000-7f9852ea6000 rw-p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea6000-7f9853054000 r-xp 00000000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853054000-7f9853254000 ---p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853254000-7f9853258000 r--p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853258000-7f985325a000 rw-p 001b2000 00:13 13975885                   /usr/lib/libc-2.26.so
7f985325a000-7f985325e000 rw-p 00000000 00:00 0
7f985325e000-7f9853262000 r-xp 00000000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853262000-7f9853461000 ---p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853461000-7f9853462000 r--p 00003000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853462000-7f9853463000 rw-p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853463000-7f9853488000 r-xp 00000000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853649000-7f985364c000 rw-p 00000000 00:00 0
7f9853685000-7f9853687000 rw-p 00000000 00:00 0
7f9853687000-7f9853688000 r--p 00024000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853688000-7f9853689000 rw-p 00025000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853689000-7f985368a000 rw-p 00000000 00:00 0
7ffdf6a9b000-7ffdf6abc000 rw-p 00000000 00:00 0                          [stack]
7ffdf6bc7000-7ffdf6bc9000 r--p 00000000 00:00 0                          [vvar]
7ffdf6bc9000-7ffdf6bcb000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

At a first glance, the information may feel overwelming, but lets go through the most importat lines.

The backtrace lists the call chain when the the signal was delivered to the application. The first entry is on top of the stack

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Here we can see that the last executed instruction is at address 0x55d2ad5e1606. The tricky part is that the address is not absolute in the application, but virtual for the whole process. In other words, we need to calculate the address to an offset within the application's .text segment. If we look at the Memory map we see three entries for the powerplant application:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant

Why three? Most ELF files (application or library) has at least three memory mapped sections: - .text, The executable code - .rodata, read only data - .data, read/write data

With help of the permissions it's possible to figure out which mapping correspond to each section.

The last mapping has rw- as permissions and is probably our .data section as it allows both write and read. The middle mapping has r-- and is a read only mapping - probably our .rodata section. The first mapping has r-x which is read-only and executable. This must be our .text section!

Now we can take the address from our backtrace and subtract with the offset address for our .text section: 0x55d2ad5e1606 - 0x55d2ad5e1000 = 0x606

Use addr2line to get the corresponding line our source code

$ addr2line -e ./powerplant -a 0x606
    0x0000000000000606
    /home/marcus/tmp/segfault/main.c:3

If we go back to the source code, we see that line 3 in main.c is

*rod = 0xAB;

Here we have it. Nothing more to say.

Conclusion

libSegFault.so has been a great help over a long time. The biggest benefit's that you don't have to recompile your application when you want to use the feature. However, you cannot get the line number from addr2line if the application is not compiled with debug symbols, but often it is not that hard to figure out the context out from a dissassembly of your application.