Linux VM Kernel Debugging

Ross Philipson edited this page Feb 8, 2015 · 50 revisions

These are some handy steps for doing Linux live kernel debugging on the OpenXT platform. OpenXT makes this readily doable by using serial over IP within QEMU to connect the remote debugger GDB to the kernel debugger KGDB. Of course everyone has their favorite distro but for this we will stick to just one – Debian. Most of the steps should be the same with other flavors aside from the package management bits and specifics of rebuilding the kernel sources.

Beforehand

Here are some things to set up first and to keep in mind while going through this:

  • This is based on using Debian Wheezy HVMs. That means the apt package manager and .deb package files.
  • Throughout this guide:
      • target is the VM that is being kernel debugged.
      • host is the debugger, the VM from which GDB connects remotely to the target.
  • For simplicity, it is assumed both the host and target are the same OS or distro. The paths and users on both are the same. Also password-less SSH login and sudo setup are assumed.
  • A basic set of development tools is needed on the target to build the kernel - apt-get install build-essential should be sufficient (note sometimes dpkg-dev needs to be installed manually).

So to get started, install OpenXT and create 2 Debian Wheezy HVMs. For debugging to work, SELinux and stubdoms need to be turned off. Stubdoms are disabled in the Advanced tabs for the VMs. To disable SELinux, run a terminal as root, use nr to log into the admin role then run rw to make the rootfs read-write. Next edit /etc/selinux/config and set SELINUX=permissive and save. Reboot.

Building the Kernel

The KGDB debugger components need to be enabled in the target kernel. This requires building a custom kernel. On the target, get the kernel source package - in this case for Wheezy:

$ sudo apt-get install linux-source-3.2

This drops off a tarball at /usr/src/linux-source-3.2.tar.bz2. Make a directory called ~/kernel and extract the tarball there. Change to ~/kernel/linux-source-3.2, which holds the kernel sources.
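
The extraction step can be sketched as a small shell function. This is only an illustration: the name unpack_kernel_source is hypothetical, the defaults mirror the paths used in this guide, and it assumes GNU tar, which detects the bzip2 compression by itself when extracting.

```shell
# Unpack a kernel source tarball into a working directory and print the
# resulting source directory.
# Usage: unpack_kernel_source [tarball] [dest-dir]
# (hypothetical helper; defaults follow the paths used in this guide)
unpack_kernel_source() {
    tarball="${1:-/usr/src/linux-source-3.2.tar.bz2}"
    dest="${2:-$HOME/kernel}"
    mkdir -p "$dest" || return 1
    # GNU tar works out the compression format on its own when extracting.
    tar -xf "$tarball" -C "$dest" || return 1
    # The archive unpacks into a directory named after the tarball.
    name=$(basename "$tarball")
    echo "$dest/${name%.tar*}"
}

# Example:
#   cd "$(unpack_kernel_source)"
```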

Next the kernel sources need a configuration file. The simplest thing is to start with the one for the currently installed Wheezy kernel. Copy /boot/config-3.2.0-4-amd64 to .config in the current sources dir (note the current config file might have a different name). These are the settings that should be enabled/disabled in the .config:

# CONFIG_DEBUG_RODATA is not set
CONFIG_DEBUG_INFO=y
CONFIG_FRAME_POINTER=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y

Briefly, this is the base set of kernel debug settings that need to be changed. CONFIG_DEBUG_RODATA marks the .text section read-only, which prevents KGDB from writing breakpoint instructions into the code, so it must be disabled. CONFIG_DEBUG_INFO enables debug information/symbolic data in the kernel image. CONFIG_FRAME_POINTER preserves stack frame pointers, which makes stack back-tracing and changing frames easier. The last two enable the KGDB debugger extensions using a serial console.

Though you can edit .config by hand, it is usually done using one of the configuration interfaces. Using make menuconfig, the settings are here:

  • "Kernel hacking"
  • "Compile the kernel with debug info"
  • "Compile the kernel with frame pointers"
  • "KGDB: kernel debugger --->"
  • "KGDB: use kgdb over the serial console"
  • "Write protect kernel read-only data structures"
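
Whichever way the settings are changed, the resulting .config can be checked before starting the build. The function below is a minimal sketch of such a check. The name check_kgdb_config is hypothetical, and it only looks at the five settings listed above.

```shell
# Check a kernel .config for the KGDB-related settings from this guide.
# Usage: check_kgdb_config <path-to-.config>
# Prints one line per problem and returns non-zero if any were found.
# (hypothetical helper, not part of the kernel tree)
check_kgdb_config() {
    cfg="$1"
    rc=0
    for opt in CONFIG_DEBUG_INFO CONFIG_FRAME_POINTER CONFIG_KGDB CONFIG_KGDB_SERIAL_CONSOLE; do
        if ! grep -q "^$opt=y\$" "$cfg"; then
            echo "missing: $opt=y"
            rc=1
        fi
    done
    # CONFIG_DEBUG_RODATA must stay off so KGDB can write breakpoints into .text.
    if grep -q '^CONFIG_DEBUG_RODATA=y$' "$cfg"; then
        echo "must be disabled: CONFIG_DEBUG_RODATA"
        rc=1
    fi
    return $rc
}

# Example (from the kernel source dir):
#   check_kgdb_config .config && echo "config looks ready for KGDB"
```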

Other debugging features may be enabled/disabled at this point too. It is also recommended that a local version suffix be used for the kernel name to help identify the build. Under "General setup" enter a suffix in "Local version" (e.g. "-kgdb"). Once all this is done, save the configuration and:

$ make deb-pkg

This will produce 3 packages in the ~/kernel dir:

  • linux-image-3.2.65-kgdb_3.2.65-kgdb-1_amd64.deb
  • linux-headers-3.2.65-kgdb_3.2.65-kgdb-1_amd64.deb
  • linux-libc-dev_3.2.65-kgdb-1_amd64.deb

Installing the Kernel

This is straightforward. Simply install the .deb packages:

$ sudo dpkg -i linux-image-3.2.65-kgdb_3.2.65-kgdb-1_amd64.deb
$ sudo dpkg -i linux-headers-3.2.65-kgdb_3.2.65-kgdb-1_amd64.deb

This will create several files in /boot that can be identified by the local version suffix used (the kernel image, a copy of the config, the initramfs and the system map) and install the modules to /lib/modules as expected. The grub boot loader config will also be updated. The headers will end up under /usr/src.

The new kernel will need a couple of kernel command line parameters to tell KGDB how to connect over serial. Edit /etc/default/grub and set GRUB_CMDLINE_LINUX="kgdboc=ttyS0,115200 kgdbcon". Note if there were already values set, the KGDB ones can be appended. Then update grub and reboot:

$ sudo update-grub && sudo reboot
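
The edit to /etc/default/grub described above can also be scripted before update-grub is run. The following is a rough sketch rather than part of the original procedure. The function names add_kgdb_params and cmdline_has_kgdb are hypothetical, and the sketch assumes GNU sed and a double-quoted GRUB_CMDLINE_LINUX value. Trying it on a copy of the file first is a sensible precaution.

```shell
# Append the KGDB parameters to GRUB_CMDLINE_LINUX in a grub defaults file.
# Usage: add_kgdb_params <file>      (hypothetical helper; assumes GNU sed)
add_kgdb_params() {
    file="$1"
    params="kgdboc=ttyS0,115200 kgdbcon"
    # Nothing to do if the parameters were already added.
    if grep -q '^GRUB_CMDLINE_LINUX=.*kgdboc=' "$file"; then
        return 0
    fi
    if grep -q '^GRUB_CMDLINE_LINUX=' "$file"; then
        # Append inside the existing quotes, then drop the leading space
        # that is left behind when the original value was empty.
        sed -i \
            -e "s|^GRUB_CMDLINE_LINUX=\"\(.*\)\"|GRUB_CMDLINE_LINUX=\"\1 $params\"|" \
            -e 's|^GRUB_CMDLINE_LINUX=" |GRUB_CMDLINE_LINUX="|' "$file"
    else
        printf 'GRUB_CMDLINE_LINUX="%s"\n' "$params" >> "$file"
    fi
}

# After the reboot, confirm the running kernel was given the kgdboc parameter.
# Usage: cmdline_has_kgdb [file]     (defaults to /proc/cmdline)
cmdline_has_kgdb() {
    grep -q 'kgdboc=' "${1:-/proc/cmdline}"
}

# Example:
#   sudo cp /etc/default/grub /etc/default/grub.bak
#   sudo sh -c '. ./kgdb-grub.sh; add_kgdb_params /etc/default/grub'
# where kgdb-grub.sh is a hypothetical file holding the functions above.
```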

Setup the Debugger

In the host VM, first install the debugger with apt-get install gdb. Next the host needs to have the built kernel sources from the target for GDB to work on. The best way to get this (and to synchronize further changes) is to just rsync it. Create a ~/kernel dir and sync up the sources and build output:

$ rsync -a myuser@mytarget:~/kernel/linux-source-3.2 ~/kernel

The host is now ready to do some debugging. The last step is to connect the emulated serial ports of the two VMs.

Connecting Serial Ports

The emulated serial ports in QEMU can be set up to pipe their communications over TCP/IP; this is how the two VMs will be connected. This could be done either way, but in this example the emulated serial port for the host VM will be the listening end and the target will connect. The next steps need to be done in a terminal in dom0 as root. Run this to set up the host VM:

$ xec-vm -n <host-vm-name> set extra-xenvm "serial=tcp::4545,server,nowait"

And this to set up the target:

$ xec-vm -n <target-vm-name> set extra-xenvm "serial=tcp:0.0.0.0:4545"

The port is arbitrary but 4545 works just fine. Now, since the host is the listening side, it needs to be started first, then the target. It could be reversed if need be. Start them both up.

Remote Kernel Debugging

Everything is in place to do some actual debugging. To ready the target, kernel execution must be halted so it can receive a connection from the debugger. This can be done in two ways. One is to add the kgdbwait parameter to the kernel command line in addition to the parameters that were added earlier. More on that parameter can be found in the KGDB documentation. The second, which will be used here, is to use the magic of sysrq. In the target VM, open a terminal as root and:

$ echo g > /proc/sysrq-trigger

The VM will freeze and the kernel is now waiting for a debugger connection. Over in the host VM, open a terminal as root and cd to the location of the kernel sources (e.g. /home/myuser/kernel/linux-source-3.2). Start the debugger:

$ gdb ./vmlinux

This will load the kernel image and symbols. At the GDB command prompt, connect to the remote target VM:

(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0

Unless something is wrong, GDB should now be connected and in control of the remote kernel. Run a command like bt to see a stack back-trace of where the kernel is waiting for GDB. Refer to the GDB documentation for further commands. To let the kernel resume in the target, type c. The sysrq-trigger can then be used to break to the debugger again.
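
Since the same connection commands are typed at the start of every session, they can be kept in a GDB command file and passed to gdb with its -x option. The sketch below generates such a file. The function name write_kgdb_cmds and the file name kgdb-connect.gdb are hypothetical, and the defaults are the device and baud rate used in this guide.

```shell
# Write a GDB command file that repeats the connection steps from this guide.
# Usage: write_kgdb_cmds <output-file> [serial-device] [baud]
# (hypothetical helper)
write_kgdb_cmds() {
    out="$1"
    dev="${2:-/dev/ttyS0}"
    baud="${3:-115200}"
    # "set remotebaud" matches the GDB shipped with Wheezy; newer GDB
    # releases are believed to spell it "set serial baud" instead.
    cat > "$out" <<EOF
set remotebaud $baud
target remote $dev
bt
EOF
}

# Example (run from the kernel source dir on the host, with the target
# already halted and waiting):
#   write_kgdb_cmds ~/kgdb-connect.gdb
#   gdb -x ~/kgdb-connect.gdb ./vmlinux
```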

Modifications can be made to the kernel on the target, and the changes can then be rsync'ed to the host. After that the target can be rebooted and kernel debugging can be resumed.

Modules and a Practical Example

The instructions above are for debugging the kernel image itself, that is, the core guts of the kernel and any built-in modules. While this is good, being able to debug external modules is quite useful too. A practical example of building one of the OpenXT PV Linux drivers (in this case xc-v4v.ko) will be used to show how to debug a module.

All of the information above is still applicable but to debug modules, a bit more is needed. The following wiki contains instructions for building the OpenXT PV Linux drivers - this should be read first as background information:

PV Linux Drivers

On the target, after following those directions and rebooting, the OpenXT PV Linux drivers will be ready including several that will be loaded at boot time. Running sudo lsmod shows the xenbus driver xc-xen, the block and the net front drivers loaded. Others like xc-v4v.ko need to be manually loaded.

For this example, assume as above that the same user has the drivers checked out in ~/pv-linux-drivers.git. Some fixes have been made to xc-v4v.ko and it was rebuilt in place using make. The built kernel module can be copied to the dkms modules location:

$ sudo cp ~/pv-linux-drivers.git/xc-v4v/xc-v4v.ko /lib/modules/3.2.65-kgdb/updates/dkms

Note that "kgdb" in the path is simply the example local version from above. Now that the new driver is in place, the code for the drivers, including the local changes on the target, needs to be sync'ed with the host VM so it has access to the binaries, symbols and source. Do this with rsync again:

$ rsync -a myuser@mytarget:~/pv-linux-drivers.git ~/

Finally load the module:

$ sudo modprobe xc_v4v

If all goes well and the changes did not bring down the target VM, sudo lsmod should show xc_v4v loaded.

Debugging Modules

The trick with debugging modules is telling GDB where exactly the module got loaded in kernel space. More specifically it is telling GDB where specific sections of the module are loaded. The first step is determining that. In a terminal on the target with the xc-v4v.ko module loaded do:

$ ls -al /sys/module/xc_v4v/sections

Note the section names tend to start with a '.' so they get treated as hidden. The output should look like this:

-r--r--r-- 1 root root 4096 Feb  8 15:59 .bss
-r--r--r-- 1 root root 4096 Feb  8 15:58 __bug_table
-r--r--r-- 1 root root 4096 Feb  8 15:59 .data
-r--r--r-- 1 root root 4096 Feb  8 15:59 .devexit.text
-r--r--r-- 1 root root 4096 Feb  8 15:59 .devinit.text
-r--r--r-- 1 root root 4096 Feb  8 15:59 .exit.text
-r--r--r-- 1 root root 4096 Feb  8 15:59 .gnu.linkonce.this_module
-r--r--r-- 1 root root 4096 Feb  8 15:59 .init.text
-r--r--r-- 1 root root 4096 Feb  8 15:59 .note.gnu.build-id
-r--r--r-- 1 root root 4096 Feb  8 15:59 .rodata
-r--r--r-- 1 root root 4096 Feb  8 15:59 .rodata.str1.1
-r--r--r-- 1 root root 4096 Feb  8 15:59 .smp_locks
-r--r--r-- 1 root root 4096 Feb  8 15:59 .strtab
-r--r--r-- 1 root root 4096 Feb  8 15:59 .symtab
-r--r--r-- 1 root root 4096 Feb  8 15:59 .text
-r--r--r-- 1 root root 4096 Feb  8 15:59 .text.unlikely

This is the list of all the sections from the ELF binary module that were loaded. The debugger needs to know where .text is. Optionally other sections can be specified too. For the example, .data and .bss are also going to be specified. The sysfs files for the sections report their base address in kernel space:

$ cat /sys/module/xc_v4v/sections/.text
0xffffffffa0395000
$ cat /sys/module/xc_v4v/sections/.data
0xffffffffa039a000
$ cat /sys/module/xc_v4v/sections/.bss
0xffffffffa039a3e0

Those addresses are needed on the host for GDB. Follow the steps above to get GDB connected to the target. Now within GDB, the symbols for xc-v4v.ko can be added to the base set of symbols the debugger knows about for the vmlinux image. To add the symbols:

(gdb) add-symbol-file /home/myuser/pv-linux-drivers.git/xc-v4v/xc-v4v.ko 0xffffffffa0395000 -s .data 0xffffffffa039a000 -s .bss 0xffffffffa039a3e0

A friendly message should say the symbols were loaded. The debugger now knows what is where in xc-v4v.ko. Again the GDB docs should be consulted on actually using the debugger.
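
Because the section addresses can change every time the module is loaded, it can help to generate the add-symbol-file line on the target and paste it into GDB on the host. The function below is one possible sketch. The name gdb_add_symbol_cmd is hypothetical, and the optional third argument exists only so the function can be tried against a directory other than sysfs. It should be run as root on the target so the section files are readable.

```shell
# Print a GDB add-symbol-file command for a loaded module.
# Usage: gdb_add_symbol_cmd <module-name> <path-to-.ko-on-host> [sections-dir]
# (hypothetical helper; sections-dir defaults to the module's sysfs directory)
gdb_add_symbol_cmd() {
    mod="$1"
    ko="$2"
    dir="${3:-/sys/module/$mod/sections}"
    # .text is required; its address is the first argument after the file.
    text=$(cat "$dir/.text") || return 1
    cmd="add-symbol-file $ko $text"
    # .data and .bss are optional and are passed with -s when present.
    for sec in .data .bss; do
        if [ -r "$dir/$sec" ]; then
            cmd="$cmd -s $sec $(cat "$dir/$sec")"
        fi
    done
    printf '%s\n' "$cmd"
}

# Example (on the target, as root):
#   gdb_add_symbol_cmd xc_v4v /home/myuser/pv-linux-drivers.git/xc-v4v/xc-v4v.ko
```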

One final note. To ensure the symbolic information is present in the kernel components that will be debugged, make sure that these are built using the -g gcc flag. In addition, turning optimization down or off often makes debugging easier. Turning it off can be done by setting the gcc flag -O0. Another handy flag that makes the stack easier to work with is -fno-omit-frame-pointer, which makes function calls save the stack frames and pointers. So for example the modified line for the Makefile in pv-linux-drivers.git looks like:

make -C $(KDIR) FROM_DKMS=n NOSTDINC_FLAGS="$(ENOSTDINC_FLAGS)" M=$(CURDIR) modules EXTRA_CFLAGS="-g -O0 -fno-omit-frame-pointer"

Good luck debugging...