Chapter 9. Reporting and handling bugs

Table of Contents

9.1. Bug handling policy for the kernel team
9.1.1. Required information
9.1.2. Severities
9.1.3. Tagging
9.1.4. Analysis by maintainers
9.1.5. Testing by submitter
9.1.6. Keeping bugs separate
9.1.7. Applying patches
9.1.8. Talking to submitters
9.2. Filing a bug against a kernel package
9.2.1. Bisecting (finding the upstream version that introduced a bug)

9.1. Bug handling policy for the kernel team

9.1.1. Required information

Submitters are expected to run reportbug or another tool that runs our bug script under the kernel version in question. Reports that lack this information should be answered with a request to follow up using reportbug. If we do not receive this information within a month of the request, the bug may be closed.

Exceptions:

  • If the kernel does not boot or is very unstable, instead of the usual system information we need the console messages via netconsole, serial console, or a photograph.

  • If the report is relaying information about a bug acknowledged upstream, we do not need system information but we do need specific references (bugzilla.kernel.org or git commit id).

  • If the bug is clearly not hardware-specific (e.g. packaging error), we do not need system information.

  • If the bug is reported against a well-defined model, we may not need device listings.
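
As a rough sketch of the netconsole case, the helper below only assembles the value of the netconsole module parameter, assuming the src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac layout described in the kernel's netconsole documentation. The function name netconsole_param, the addresses, the interface name, and the port are illustrative placeholders, not values from this handbook.

```shell
# Build the value of the netconsole= parameter.
# Usage: netconsole_param SRC_IP DEV TGT_IP [TGT_MAC] [PORT]
# An empty TGT_MAC leaves the target MAC address unset.
netconsole_param() {
    src_ip=$1
    dev=$2
    tgt_ip=$3
    tgt_mac=${4:-}
    port=${5:-6666}
    printf '%s@%s/%s,%s@%s/%s\n' \
        "$port" "$src_ip" "$dev" "$port" "$tgt_ip" "$tgt_mac"
}

# Example (as root on the failing machine; all addresses are placeholders):
#   modprobe netconsole "netconsole=$(netconsole_param 192.0.2.10 eth0 192.0.2.20)"
# On the receiving machine, listen for UDP on the chosen port, e.g. with netcat
# (the exact flags depend on the netcat implementation installed).
```

The same string can be given on the kernel command line as netconsole=... when the failure happens before modules can be loaded by hand.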

9.1.2. Severities

Many submitters report bugs with the wrong severity. We interpret the criteria as follows and will adjust severity as appropriate:

critical: makes unrelated software on the system (or the whole system) break...

The bug must make the kernel unbootable or unstable on common hardware or all systems that a specific flavour is supposed to support. There is no 'unrelated software' since everything depends on the kernel.

grave: makes the package in question unusable or mostly so...

If the kernel is unusable, this already qualifies as critical.

grave: ...or causes data loss...

We exclude loss of data in memory due to a crash. Only corruption of data in storage or communication, or silent failure to write data, qualifies.

important

We include lack of support for new hardware that is generally available.

9.1.3. Tagging

We do not use user-tags. In order to aid bug triage we should make use of the standard tags and forwarded field defined by the BTS. In particular:

  • Add moreinfo whenever we are waiting for a response from the submitter and remove it when we are not

  • Do not add unreproducible to bugs that may be hardware-dependent
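
For illustration, these tag changes can be made with the bts tool from the devscripts package. The sketch below only prints the commands so that they can be reviewed before being run; the bug number 123456 and the function names are placeholders.

```shell
# Print the bts command that marks a bug as waiting for the submitter.
waiting_for_submitter() {
    printf 'bts tags %s + moreinfo\n' "$1"
}

# Print the bts command that clears the tag once the submitter has replied.
submitter_replied() {
    printf 'bts tags %s - moreinfo\n' "$1"
}

waiting_for_submitter 123456
submitter_replied 123456
```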

9.1.4. Analysis by maintainers

Generally we should not expect to be able to reproduce bugs without having similar hardware. We should consider:

  • Searching bugzilla.kernel.org (including closed bugs) or another relevant bug tracker

  • Searching kernel mailing lists

  • Viewing git commit logs for relevant source files

    • In case of a regression, from the known good to the bad version

    • In other cases, from the bad version forwards, in case the bug has been fixed since

  • Searching kerneloops.org for similar oopses

  • Matching the machine code and registers in an 'oops' against the source and deducing how the impossible happened (this doesn't work that often but when it does you look like a genius ;-)
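
The two ways of reading the commit log can be sketched with git revision ranges. The helper name commit_range, the version tags, and the source path below are assumptions chosen for the example.

```shell
# Print a git revision range for reviewing commits.
# With two versions (a regression): from the known good to the bad version.
# With one version: from the bad version forwards to the current branch tip.
commit_range() {
    if [ $# -ge 2 ]; then
        printf '%s..%s\n' "$1" "$2"
    else
        printf '%s..HEAD\n' "$1"
    fi
}

# Examples, run inside a clone of the upstream tree
# (the tags and the path are placeholders):
#   git log --oneline "$(commit_range v3.0 v3.2)" -- drivers/net/ethernet/realtek/
#   git log --oneline "$(commit_range v3.2)" -- drivers/net/ethernet/realtek/
```

Limiting the log to the relevant source files keeps the list short enough to read commit by commit.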

9.1.5. Testing by submitter

Depending on the technical sophistication of the submitter and the service requirements of the system in question (e.g. whether it's a production server) we can request one or more of the following:

  • Gathering more information passively (e.g. further logging, reporting contents of files in procfs or sysfs)

  • Upgrading to the current stable/stable-proposed-updates/stable-security version, if it includes a fix for a similar bug

  • Adding debug or fallback options to the kernel command line or module parameters

  • Installing the unstable or backports version temporarily

  • Rebuilding and installing the kernel with a specific patch added (the script debian/bin/test-patches should make this easy)

  • Using git bisect to find a specific upstream change that introduced the bug

When a bug occurs in what upstream considers the current or previous stable release, and we cannot fix it, we ask the submitter to report it upstream at bugzilla.kernel.org under a specific Product and Component, and to tell us the upstream bug number. We do not report bugs directly because follow-up questions from upstream need to go to the submitter, not to us. Given the upstream bug number, we mark the bug as forwarded. bts-link then updates its status.
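
A minimal sketch of recording the upstream report, again assuming the bts tool from devscripts. Both bug numbers and the function names are placeholders, and the commands are only printed, not executed.

```shell
# Print the upstream URL for a bugzilla.kernel.org bug number.
upstream_bug_url() {
    printf 'https://bugzilla.kernel.org/show_bug.cgi?id=%s\n' "$1"
}

# Print the bts command that records where a Debian bug was forwarded.
# Usage: forwarded_cmd DEBIAN_BUG UPSTREAM_BUG
forwarded_cmd() {
    # The URL is single-quoted in the output because it contains a '?'.
    printf "bts forwarded %s '%s'\n" "$1" "$(upstream_bug_url "$2")"
}

forwarded_cmd 123456 98765
```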

9.1.6. Keeping bugs separate

Many submitters search for a characteristic error message and treat this as indicating a specific bug. This can lead to many 'me too' follow-ups where, for example, the message indicates a driver bug and the second submitter is using a different driver from the original submitter.

In order to avoid the report turning into a mess of conflicting information about two or more different bugs:

  • We should try to respond to such a follow-up quickly, requesting a separate bug report

  • We can use the BTS summary command to improve the description of the bug

  • As a last resort, it may be necessary to open new bugs with the relevant information, set their submitters accordingly, and close the original report

Where the original report describes more than one bug ('...and another thing...'), we should clone it and deal with each bug separately.
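
As an illustration of cloning, the sketch below prints a single bts invocation (from devscripts) that clones a report and retitles the clone. The bug number, the title, and the function name are placeholders. The temporary id -1 only has meaning within one invocation, which is why both commands are joined with a comma.

```shell
# Print a bts invocation that clones a report and retitles the clone.
# Usage: split_report BUG "title for the second bug"
split_report() {
    printf 'bts clone %s -1 , retitle -1 "%s"\n' "$1" "$2"
}

split_report 123456 "second problem described in the original report"
```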

9.1.7. Applying patches

Patches should normally be reviewed and accepted by the relevant upstream maintainer (aside from necessary adjustments for an older kernel version) before being applied.

9.1.8. Talking to submitters

We should always be polite to submitters. Not only is this implied by the Social Contract, but it is likely to lead to a faster resolution of the bug. If a submitter overrated the severity, quietly downgrade it. If a submitter has done something stupid, request that they undo that and report back. 'Sorry' and 'please' make a big difference in tone.

We will maintain general advice to submitters at https://wiki.debian.org/DebianKernelReportingBugs.

9.2. Filing a bug against a kernel package

The Debian kernel team keeps track of kernel package bugs in the Debian Bug Tracking System (BTS). For information on how to use the system see https://bugs.debian.org. You can also submit bugs using the reportbug command from the package of the same name. Please note that kernel bugs found in distributions derived from Debian (such as Knoppix, Mepis, Progeny, Ubuntu, Xandros, etc.) should not be reported to the Debian BTS unless they can also be reproduced on a Debian system using official Debian kernel packages. Derived distributions have their own policies and procedures regarding kernel packaging, so bugs found in them should be reported directly to their bug tracking systems or mailing lists.

Nothing in this chapter is intended to keep you from filing a bug against one of the Debian kernel packages. However, you should recognize that the resources of the Debian kernel team are limited, and how efficiently we can react to a bug is largely determined by the amount and quality of the information included in the bug report. Please help us to do a better job by using the following guidelines when preparing to file a bug against kernel packages:

  • Do the research. Before filing the bug, search the web for the particular error message or symptom you are getting. As it is highly unlikely that you are the only person experiencing a particular problem, there is always a chance that it has been discussed elsewhere, and a possible solution, patch, or workaround has been proposed. If such information exists, always include references to it in your report. Check the current bug list to see whether something similar has been reported already.

  • Collect the information. Please provide enough information with your report. At a minimum, it should contain the exact version of the official Debian kernel package where the bug is encountered, and steps to reproduce it. Depending on the nature of the bug you are reporting, you might also want to include the output of dmesg (or portions thereof) and the output of lspci -vn; reportbug collects this automatically. If applicable, include information about the latest known kernel version where the bug is not present, and the output of the above commands for the working kernel as well. Use common sense and include other relevant information if you think that it might help in solving the problem.

  • Try to reproduce the problem with a "vanilla" kernel. If you have a chance, try to reproduce the problem by building the binary kernel image from the "vanilla" kernel source, available from https://www.kernel.org or its mirrors, using the same configuration as the Debian stock kernels. For more information on how to do this, look at Section 4.5, “Building a custom kernel from Debian kernel source”. If there is convincing evidence that the buggy behavior is caused by the Debian-specific changes to the kernel, the bug will usually be assigned higher priority by the kernel team. If the bug is not specific to Debian, check the upstream kernel bug database to see if it has been reported there. If you are sure that it is an upstream problem, you can also report your bug there (but submit it to the Debian BTS anyway, so that we can track it properly).

  • Use the correct package to report the bug against. Please file bugs against the package containing the kernel version where the problem occurs (e.g. linux-image-3.2.0-2-686-pae), not a metapackage (e.g. linux-image-686-pae).

  • Bugs involving tainted kernels. If a kernel crashes, it normally prints out some debugging information, indicating, among other things, whether the running kernel has been tainted. The kernel is most commonly referred to as tainted if at the time of the crash it had some binary third-party modules loaded. As kernel developers do not have access to the source code for such modules, problems involving them are notoriously difficult to debug. It is therefore strongly recommended to try to reproduce the problem with an untainted kernel (by preventing the loading of binary modules, for example). If the problem is due to the presence of such modules, there is not much the kernel community can do about it and it should be reported directly to their authors.
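
Two of the guidelines above can be sketched as small shell helpers: naming the versioned package to report against, and checking the taint state before filing. The function names are made up for this example. The sketch assumes that the taint value is the number read from /proc/sys/kernel/tainted, with bit 0 indicating that a proprietary module was loaded.

```shell
# Name of the versioned binary package for a given kernel release string.
kernel_package() {
    printf 'linux-image-%s\n' "$1"
}

# Succeed if a taint value (as read from /proc/sys/kernel/tainted) is non-zero.
is_tainted() {
    [ "$1" -ne 0 ]
}

# Succeed if bit 0 of the taint value is set (a proprietary module was loaded).
has_proprietary_module() {
    [ $(( $1 & 1 )) -ne 0 ]
}

# Example on the affected system:
#   reportbug "$(kernel_package "$(uname -r)")"
#   is_tainted "$(cat /proc/sys/kernel/tainted)" && echo "kernel is tainted"
```

Deriving the package name from uname -r reports against the kernel that is actually running rather than a metapackage.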

9.2.1. Bisecting (finding the upstream version that introduced a bug)

When a bug is easy to reproduce locally but hard to get developers to reproduce (as is often true of workflow- or hardware-dependent bugs), it can be useful to compile and test a few versions to narrow down what changes introduced the regression.

To start, recreate the problem with a vanilla kernel:

# apt-get install git build-essential
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux

The above commands acquire a vanilla kernel. Configure, build and test a binary package as explained in Section 4.5, “Building a custom kernel from Debian kernel source”:

$ make localmodconfig  # minimal configuration
$ scripts/config --disable DEBUG_INFO  # to keep the build reasonably small
$ make deb-pkg
# dpkg -i ../linux-image-3.5.0_3.5.0-1_i386.deb  # substitute package name from the previous command
# reboot

If the bug doesn't show up, try again with the official configuration file from /boot. (If it still doesn't show up after that, declare victory and celebrate.)
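
A minimal sketch of reusing the official configuration, assuming the usual Debian naming of /boot/config-<release>; the helper name boot_config is illustrative.

```shell
# Path of the installed Debian configuration for a given kernel release.
boot_config() {
    printf '/boot/config-%s\n' "$1"
}

# Example, run at the top of the kernel tree:
#   cp "$(boot_config "$(uname -r)")" .config
#   make oldconfig   # answer the prompts for options that are new to this tree
```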

Initialize the bisection process by declaring which versions worked and did not work:

$ cd linux
$ git bisect start
$ git bisect good v3.0  # or whichever was known to be good
$ git bisect bad  # current version is bad

Now git checks out a version half-way in between to test. Build it, reusing the prepared configuration.

$ make silentoldconfig  # on trees from Linux 4.19 onwards this target is named syncconfig
$ make deb-pkg

Install the package, reboot, and test.

$ git bisect good  # if this version doesn't exhibit the bug
$ git bisect bad  # if it does
$ git bisect skip  # if some other bug makes it hard to test

And on to the next iteration:

$ make silentoldconfig  # or make syncconfig on trees from Linux 4.19 onwards
$ make deb-pkg

At the end of the process, the name of the "first bad commit" is printed, which is very useful for tracking down the bug. Narrowing down the regression range with a few rounds is useful even if you don't get that far; in that case, run git bisect log to produce a log. If you are the visual sort of person, git bisect visualize with the gitk package installed can show what is happening between steps.
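
Because each round roughly halves the remaining range, the number of builds can be estimated in advance. The helper below is a back-of-the-envelope sketch (the name bisect_steps is made up, and git's own estimate may differ slightly): it returns the smallest k such that 2^k is at least the number of candidate commits.

```shell
# Estimate the number of bisection steps for N candidate commits:
# the smallest k such that 2^k >= N.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$(( span * 2 ))
        steps=$(( steps + 1 ))
    done
    printf '%s\n' "$steps"
}

# Example inside the kernel tree (v3.0 as the known good version is a placeholder):
#   bisect_steps "$(git rev-list --count v3.0..HEAD)"
bisect_steps 10000   # prints 14
```

Even a range of ten thousand commits therefore needs only about fourteen build-and-test rounds, which is why a partial bisection is still worth reporting.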

See Christian Couder's article "Fighting regressions with git bisect" from kernel.org or the git-doc package for details.