Oops in fsnotify_mark (2.6.37, 3.0.6)

From: Valentin Avram
Date: Mon Nov 28 2011 - 12:49:34 EST


Hello.

Some of our servers experience an oops when the auditd service is restarted.
All affected servers are Dell R610 machines running Gentoo Linux with kernels 2.6.37 and 3.0.6 (both with Gentoo patches).
After repeated auditd restarts, the kernel also logs warnings, and eventually the machine becomes unresponsive while the kernel logs CPU stalls on the console.

Both affected kernels come from the gentoo-sources package.

The 2.6.37-r4-gentoo kernel is essentially kernel 2.6.37 plus mpagano's patches, listed here:
http://dev.gentoo.org/~mpagano/genpatches/patches-2.6.37-6.htm
The same patches as tarballs:
http://dev.gentoo.org/~mpagano/genpatches/tarballs/genpatches-2.6.37-6.base.tar.bz2
http://dev.gentoo.org/~mpagano/genpatches/tarballs/genpatches-2.6.37-6.extras.tar.bz2

The 3.0.6-gentoo kernel is likewise kernel 3.0.6 plus mpagano's patches from here:
http://dev.gentoo.org/~mpagano/genpatches/tarballs/genpatches-3.0-8.base.tar.bz2
http://dev.gentoo.org/~mpagano/genpatches/tarballs/genpatches-3.0-8.extras.tar.bz2

The oops seems to happen at random when restarting the auditd daemon (version 2.1.3, the latest). Before the crash I can see the [fsnotify_mark] kernel thread; after the oops it is gone.
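
For anyone who wants to check for the thread the same way, here is a minimal sketch of how its presence could be tested from a script. It assumes the thread shows up under the name "fsnotify_mark" in /proc/<pid>/comm; the helper name find_threads and the proc_root parameter are only illustrative and are not taken from any existing tool.

```python
import os


def find_threads(name, proc_root="/proc"):
    """Return the sorted PIDs whose comm entry equals `name`.

    `proc_root` is a parameter only so the function can be pointed at a
    fake directory tree; on a real system it stays at /proc.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # task exited (or comm is unreadable) while scanning
    return sorted(pids)


if __name__ == "__main__":
    # Assumption: the kernel thread is named exactly "fsnotify_mark".
    pids = find_threads("fsnotify_mark")
    if pids:
        print("fsnotify_mark present, pid(s):", pids)
    else:
        print("fsnotify_mark not found")
```

Running it once before and once after an auditd restart should show whether the thread has disappeared.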

More data (kernel configs, oops and warning data, dmesg with CONFIG_DEBUG_INFO and CONFIG_DEBUG_LIST enabled, screenshots, etc.) can be found in the following Gentoo bug:
https://bugs.gentoo.org/show_bug.cgi?id=389405

Since activity on the Gentoo bug is slow, I am hoping somebody here has seen something similar or has an idea of what to do or test next.

Thank you for your time.

