Re: Top kernel oopses/warnings this week

From: Arjan van de Ven
Date: Mon Dec 17 2007 - 16:37:55 EST


On Mon, 17 Dec 2007 18:23:31 +0100
Ingo Molnar <mingo@xxxxxxx> wrote:

>
> * Arjan van de Ven <arjan@xxxxxxxxxxxxxxx> wrote:
>
> > The http://www.kerneloops.org website collects kernel oops and
> > warning reports from various mailing lists and bugzillas; below is
> > a top 10 list of the oopses collected in the last 7 days. (Reports
> > prior to 2.6.23 have been omitted in collecting the top 10)
>
> cool stuff! I cannot over-emphasise how useful this will be.
>
> Let us know if you need any additional WARN_ON()s or other dmesg
> annotations to make parsing easier / more intelligent. At least as
> far as arch/x86 and the scheduler are concerned, they will be applied
> to the fast-track queue ;-)
>

The following patch would help a lot; it adds an easily parsable end marker
to oopses and prints the boot UUID as part of the oops, which makes it
easier to de-dupe oopses. The UUID is just a random number and cannot be
traced back to any particular system.
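
For illustration, here is a minimal userspace sketch of how a collecting
tool might use the end marker printed by the patch below. It assumes the
marker has exactly the form "---[ end of trace <uuid> ]---" with a
36-character UUID; parse_oops_end() and the macro names are hypothetical
and not part of the patch.

```c
#include <stdio.h>
#include <string.h>

/* Marker text as printed by the oops_exit() change in the patch. */
#define END_PREFIX "---[ end of trace "
#define END_SUFFIX " ]---"
#define UUID_STR_LEN 36	/* 8-4-4-4-12 hex digits plus four dashes */

/*
 * Hypothetical helper: if `line` contains the end marker, copy the boot
 * UUID into `uuid_out` (which must hold at least UUID_STR_LEN + 1 bytes)
 * and return 1. Return 0 if the line is not an end marker.
 */
static int parse_oops_end(const char *line, char *uuid_out)
{
	const char *start = strstr(line, END_PREFIX);
	const char *end;

	if (!start)
		return 0;
	start += strlen(END_PREFIX);

	end = strstr(start, END_SUFFIX);
	if (!end || (size_t)(end - start) != UUID_STR_LEN)
		return 0;

	memcpy(uuid_out, start, UUID_STR_LEN);
	uuid_out[UUID_STR_LEN] = '\0';
	return 1;
}
```

A tool could feed each dmesg line to parse_oops_end(), close the current
oops record when it returns 1, and use the extracted UUID together with
the oops signature to decide whether two reports come from the same boot.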

--

Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. In addition, there's no good
way to find out if an oops is unique. Sometimes it's the same oops
just reported multiple times, while other times it's a different
instance of the crash with the same signature. Printing the boot UUID
as part of the end string resolves this ambiguity.

Signed-off-by: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
CC: Ted Ts'o <tytso@xxxxxxxxx>

---
drivers/char/random.c | 35 ++++++++++++++++++++++++++++++++++-
include/linux/random.h | 1 +
kernel/panic.c | 2 ++
3 files changed, 37 insertions(+), 1 deletion(-)

Index: linux-2.6.24-rc5/drivers/char/random.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/char/random.c
+++ linux-2.6.24-rc5/drivers/char/random.c
@@ -1176,8 +1176,41 @@ static int max_read_thresh = INPUT_POOL_
static int max_write_thresh = INPUT_POOL_WORDS * 32;
static char sysctl_bootid[16];

+/**
+ * get_boot_uuid - return a string pointer to a system-wide boot UUID
+ *
+ * Returns a pointer to the boot UUID. This UUID is unique per system
+ * boot but persistent for one boot session.
+ *
+ * The memory returned via the return pointer is statically
+ * allocated and owned by the random.c driver; it must not be
+ * kfree()'d.
+ *
+ * Locking: none
+ */
+char *get_boot_uuid(void)
+{
+ static char target[80];
+ unsigned char *uuid;
+
+ if (sysctl_bootid[8] == 0)
+ generate_random_uuid(sysctl_bootid);
+ /* sysctl_bootid is a plain char array; print its bytes as unsigned */
+ uuid = (unsigned char *)sysctl_bootid;
+
+ if (target[0] == 0) {
+ sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
+ "%02x%02x%02x%02x%02x%02x",
+ uuid[0], uuid[1], uuid[2], uuid[3], uuid[4],
+ uuid[5], uuid[6], uuid[7], uuid[8], uuid[9],
+ uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
+ uuid[15]);
+ }
+ return target;
+}
+
/*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
* UUID. The difference is in whether table->data is NULL; if it is,
* then a new UUID is generated and returned to the user.
*
Index: linux-2.6.24-rc5/include/linux/random.h
===================================================================
--- linux-2.6.24-rc5.orig/include/linux/random.h
+++ linux-2.6.24-rc5/include/linux/random.h
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l

u32 random32(void);
void srandom32(u32 seed);
+char *get_boot_uuid(void);

#endif /* __KERNEL___ */

Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -19,6 +19,7 @@
#include <linux/nmi.h>
#include <linux/kexec.h>
#include <linux/debug_locks.h>
+#include <linux/random.h>

int panic_on_oops;
int tainted;
@@ -272,6 +273,7 @@ void oops_enter(void)
void oops_exit(void)
{
do_oops_enter_exit();
+ printk("---[ end of trace %s ]---\n", get_boot_uuid());
}

#ifdef CONFIG_CC_STACKPROTECTOR
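
As a userspace illustration of the formatting step in get_boot_uuid()
above, the following sketch turns 16 raw bytes into the 8-4-4-4-12 hex
form. format_uuid() is a hypothetical name, not a kernel function. The
bytes are taken as unsigned char because a negative signed char would be
sign-extended when passed to a "%02x" conversion and print as more than
two digits.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format 16 raw UUID bytes as a 36-character
 * string plus terminating NUL. `out` must hold at least 37 bytes.
 */
static void format_uuid(const unsigned char uuid[16], char out[37])
{
	snprintf(out, 37,
		 "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
		 "%02x%02x%02x%02x%02x%02x",
		 uuid[0], uuid[1], uuid[2], uuid[3], uuid[4],
		 uuid[5], uuid[6], uuid[7], uuid[8], uuid[9],
		 uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
		 uuid[15]);
}
```

The formatted string needs 37 bytes including the terminator, so the
80-byte static buffer in the patch leaves ample headroom.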