Re: oprofile BUG() in current kernel.

From: Andrew Morton
Date: Tue May 13 2008 - 04:41:48 EST


On Mon, 12 May 2008 11:38:03 -0500 Chris J Arges <arges@xxxxxxxxxxxxxxxxxx> wrote:
>
> > >> This looks similar to:
> > >>
> > >> http://www.uwsg.iu.edu/hypermail/linux/kernel/0805.0/2845.html
> > >>
> > >
> > > Yes, remarkably similar ;)
> > >
> > >
> > >> Does reverting 608dfddd845da5ab6accef70154c8910529699f7 fix it for you too?
> > >>
> > Has this fix been officially reverted?
>
> Let me know if this change is going to be reverted, as I have a patch
> ready to support cpu hotplug for oprofile based on code post
> DEFINE_PER_CPU patch.

Please don't top-post. I repaired it so that I could reply sensibly.

In trying to reproduce this on a uniprocessor machine, it seems that
someone broke oprofile:

/usr/bin/opcontrol: line 911: /dev/oprofile/0/enabled: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/event: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/count: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/kernel: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/user: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/unit_mask: No such file or directo

sony:/home/akpm> l /dev/oprofile
total 0
drwxr-xr-x 1 root root 0 May 13 01:25 1
-rw-r--r-- 1 root root 0 May 13 01:25 backtrace_depth
-rw-r--r-- 1 root root 0 May 13 01:25 buffer
-rw-r--r-- 1 root root 0 May 13 01:25 buffer_size
-rw-r--r-- 1 root root 0 May 13 01:25 buffer_watershed
-rw-r--r-- 1 root root 0 May 13 01:25 cpu_buffer_size
-rw-r--r-- 1 root root 0 May 13 01:25 cpu_type
-rw-rw-rw- 1 root root 0 May 13 01:25 dump
-rw-r--r-- 1 root root 0 May 13 01:25 enable
-rw-r--r-- 1 root root 0 May 13 01:25 pointer_size
drwxr-xr-x 1 root root 0 May 13 01:25 stats

Looks like the "0" got renamed to "1". Who did that?



So then I try it on the old 2-way:

No event named GLOBAL_POWER_EVENTS is available.
No event named GLOBAL_POWER_EVENTS is available.
No event named GLOBAL_POWER_EVENTS is available.
No event named GLOBAL_POWER_EVENTS is available.
No event named GLOBAL_POWER_EVENTS is available.

so that got broken too.

I queued the revert of 608dfddd845da5ab6accef70154c8910529699f7,
although that doesn't fix these regressions.

I see no oops. And I don't see what's wrong with the fairly simple
per-cpu conversion, so I'd rather not revert what appears to be a good
patch when we don't understand what's going wrong.


Grasping at straws, we have had problems with per-cpu variable
initialisation in the past. Does this

--- a/drivers/oprofile/cpu_buffer.c~a
+++ a/drivers/oprofile/cpu_buffer.c
@@ -27,7 +27,7 @@
 #include "buffer_sync.h"
 #include "oprof.h"
 
-DEFINE_PER_CPU_SHARED_ALIGNED(struct oprofile_cpu_buffer, cpu_buffer);
+DEFINE_PER_CPU_SHARED_ALIGNED(struct oprofile_cpu_buffer, cpu_buffer) = { };
 
 static void wq_sync_buffer(struct work_struct *work);

_

fix anything for anyone?
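
For anyone wondering why the one-line change above could matter at all, here
is a minimal userspace sketch. It is an illustration, not kernel code.
struct fake_cpu_buffer, buf_plain, buf_init and is_zeroed() are all
hypothetical names; the real struct oprofile_cpu_buffer layout is not shown
in this thread. The sketch uses "{ 0 }" where the patch uses the empty-brace
form "{ }". As far as plain C is concerned, both objects below start out
zeroed, so if the explicit initialiser changes behaviour in the kernel, the
difference would have to come from how the per-cpu object is placed by the
toolchain, which this sketch does not demonstrate.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for struct oprofile_cpu_buffer. All members have
 * the same type so that the struct contains no padding bytes.
 */
struct fake_cpu_buffer {
	unsigned long head_pos;
	unsigned long tail_pos;
	unsigned long buffer_size;
};

/* No initialiser: static storage duration, so C zero-initialises it. */
static struct fake_cpu_buffer buf_plain;

/* Explicit initialiser: every member is still zero. */
static struct fake_cpu_buffer buf_init = { 0 };

/* Return 1 if all len bytes starting at obj are zero, otherwise 0. */
int is_zeroed(const void *obj, size_t len)
{
	const unsigned char *p = obj;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Calling is_zeroed(&buf_plain, sizeof(buf_plain)) and
is_zeroed(&buf_init, sizeof(buf_init)) from a small main() should report the
same thing for both objects, which is why the patch is only a guess about
placement rather than about contents.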
