Re: threshold_init_device/kobject_uevent_env oops
From: Yinghai Lu
Date: Sat Jan 26 2008 - 02:09:27 EST
On Jan 25, 2008 10:14 PM, Greg KH <gregkh@xxxxxxx> wrote:
>
> On Fri, Jan 25, 2008 at 10:04:19PM -0800, Yinghai Lu wrote:
> > On Jan 25, 2008 2:50 PM, Greg KH <gregkh@xxxxxxx> wrote:
> > > On Fri, Jan 25, 2008 at 02:47:11PM -0800, Greg KH wrote:
> > > > On Fri, Jan 25, 2008 at 11:35:56PM +0100, Ingo Molnar wrote:
> > > > >
> > > > > * Greg KH <gregkh@xxxxxxx> wrote:
> > > > >
> > > > > > On Fri, Jan 25, 2008 at 01:05:40PM -0800, Yinghai Lu wrote:
> > > > > > > current linus tree + x86.git
> > > > > > >
> > > > > > > got
> > > > > > >
> > > > > > > Calling initcall 0xffffffff80b93d98: threshold_init_device+0x0/0x3f()
> > > > > > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> > > > > > > IP: [<ffffffff80458e20>] kobject_uevent_env+0x2a/0x3d9
> > > > > >
> > > > > > Does this happen on just Linus's tree?
> > > > > >
> > > > > > Can you send me a .config file for this?
> > > > > >
> > > > > > What is threshold_init()? Is it something new in the x86.git tree?
> > > > >
> > > > > no. A quick grep shows that it is in a file that _your_ changes in
> > > > > Linus' latest have touched:
> > > > >
> > > > > arch/x86/kernel/cpu/mcheck/mce_amd_64.c
> > > >
> > > > Ok, those are pretty much just search/and/replace type changes, but I
> > > > have been running x86-64 boxes with these changes in place.
> > >
> > > Oh wait, I do see a change. We are now (finally) emitting a kobject
> > > uevent for these devices, which somehow the code can't handle properly.
> > >
> > > Let me go poke this some more, unfortunatly I don't have any AMD 64
> > > boxes here anymore, only Intel based processors, so I can't run this
> > > module...
> >
> > it only happens with AMD Quad Core CPU or Fam 10h.
> >
> > works well with AMD opteron Rev E, and Rev F.
>
> So this only dies on a multi-core system? Or does 2 processor boxes
> work, but not 4?
2 sockets x quad core will fail (fam 10h)
2 sockets x dual core works....( rev E, and rev F opteron)
there are some changs between opteron and fam10h. fam10h may have
more local vectors for MCE...
or more banks and blocks...
will look at AMD64 Bios and kernel porting guide for Fam 10h again..
wonder if your code uncover some bugs ...
YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/