Re: [Bug #13941] x86 Geode issue

From: Ingo Molnar
Date: Sun Aug 16 2009 - 16:57:31 EST



* Martin-Éric Racine <q-funk@xxxxxx> wrote:

> On Thu, Aug 13, 2009 at 9:34 PM, Rafael J. Wysocki<rjw@xxxxxxx> wrote:
> > On Thursday 13 August 2009, Martin-Éric Racine wrote:
> >> On Thu, Aug 13, 2009 at 5:54 PM, Rafael J. Wysocki<rjw@xxxxxxx> wrote:
> >> > On Thursday 13 August 2009, Martin-Éric Racine wrote:
> >> >> 2009/8/13 Martin-Éric Racine <q-funk@xxxxxx>:
> >> >> > On Thu, Aug 13, 2009 at 12:07 PM, Ingo Molnar<mingo@xxxxxxx> wrote:
> >> >> >> * Martin-Éric Racine <q-funk@xxxxxx> wrote:
> >> >> >>> Yes, this bug is still valid.
> >> >> >>>
> >> >> >>> Ubuntu kernel team member Leann Ogasawara and I are slowly
> >> >> >>> bisecting our way through the changes that took place since 2.6.30
> >> >> >>> to find the commit that introduced this regression. Please stay
> >> >> >>> tuned.
> >> >> >>
> >> >> >> hm, the only outright Geode related commit was:
> >> >> >>
> >> >> >>  d6c585a: x86: geode: Mark mfgpt irq IRQF_TIMER to prevent resume failure
> >> >> >>
> >> >> >> the jpg at:
> >> >> >>
> >> >> >>  http://launchpadlibrarian.net/28892781/00002.jpg
> >> >> >>
> >> >> >> is very out of focus - but what i could decipher suggests a
> >> >> >> pagefault crash in the VFS code, in generic_delete_inode().
> >> >>
> >> >> This one might be a bit better:
> >> >>
> >> >> http://launchpadlibrarian.net/30267494/2.6.31-5.24.jpg
> >
> > Hmm.  This looks like a sysfs oops to my untrained eye.
>
> The bisect I did with Leann Ogasawara has narrowed the kernel panic
> down to the following:
>
> commit f19d4a8fa6f9b6ccf54df0971c97ffcaa390b7b0
> Author: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Date: Mon Jun 8 19:50:45 2009 -0400
>
> add caching of ACLs in struct inode
>
> No helpers, no conversions yet.
>
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>

Weird. If the functions do what their names suggest, i.e. if
inode_init_always() is a constructor that is always called and if
destroy_inode() is an unconditional destructor, then this patch
should have no functional effect on the VFS side.

It increases the size of struct inode, so if you have some old
module (built against an older version of fs.h) still around, it
might corrupt your inode data structure.
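
To make the stale-module scenario concrete, here is a minimal
user-space sketch. It is not kernel code: struct inode_old,
struct inode_new and both helper functions are hypothetical, and the
real struct inode has far more members. It only shows that code
compiled against the smaller layout writes to an offset that means
something else in the larger layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout that a stale module was compiled against. */
struct inode_old {
    unsigned long i_ino;
    unsigned int  i_flags;
    void         *i_private;
};

/* Hypothetical layout after two cache pointers were added before
 * i_private. The member names only mimic the idea of the patch. */
struct inode_new {
    unsigned long i_ino;
    unsigned int  i_flags;
    void         *i_acl;
    void         *i_default_acl;
    void         *i_private;
};

/* What the stale module effectively does: it stores a pointer at the
 * offset where i_private lived in the OLD layout. */
static void stale_module_set_private(void *inode, void *value)
{
    memcpy((char *)inode + offsetof(struct inode_old, i_private),
           &value, sizeof(value));
}

/* Returns 1 if the stale store landed in i_acl and left the real
 * i_private untouched, 0 otherwise. */
int stale_store_hits_acl(void)
{
    static int marker;
    struct inode_new inode;

    /* Assumption of this sketch: the old i_private slot coincides
     * with the new i_acl slot. */
    assert(offsetof(struct inode_old, i_private) ==
           offsetof(struct inode_new, i_acl));

    memset(&inode, 0, sizeof(inode));
    stale_module_set_private(&inode, &marker);

    return inode.i_acl == (void *)&marker && inode.i_private == NULL;
}
```

Calling stale_store_hits_acl() from a small main() is enough to see
the effect: the store that was meant for i_private ends up in i_acl.
In a kernel, a later dereference or release of such a clobbered
pointer could oops far away from the module that did the write.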

Or the size change might trigger some dormant bug. It might move a
critical inode right into the path of a pre-existing (but not
visibly crash-triggering) data corruption.

The possibilities on the 'weird bug' front are endless - the
crash/oops itself should be turned into text, posted here and
analyzed.

Ingo