Re: pci: kernel crash in bus_find_device
From: Greg Kroah-Hartman
Date: Thu May 22 2014 - 22:32:08 EST
On Thu, May 22, 2014 at 12:22:40AM -0700, Guenter Roeck wrote:
> On 05/22/2014 12:14 AM, Greg Kroah-Hartman wrote:
> > On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
> >> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
> >>> I have been using an x86 platform.
> >>> When I started working on it I got early crashes until I added the
> >>> check for p not NULL in
> >>>
> >>> +void bus_release_device(struct device *dev)
> >>> +{
> >>> +	struct device_private *p = dev->p;
> >>> +
> >>> +	if (p && klist_node_attached(&p->knode_bus))
> >>> +		klist_put_last(&p->knode_bus);
> >>> +}
> >>> +
> >>>
> >>> Maybe on powerpc *p is overridden between device_del and device_release?
> >>>
> >>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
> >>> treated as WARN_ONs in the current klist code.
> >>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
> >>> ran into it without the second patch (but only when I ran my module
> >>> and tests).
> >>>
> >> Hi Francesco,
> >>
> >> I replaced the BUG_ON with WARN_ON; still crashes.
> >>
> >> Anyway, the problem seems to be known. I found two related exchanges.
> >>
> >> [1] describes pretty much the same problem. I don't see if/where it was
> >> ever fixed, though.
> >>
> >> [2] is a patch to fix the problem. It did not apply cleanly to 3.14,
> >> so I had to make some adjustments in klist_iter_init_node. Resulting
> >> patch is below. With this patch, the problem is gone. It is not perfect,
> >> as it aborts the loop if it encounters a deleted kobject, but it is better
> >> than nothing. Unfortunately, the patch never made it upstream; no idea why.
> >> Copying the author and Greg to get additional feedback.
> >>
> >> Guenter
> >>
> >> [1] https://lkml.org/lkml/2008/10/26/79
> >> [2] https://lkml.org/lkml/2012/4/16/218
> >
> > 2 years ago? I have no idea what was up with that, sorry...
> >
>
> Ok, but do you have comments on the patch itself in its current version?
I have no idea, and at the moment, no time to look at this at all,
sorry. Feel free to work on it and verify if it is a valid fix or not
for this issue and let me know.
thanks,
greg k-h