AW: Resend [PATCH] kernel/resource.c: invalidate parent when freed resource has childs

From: Schmid, Carsten
Date: Mon Aug 12 2019 - 04:46:25 EST


> -----Ursprüngliche Nachricht-----
> Von: Wei Yang [mailto:richard.weiyang@xxxxxxxxx]
> Gesendet: Samstag, 10. August 2019 02:45
> An: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Wei Yang <richard.weiyang@xxxxxxxxx>; Schmid, Carsten
> <Carsten_Schmid@xxxxxxxxxx>; bp@xxxxxxx; dan.j.williams@xxxxxxxxx;
> mingo@xxxxxxxxxx; dave.hansen@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; bhelgaas@xxxxxxxxxx; osalvador@xxxxxxx;
> rdunlap@xxxxxxxxxxxxx; richardw.yang@xxxxxxxxxxxxxxx;
> gregkh@xxxxxxxxxxxxxxxxxxx
> Betreff: Re: Resend [PATCH] kernel/resource.c: invalidate parent when
> freed resource has childs
>
> On Fri, Aug 09, 2019 at 03:45:59PM -0700, Linus Torvalds wrote:
> >On Fri, Aug 9, 2019 at 3:38 PM Wei Yang <richard.weiyang@xxxxxxxxx>
> wrote:
> >>
> >> In theory, child may have siblings. Would it be possible to have several
> >> devices under xhci-hcd?
> >
> >I'm less interested in the xhci-hcd case - which I certainly *hope* is
> >fixed already? - than in "if this happens somewhere else".
> >
>
> Agree, this is what I want to say.
>
Unfortunately this xhci-hcd case is not fixed yet.
I'm working on a driver fix proposal, discussing with Hans de Goede who
wrote the intel_xhci_usb_sw platform driver.

For me there is nothing invalid in the intel_xhci_usb_sw driver.
It is initialized from xhci-hcd/xhci-pci in
xhci_pci_probe --> xhci_ext_cap_init --> xhci_create_intel_xhci_sw_pdev
which then does
devm_add_action_or_reset(dev, xhci_intel_unregister_pdev, pdev)

So far, so good. Doesn't look bad.

When unbinding the xhci-hcd driver, i can see that xhci_pci_remove executes,
and after this the xhci_intel_unregister_pdev is called.
This is what fails because xhci_pci_remove removes the parent of the resource created
in the xhci_create_intel_xhci_sw_pdev.
So the "devm framework" which is used here fails in this specific case.
Very unintentional.
The proposal will call xhci_intel_unregister_pdev from within the xhci-pci in a similar way
like the driver is created. So we will have the child killed before the parent :)

> >So if we do want to remove the parent (which may be a good idea with a
> >warning), and want to make sure that the children are really removed
> >from the resource hierarchy, we should do somethiing like
> >
> > static bool detach_children(struct resource *res)
> > {
> > res = res->child;
> > if (!res)
> > return false;
> > do {
> > res->parent = NULL;
> > res = res->sibling;
> > } while (res);
> > return true;
> > }
> >
> >and then we could write the __release_region() warning as
> >
> > /* You should not release a resource that has children */
> > WARN_ON_ONCE(detach_children(res));
> >
Fine for me, this extended sanity check. This didn't came up to my mind.
Because i have a reproducer, i can test it and send it as V2.
If you have any additional ideas, let me know.

>
> I am thinking about why this could happen.
See above explanation.

>
> To guard the core kernel code, it looks reasonable.
>
Exactly my motivation :)

> >or something?
> >
> >NOTE! The above is entirely untested, and written purely in my mail
> >reader. It might be seriously buggy, including not compiling, or doing
> >odd things. See it more as a "maybe something like this" code snippet
> >example than any kind of final form.
> >
> > Linus
>
I'll implement and check it, of course. Development as usual.
Thanks!

Carsten
> --
> Wei Yang
> Help you, Help me