Re: [PATCH] mmc: core: Don't allocate IDA for OF aliases

From: Stephen Boyd
Date: Fri Apr 16 2021 - 13:50:25 EST


Quoting Ulf Hansson (2021-04-16 00:17:10)
> On Thu, 15 Apr 2021 at 21:29, Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
> >
> >
> > Don't think so. The device (with the kobject inside) is removed, and
> > thus the mmc1 device will be removed, but the kobject's release function
> > is delayed due to the config. This means that
> > mmc_host_classdev_release() is called at a later time. The only thing
> > inside that function is the IDA removal and the kfree of the container
> > object. Given that nothing else is in that release function I believe it
> > is safe to skip IDA allocation as it won't be blocking anything in the
> > reserved alias case.
> >
> > Furthermore, when the device is deleted in mmc_remove_host() there could
> > be other users of the device that haven't called put_device() yet.
> > Either way, those other users are keeping the device memory alive, but
> > otherwise device_del() has unlinked it from the various driver core
> > lists and sysfs has removed it too so it's in a state where code may be
> > referencing it but it's on the way out so users of the device will not
> > be able to do much with it during this time.
>
> Right, but see more below.
>
> >
> > This sort of problem (if it exists which I don't think it does) would
> > have been there all along and can't be fixed at this level. When a
> > device that has an alias calls the mmc_alloc_host() function twice it
> > gets two different device structures created so there are two distinct
> > kobjects that will need to be released at some point. The index is
> > usually different for those two kobjects, but with aliases it turns out
> > it is the same. When it comes to registering that device with the same
> > name the second one will fail because a device with that name already
> > exists on the bus. This would be really hard to do given that it would
> > need to be the same aliased device in DT calling the mmc_add_host()
> > function without calling mmc_remove_host() for the first one it added in
> > between.
>
> In fact, we have a few rare corner cases that can trigger KASAN splats
> when mmc_remove_host() gets executed. Similar splats can be triggered
> by just doing a sudden card removal. It's especially related to the
> cases, where a thread holds a reference to the card/host object when
> it's being removed. I am working on patches to fix these cases, but
> haven't yet decided on the best solution.

Ok. Can you share the KASAN reports? The memory allocated for this class
object and associated index from the IDA will be unique for each call
the mmc_alloc_host() so I don't see how KASAN could be noticing
something unless a reference to the host is leaking out without the
proper get_device() call being made to keep the underlying memory from
being freed.

>
> That's the reason why I was thinking that maybe returning
> -EPROBE_DEFER could be an option, but certainly I need to think more
> about this.

I was hoping that you wouldn't need to think more about returning
-EPROBE_DEFER after my email. :( Returning EPROBE_DEFER looks like it's
a bandaid for bigger problems with reference counting the pointer
allocated in mmc_alloc_host(). I hope I convinced you that there isn't
any danger in reusing the IDA in the case of an alias because the only
way that is a problem is if the same device calls mmc_alloc_host() twice
without calling mmc_remove_host() in between. That should be a pretty
obvious problem in driver code? The check to see if that same device has
tried to alloc a host twice would still effectively happen after this
patch because when mmc_add_host() tries to add that second device to the
driver core it will complain about duplicate device names and fail.

Will you apply this patch?