Re: [PATCH] mmc: core: Don't allocate IDA for OF aliases

From: Ulf Hansson
Date: Fri Apr 16 2021 - 03:17:52 EST


On Thu, 15 Apr 2021 at 21:29, Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
>
> Quoting Ulf Hansson (2021-04-15 01:56:12)
> > On Tue, 13 Apr 2021 at 02:36, Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
> > >
> > > - err = ida_simple_get(&mmc_host_ida, min_idx, max_idx, GFP_KERNEL);
> > > - if (err < 0) {
> > > - kfree(host);
> > > - return NULL;
> > > + index = ida_simple_get(&mmc_host_ida, min_idx, max_idx, GFP_KERNEL);
> > > + if (index < 0) {
> > > + kfree(host);
> > > + return NULL;
> > > + }
> >
> > This means that a DTB that is screwed up in a way that it has two mmc
> > aliases with the same index, would be allowed to use the same index.
> >
> > What will happen when we continue the probe sequence in such a case?
>
> Yeah I thought about this after sending the patch. So the problem would
> be like this right?
>
> aliases {
> mmc1 = &sdhci0;
> mmc1 = &sdhci1;
> };

Correct.

>
> I have good news! DT won't compile it because it saw the same alias
> assigned to twice. I tried it on my sc7180 board.
>
> arch/arm64/boot/dts/qcom/sc7180.dtsi:35.3-18:
> ERROR (duplicate_property_names): /aliases:mmc1: Duplicate property name
> ERROR: Input tree has errors, aborting (use -f to force output)
> arch/arm64/boot/dts/qcom/sc7180-idp.dtb] Error 2
>
> I suppose if someone forced the compilation it may be bad, but do we
> really care?
>
> TL;DR: this seems like it isn't a problem.

Yep, I definitely tend to agree with you here. Thanks for doing the
test and sharing the result.

>
> >
> > > }
> > >
> > > - host->index = err;
> > > + host->index = index;
> > >
> > > dev_set_name(&host->class_dev, "mmc%d", host->index);
> > > host->ws = wakeup_source_register(NULL, dev_name(&host->class_dev));
> >
> > Another concern that could potentially be a problem, is that the
> > "thread" that holds the reference that prevents ida from being
> > removed, how would that react to a new mmc device to become
> > re-registered with the same index?
> >
> > I wonder if we perhaps should return -EPROBE_DEFER instead, when
> > ida_simple_get() fails?
> >
>
> Don't think so. The device (with the kobject inside) is removed, and
> thus the mmc1 device will be removed, but the kobject's release function
> is delayed due to the config. This means that
> mmc_host_classdev_release() is called at a later time. The only thing
> inside that function is the IDA removal and the kfree of the container
> object. Given that nothing else is in that release function I believe it
> is safe to skip IDA allocation as it won't be blocking anything in the
> reserved alias case.
>
> Furthermore, when the device is deleted in mmc_remove_host() there could
> be other users of the device that haven't called put_device() yet.
> Either way, those other users are keeping the device memory alive, but
> otherwise device_del() has unlinked it from the various driver core
> lists and sysfs has removed it too so it's in a state where code may be
> referencing it but it's on the way out so users of the device will not
> be able to do much with it during this time.

Right, but see more below.

>
> This sort of problem (if it exists which I don't think it does) would
> have been there all along and can't be fixed at this level. When a
> device that has an alias calls the mmc_alloc_host() function twice it
> gets two different device structures created so there are two distinct
> kobjects that will need to be released at some point. The index is
> usually different for those two kobjects, but with aliases it turns out
> it is the same. When it comes to registering that device with the same
> name the second one will fail because a device with that name already
> exists on the bus. This would be really hard to do given that it would
> need to be the same aliased device in DT calling the mmc_add_host()
> function without calling mmc_remove_host() for the first one it added in
> between.

In fact, we have a few rare corner cases that can trigger KASAN splats
when mmc_remove_host() gets executed. Similar splats can be triggered
by just doing a sudden card removal. It's especially related to the
cases, where a thread holds a reference to the card/host object when
it's being removed. I am working on patches to fix these cases, but
haven't yet decided on the best solution.

That's the reason why I was thinking that maybe returning
-EPROBE_DEFER could be an option, but certainly I need to think more
about this.

>
> (Sorry if that is long. I'm sort of stream of conciousness writing it to
> you here and not rewriting it to be more concise)

No worries, thanks a lot for sharing your thoughts!

Kind regards
Uffe