Re: [PATCH v1 1/1] Revert "pinctrl: avoid unsafe code pattern in find_pinctrl()"

From: Andy Shevchenko
Date: Thu Oct 19 2023 - 08:13:46 EST


On Wed, Oct 18, 2023 at 03:41:24PM -0700, Dmitry Torokhov wrote:
> On Wed, Oct 18, 2023 at 08:01:23AM +0300, Andy Shevchenko wrote:
> > On Tue, Oct 17, 2023 at 02:43:01PM -0700, Dmitry Torokhov wrote:
> > > On Tue, Oct 17, 2023 at 10:45:39PM +0300, Andy Shevchenko wrote:
> >
> > Thanks for your response.

...

> > > I wonder, could you please post entire dmesg for your system?
> >
> > Working, non-working or both?
>
> Non working, especially if you also enable debug logs in
> drivers/mmc/host/sdhci-pci-core.c.
>
> What I do not quite understand is that I think we should not be hitting
> the case where pinctrl is already created for the device, which is the
> code path my patch was changing. IIUIC we should be mostly executing the
> "pinctrl not found" path and that did not really change. Maybe you could
> also put some more annotations to show how/at what exact point the probe
> order changed? Maybe log find_pinctrl() calls and compare?
>
> Linus, BTW, I think there are more problems there with pinctrl lookup,
> because, if we assume there are concurrent accesses to pinctrl_get(),
> the fact that we did not find an instance while scanning the list does
> not mean we will not find it when we go to insert a newly created one.
>
> Another problem, as far as I can see, that there is not really a defined
> owner of pinctrl structure, it is created on demand, and destroyed when
> last user is gone. So if we execute last pintctrl_put() and there is
> another pinctrl_get() running simultaneously, we may get and bump up the
> refcount, and then release (pinctrl_free) will acquire the mutex, and
> zap the structure.

Oh, that's a lot of fixing ahead! But if you send anything to test, I would
happy do it.

> Given that there are more issues in that code, maybe we should revert
> the patch for now so Andy has a chance to convert to UUID/LABEL booting?

I believe it's not feasible, see below why.

...

> > > I think the right answer is "fix the userspace" really in this case. We
> > > could also try extend of_alias_get_id() to see if we could pass some
> > > preferred numbering on x86. But this will again be fragile if the
> > > knowledge resides in the driver and is not tied to a particular board
> > > (as it is in DT case): there could be multiple controllers, things will
> > > be shifting board to board...
> >
> > Any suggestion how should it be properly done in the minimum shell environment?
> > (Busybox uses mdev with static tables IIRC and there is no fancy udev or so)
>
> I'm not sure, so you have something like blkid running? You just need to
> locate the device and chroot there. This assumes you do have initramfs.

I don't think this is working solution.

My case is: I have build an environment with a script that has hardcoded
mmcblk0 to mount from. When I run this script I do not know _which_ exact
board I run on, it should work on any of them (same boards, but different
UUIDs).

While writing this I realised that the common denominator I have here is
the physical device (as it's a PCI one), and it's on-SoC, so can't change
its BDF. So, there seem to be a solution. Let me try to implement that.

--
With Best Regards,
Andy Shevchenko