Re: [PATCH] nvme: remove multipath module parameter

From: Christoph Hellwig
Date: Wed Mar 05 2025 - 19:04:04 EST


On Wed, Mar 05, 2025 at 04:57:44PM -0700, Keith Busch wrote:
> > > Obviously he's not talking about multiported PCIe.
> >
> > Why is that obvious?
>
> No one here would think a multiported device *wouldn't* report CMIC.

I hope so.

> The
> fact Hannes thinks that's a questionable feature for his device gives
> away that it is single ported.

Well, his quote reads like he doesn't know about multiport PCIe devices.
But maybe he just meant to say "despite being single-ported".

> > At least based on the stated works he talks about
> > PCIe and not about multi-port. The only not multiported devices I've
> > seen that report NMIC and CMIC are a specific firmware so that the
> > customer would get multipath behavior, which is a great workaround for
> > instable heavily switched fabrics. Note that multiported isn't always
> > obvious as there are quite a few hacks using lane splitting around that
> > a normal host can't really see.
>
> In my experience, it's left enabled because of SRIOV, which many of
> these devices end up shipping without supporting in PCI space anyway.

If a device supports SR-IOV, setting CMIC and NMIC is correct, but I've
seen surprisingly few production controllers actually supporting SR-IOV
despite what the datasheets say.
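
For reference, a minimal sketch of how the bits discussed here decode,
assuming the CMIC (Identify Controller, byte 76) and NMIC (Identify
Namespace, byte 30) layout from the NVMe base specification. The macro
and helper names are made up for illustration and this is not the
kernel's actual multipath check:

```c
#include <stdbool.h>
#include <stdint.h>

/* CMIC: Controller Multi-Path I/O and Namespace Sharing Capabilities */
#define CMIC_MULTI_PORT  (1u << 0)  /* subsystem may have more than one port */
#define CMIC_MULTI_CTRL  (1u << 1)  /* subsystem may have two or more controllers */
#define CMIC_SRIOV_VF    (1u << 2)  /* controller is associated with an SR-IOV VF */

/* NMIC: Namespace Multi-path I/O and Namespace Sharing Capabilities */
#define NMIC_SHARED      (1u << 0)  /* namespace may be attached to multiple controllers */

/* Hypothetical helper: does the subsystem claim more than one controller? */
static bool cmic_multi_ctrl(uint8_t cmic)
{
	return (cmic & CMIC_MULTI_CTRL) != 0;
}

/* Hypothetical helper: does the namespace claim to be shareable? */
static bool nmic_shared(uint8_t nmic)
{
	return (nmic & NMIC_SHARED) != 0;
}

/*
 * Hypothetical helper: a device "looks multipath capable" to the host if it
 * reports both a multi-controller subsystem and a shared namespace.  Note
 * that this says nothing about whether the device is physically
 * multi-ported; a single-ported device with the right firmware reports
 * the same bits.
 */
static bool looks_multipath_capable(uint8_t cmic, uint8_t nmic)
{
	return cmic_multi_ctrl(cmic) && nmic_shared(nmic);
}
```

The point of the sketch is that the host only sees these capability bits,
so it cannot distinguish a truly multi-ported device from a single-ported
one that sets them.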

>
> > > And he's right, the
> > > behavior of a PCIe hot plug is very different and often undesirable when
> > > it's under native multipath.
> >
> > If you do actual hotplug and expect the device to go away it's indeed
> > not desirable. If you want the same device to come back after switched
> > fabric issues it is so desirable that people hack to devices to get it.
> > People talked about adding a queue_if_no_path-like parameter to control
> > keeping the multipath node alive a lot, but no one has ever invested
> > work into actually implementing it.
>
> Not quite the same thing, but kind of related: I proposed this device
> missing debounce thing about a year ago:
>
> https://lore.kernel.org/linux-nvme/Y+1aKcQgbskA2tra@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Yes, that somehow fell off the cliff.