Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

From: Leon Romanovsky
Date: Sat Mar 20 2021 - 07:54:27 EST


On Fri, Mar 19, 2021 at 10:23:13AM -0600, Alex Williamson wrote:
> On Fri, 19 Mar 2021 14:59:47 +0200
> Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> > On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult wrote:
> > > On 18.03.21 18:22, Leon Romanovsky wrote:
> > >
> > > > Which email client do you use?
> > > > Your responses are grouped as one huge block without any chance to respond
> > > > to you on specific point or answer to your question.
> > >
> > > I'm reading this thread in Tbird, and threading / quoting all looks
> > > nice.
> >
> > I'm not talking about threading or quoting but about response itself.
> > See it here https://lore.kernel.org/lkml/20210318103935.2ec32302@xxxxxxxxxxxxxxxxxxxxx/
> > Alex's response is one big chunk without any separations to paragraphs.
>
> I've never known paragraph breaks to be required to interject a reply.

Of course not, but as Bjorn said if you don't do paragraphs, we will
need manually break your message, fix ">" quotation marks and half
sentences.

I just wanted to be sure that this is not my mail client.

>
> Back on topic...
>
> > >
> > > > I see your flow and understand your position, but will repeat my
> > > > position. We need to make sure that vendors will have incentive to
> > > > supply quirks.
>
> What if we taint the kernel or pci_warn() for cases where either all
> the reset methods are disabled, ie. 'echo none > reset_method', or any
> time a device specific method is disabled?

What does it mean "none"? Does it mean nothing supported? If yes, I think that
pci_warn() will be enough. At least for me, taint is usable during debug stages,
probably if device doesn't crash no one will look to see /proc/sys/kernel/tainted.

>
> I'd almost go so far as to prevent disabling a device specific reset
> altogether, but for example should a device specific reset that fixes
> an aspect of FLR behavior prevent using a bus reset? I'd prefer in that
> case if direct FLR were disabled via a device flag introduced with the
> quirk and the remaining resets can still be selected by preference.

I don't know enough to discuss the PCI details, but you raised good point.
This sysfs is user visible API that is presented as is from device point
of view. It can be easily run into problems if PCI/core doesn't work with
user's choice.

>
> Theoretically all the other reset methods work and are available, it's
> only a policy decision which to use, right?

But this patch was presented as a way to overcome situations where
supported != working and user magically knows which reset type to set.

If you want to take this patch to be policy decision tool,
it will need to accept "reset_type1,reset_type2,..." sort of input,
so fallback will work natively.

I think that it will be much more robust and cleaner solution than it is now.
Something like that:
cat /sys/..../reset_policy
reset_type1,reset_type2,...,reset_typeX
echo "reset_type3,reset_type1" > /sys/..../reset_policy
cat /sys/..../reset_policy
reset_type3,reset_type1

Thanks