Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism

From: Bjorn Helgaas
Date: Thu Jun 24 2021 - 12:56:05 EST


On Thu, Jun 24, 2021 at 08:42:42PM +0530, Amey Narkhede wrote:
> On 21/06/24 07:15AM, Bjorn Helgaas wrote:
> > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > Add reset_method sysfs attribute to enable user to
> > > query and set user preferred device reset methods and
> > > their ordering.
> >
> > > + Writing the name or comma separated list of names of any of
> > > + the device supported reset methods to this file will set the
> > > + reset methods and their ordering to be used when resetting
> > > + the device.
> >
> > > + while ((name = strsep(&options, ",")) != NULL) {
> > > + if (sysfs_streq(name, ""))
> > > + continue;
> > > +
> > > + name = strim(name);
> > > +
> > > + for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > + if (reset_methods[i] &&
> > > + sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > + reset_methods[i] = prio--;
> > > + break;
> > > + }
> > > + }
> > > +
> > > + if (i == PCI_RESET_METHODS_NUM) {
> > > + kfree(options);
> > > + return -EINVAL;
> > > + }
> > > + }
> >
> > Asking again since we didn't get this clarified before. The above
> > tells me that "reset_methods" allows the user to control the
> > *order* in which we try reset methods.
> >
> > Consider the following two uses:
> >
> > (1) # echo bus,flr > reset_methods
> >
> > (2) # echo flr,bus > reset_methods
> >
> > Do these have the same effect or not?
> >
> They have different effect.

I asked about this because Shanker's idea [1] of using two bitmaps
only keeps track of which resets are *enabled*. It does not keep
track of the *ordering*. Since you want to control the ordering, I
think we need more state than just the supported/enabled bitmaps.

> > If "reset_methods" allows control over the order, I expect them to
> > be different: (1) would try a bus reset and, if the bus reset
> > failed, an FLR, while (2) would try an FLR and, if the FLR failed,
> > a bus reset.
>
> Exactly you are right.
>
> Now the point I was presenting was with new encoding we have to
> write list of *all of the supported reset methods* in order for
> example, in above example flr,bus or bus,flr. We can't just write
> 'flr' or 'bus' then switch back to writing flr,bus/bus,flr (these
> have different effect as mentioned earlier).

It sounds like you're saying this sequence can't work:

# echo flr > reset_methods
# echo bus,flr > reset_methods

But I'm afraid you'll have to walk me through the reasons why this
can't be made to work.

> Basically with new encoding an user can't write subset of reset
> methods they have to write list of *all* supported methods
> everytime.

Why does the user have to write all supported methods? Is that to
preserve the fact that "cat reset_methods" always shows all the
supported methods so the user knows what's available?

I'm wondering why we can't do something like this (pidgin code):

if (option == "default") {
pci_init_reset_methods(dev);
return;
}

n = 0;
foreach method in option {
i = lookup_reset_method(method);
if (pci_reset_methods[i].reset_fn(dev, PROBE) == 0)
dev->reset_methods[n++] = i; # method i supported
}
dev->reset_methods[n++] = 0; # end of supported methods

If we did something like the above, the user could always find the
list of all methods supported by a device by doing this:

# echo default > reset_methods
# cat reset_methods

Yes, this does call the "probe" methods several times. I don't think
that's necessarily a problem.

Bjorn

[1] https://lore.kernel.org/r/1fb0a184-908c-5f98-ef6d-74edc602c2e0@xxxxxxxxxx