Re: [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem

From: Jonathan Cameron
Date: Fri May 17 2024 - 07:16:16 EST



Focusing on just one bit.

> > Now, the question of how many legacy scrub interfaces should be
> > considered in this design out of the gate is a worthwhile discussion. I
> > am encouraged that this ABI is at least trying to handle more than 1
> > backend, which makes me feel better that adding a 3rd and 4th might not
> > be prohibitive.
>
> See above.
>
> I'm perfectly fine with: "hey, we have a new scrub API interfacing to
> RAS scrub capability and it is *the* thing to use and all other hw scrub
> functionality should be shoehorned into it."
>
> So this thing's design should at least try to anticipate supporting
> other scrub hw.
>
> Because there's EDAC too. Why isn't this scrub thing part of EDAC? Why
> isn't this scrub API part of edac_core? I mean, this is all RAS so why
> design a whole new thing when the required glue is already there?
>
> We can just as well have a
>
> /sys/devices/system/edac/scrub/
>
> node hierarchy and have everything there.

A few questions about this. It seems an unusual use of fake devices and a bus,
so I'm trying to understand how we might do something that looks more standard
but perhaps also fits within the existing scheme. I appreciate this stuff
has evolved over a long time, so lots of backwards compatibility concerns.

If I follow this right the current situation is:

/sys/devices/system/edac is the 'virtual' device registered on the edac bus.

>
> Why does it have to be yet another thing?
>
> And if it needs to be separate, who's going to maintain it?
>
> > Which matches what I reacted to on the last posting:
> >
> > "Maybe it is self evident to others, but for me there is little in these
> > changelogs besides 'mechanism exists, enable it'"
> >
> > ...and to me that feedback was taken to heart with much improved
> > changelogs in this new posting.
>
> Ok.
>
> > This init time feature probing discussion feels like it was born from a
> > miscommunication / misunderstanding.
>
> Yes, it seems so, thanks for clarifying things.
>
> I still am unclear on the usecases and how this is supposed to be used
> and also, as mentioned above, we have a *lot* of RAS functionality
> spread around the kernel. Perhaps we should start unifying it instead of
> adding more...
>
> So the big picture and where we're headed to, needs to be clarified first.
>
> Thx.
>