Re: [PATCH v18 04/19] EDAC: Add memory repair control feature
From: Borislav Petkov
Date: Thu Jan 09 2025 - 10:21:54 EST
On Thu, Jan 09, 2025 at 02:24:33PM +0000, Jonathan Cameron wrote:
> To my thinking that would fail the test of being an intuitive interface.
> To issue a repair command requires that multiple attributes be configured
> before triggering the actual repair.
>
> Think of it as setting the coordinates of the repair in a high dimensional
> space.
Why?
You can write every attribute in its separate file and have a "commit" or
"start" file which does that.
Or you can designate a file which starts the process. This is how I'm
injecting errors on x86:
see readme_msg here: arch/x86/kernel/cpu/mce/inject.c
More specifically:
"flags:\t Injection type to be performed. Writing to this file will trigger a\n"
"\t real machine check, an APIC interrupt or invoke the error decoder routines\n"
"\t for AMD processors.\n"
So you set everything else, and as the last step you set the injection type
*and* you also trigger it with this one write.
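To make the pattern concrete, here is a minimal userspace sketch of "stage every attribute in its own file, then let one designated write start the operation". It is not the inject.c code and not the EDAC patch: struct repair_ctx, store_dpa(), store_bank() and store_repair() are hypothetical names, and the two attributes are picked only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the staged state behind the sysfs files. */
struct repair_ctx {
	uint64_t dpa;              /* staged attribute (illustrative) */
	uint32_t bank;             /* staged attribute (illustrative) */
	bool dpa_set;
	bool bank_set;
	unsigned int repairs_done; /* how many times the trigger fired */
};

/* Writing a plain attribute only stages the value; nothing is started. */
static int store_dpa(struct repair_ctx *ctx, uint64_t val)
{
	assert(ctx);
	ctx->dpa = val;
	ctx->dpa_set = true;
	return 0;
}

static int store_bank(struct repair_ctx *ctx, uint32_t val)
{
	assert(ctx);
	ctx->bank = val;
	ctx->bank_set = true;
	return 0;
}

/*
 * The designated trigger file: this write starts the operation with
 * whatever has been staged so far, and refuses to run if the
 * coordinates are incomplete.
 */
static int store_repair(struct repair_ctx *ctx, unsigned int val)
{
	assert(ctx);
	if (val != 1)
		return -EINVAL;
	if (!ctx->dpa_set || !ctx->bank_set)
		return -EINVAL;
	ctx->repairs_done++;
	return 0;
}
```

The order of the attribute writes does not matter in this model; only the write to the trigger file has a side effect.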
> Sure. In this case the addition of min/max was perhaps a wrong response to
> your request for a way to those ranges rather than just rejecting a write
> of something out of range as earlier version did.
>
> We can revisit in future if range discovery becomes necessary. Personally
> I don't think it is given we are only taking these actions in response error
> records that give us precisely what to write and hence are always in range.
My goal here was to make this user-friendly, because you need some way of
knowing what the valid ranges are in order to trigger the repair, if it needs
to happen for a range.
Or, you can teach the repair logic to ignore invalid ranges and "clamp" things
to whatever makes sense.
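The two policies can be put side by side in a small sketch. This is only an illustration of the trade-off, assuming a simple inclusive [min, max] range per attribute: struct attr_range, store_strict() and store_clamped() are hypothetical names, not anything from the patch set.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical inclusive range of values an attribute accepts. */
struct attr_range {
	uint64_t min;
	uint64_t max;
};

/* Policy A: reject an out-of-range write and leave *out untouched. */
static int store_strict(const struct attr_range *r, uint64_t val,
			uint64_t *out)
{
	assert(r && out && r->min <= r->max);
	if (val < r->min || val > r->max)
		return -ERANGE;
	*out = val;
	return 0;
}

/* Policy B: accept any write and clamp it to the nearest valid value. */
static int store_clamped(const struct attr_range *r, uint64_t val,
			 uint64_t *out)
{
	assert(r && out && r->min <= r->max);
	if (val < r->min)
		val = r->min;
	else if (val > r->max)
		val = r->max;
	*out = val;
	return 0;
}
```

With policy A the user has to learn the valid range from somewhere (min/max files, or trial and error). With policy B no range discovery is needed, but the value that takes effect may differ from the one written, so it should be readable back.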
Again, I'm looking at it from the usability perspective. I haven't actually
needed this scrub+repair functionality yet to know whether the UI makes sense.
So yeah, collecting some feedback from real-life use cases would probably give
you a lot better understanding of how that UI should be designed... perhaps
you won't ever need the ranges, who knows.
So yes, preemptively designing stuff like that "in the dark" is kinda hard.
:-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette