Re: [PATCH v18 00/19] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers
From: Mauro Carvalho Chehab
Date: Tue Jan 14 2025 - 09:08:15 EST
Em Mon, 13 Jan 2025 15:36:39 +0000
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> escreveu:
> > >
> > > 5. CXL features driver supporting ECS control feature.
> > > 6. ACPI RAS2 driver adds OS interface for RAS2 communication through
> > > PCC mailbox and extracts ACPI RAS2 feature table (RAS2) and
> > > create platform device for the RAS memory features, which binds
> > > to the memory ACPI RAS2 driver.
> > > 7. Memory ACPI RAS2 driver gets the PCC subspace for communicating
> > > with the ACPI compliant platform supports ACPI RAS2. Add callback
> > > functions and registers with EDAC device to support user to
> > > control the HW patrol scrubbers exposed to the kernel via the
> > > ACPI RAS2 table.
> > > 8. Support for CXL maintenance mailbox command, which is used by
> > > CXL device memory repair feature.
> > > 9. CXL features driver supporting PPR control feature.
> > > 10. CXL features driver supporting memory sparing control feature.
> > > Note: There are other PPR, memory sparing drivers to come.
> >
> > The text above should be inside Documentation, and not on patch 0.
> >
> > A big description like that makes hard to review this series. It is
> > also easier to review the text after having it parsed by kernel doc
> > build specially for summary tables like the "Comparison of scrubbing
> > features", which deserves ReST links processed by Sphinx to the
> > corresponding definitions of the terms that are be compared there.
>
> Whilst I fully agree that having a huge cover letter makes for a burden
> for any reviewer coming to the series, this is here at specific request
> of reviewers.
Ok, then. Yet, even for them it would be very hard to track what
changes from v19 to the next versions if you change something at
patch 00.
> We can look at keeping more of it in documentation though
> it's a bit white paper like in comparison with what I'd normally expect
> to see in kernel documentation.
Personally, I like comprehensive documentation at the Kernel.
> >
> > > Open Questions based on feedbacks from the community:
> > > 1. Leo: Standardize unit for scrub rate, for example ACPI RAS2 does not define
> > > unit for the scrub rate. RAS2 clarification needed.
> >
> > I noticed the same when reviewing a patch series for rasdaemon. Ideally,
> > ACPI requires an errata defining what units are expected for scrub rate.
>
> There is a code first ACPI ECN that indeed adds units. That is accepted
> for next ACPI specification release.
>
> Seems the tianocore bugzilla is unhelpfully down for a migration
> but it should be id 1013 at bugzilla.tianocore.com
>
> That adds a detailed description of what the scrub rate settings mean but
> we may well still have older platforms where the scaling is arbitrary.
> The units defined are sufficient to map to whatever presentation we like.
>
> > While ACPI doesn't define it, better to not add support for it - or be
> > conservative using a low granularity for it (like using minutes instead
> > of hours).
>
> I don't mind changing this, though for systems we are aware of default scrub
> is typically once or twice in 24 hours.
Yes, I noticed that we're using seconds after reading other patches.
It sounds OK to me to keep it as-is.
It is really unlikely that we would ever have scrubbing finishing in less
than a second.
Thanks,
Mauro