Re: [PATCH 2/3] cxl/mbox: Add GET_POISON_LIST mailbox command support
From: Alison Schofield
Date: Thu Jun 16 2022 - 18:11:10 EST
David - you make lots of good points; one quick comment at the end...
On Thu, Jun 16, 2022 at 02:47:40PM -0700, Davidlohr Bueso wrote:
> On Thu, 16 Jun 2022, Alison Schofield wrote:
> >I'm headed in this direction -
>
> I like these interfaces, btw.
>
> >cxl list --media-errors -m mem1
> > lists media errors for requested memdev
>
> But in this patchset you're only listing for persistent configurations.
> So if there is a volatile partition, or the whole device is volatile,
> this would not consider that.
>
> So unless I'm missing something, we need to consider ram_range as well.
>
> >cxl list --media-errors -r region#
> > lists region errors with HPA addresses
> > (So here the cxl tool will collect the poison for all of the region's
> > memdevs and do the DPA to HPA translation)
>
> I was indeed thinking along these lines. But similar to the above,
> the region driver also has plans to enumerate volatile regions
> configured by BIOS.
>
> >
> >To answer your question, I wasn't thinking of limiting
> >the range within the memdev, but certainly could. And if we were
> >taking in ranges, those ranges would need to be checked.
>
> My question was originally considering poisoning only within pmem DPA
> ranges, but now I'm wondering if all this also applies equally to volatile
> parts as well... Reading the spec I interpret both, but reading the
> T3 Memory Device Software Guide '2.13.19' it only mentions persistent
> capacity.
>
> >
> >$cxl list --media-errors -m mem1 --range-start= --range-end|len=
>
> I figure this is kind of like the above, with regions being very arbitrary
> and dynamic.
>
> >Now, if I left the sysfs interface as is, the driver will read the
> >entire poison list for the memdev and then cxl tool will filter it
> >for the range requested.
> >
> >Or, maybe we should implement in libcxl (not sysfs), with memdev and
> >range options and only collect from the device the range requested.
>
> I wonder if the latter may be the better option considering that always
> scanning the entire memdev would cause unnecessary media scan wait times,
> especially for large capacities.
This is not a Media Scan. This is only reading the existing Poison List.
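
For the first option quoted above (the driver returns the whole list and
the tool filters it), the user-space side could look roughly like the
sketch below. This is only an illustration: struct poison_rec and
poison_filter_range() are hypothetical names, not anything in this
patchset or in libcxl, and it assumes each record is a simple
(start DPA, length in bytes) pair whose end does not overflow 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one poisoned extent in device physical address space. */
struct poison_rec {
	uint64_t dpa;	/* start DPA of the poisoned extent */
	uint64_t len;	/* length of the extent in bytes */
};

/*
 * Copy every record that overlaps [start, start + len) from @in to @out
 * and return how many were kept. @out must have room for @n records.
 * Assumes start + len and dpa + len do not overflow.
 */
static size_t poison_filter_range(const struct poison_rec *in, size_t n,
				  uint64_t start, uint64_t len,
				  struct poison_rec *out)
{
	size_t i, kept = 0;

	assert(in && out);

	for (i = 0; i < n; i++) {
		uint64_t rec_end = in[i].dpa + in[i].len;

		/* Skip empty records; they cannot overlap anything. */
		if (in[i].len == 0)
			continue;

		/* Half-open interval overlap test. */
		if (in[i].dpa < start + len && rec_end > start)
			out[kept++] = in[i];
	}

	return kept;
}
```

The second option would move the same range check in front of the
mailbox command, so that only the requested DPA range is read from the
device in the first place.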
>
> Thanks,
> Davidlohr