Re: [PATCH] acpi/nfit: badrange report spill over to clean range

From: Jane Chu
Date: Fri Jul 15 2022 - 13:38:38 EST

On 7/14/2022 5:58 PM, Dan Williams wrote:
>>>>> However, the ARS engine likely can return the precise error ranges so I
>>>>> think the fix is to just use the address range indicated by 1UL <<
>>>>> MCI_MISC_ADDR_LSB(mce->misc) to filter the results from a short ARS
>>>>> scrub request to ask the device for the precise error list.
>>>> You mean for the nfit_handle_mce() callback to issue a short ARS for each
>>>> poison report over a 4K range?
>>> Over a L1_CACHE_BYTES range...
>>> For the badrange tracking, no. So this would just be a check to say
>>> "Yes, CPU, I see you think the whole 4K is gone, but let's double check
>>> with more precise information for what gets placed in the badrange
>>> tracking".
>> Okay, process-wise, this is what I am seeing -
>> - for each poison, nfit_handle_mce() issues a short ARS given (addr,
>> 64bytes)
> Why would the short-ARS be performed over a 64-byte span when the MCE
> reported a 4K aligned event?

Cuz you said so, see above. :) Yes, 4K range as reported by the MCE
makes sense.
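
For reference, a minimal userspace sketch of how the span of an MCE report
follows from the LSB field quoted above. The MCI_MISC_ADDR_LSB() definition
matches arch/x86/include/asm/mce.h; struct span and mce_report_span() are
hypothetical names used only for illustration, not existing kernel code.

```c
#include <stdint.h>

/* Same definition as arch/x86/include/asm/mce.h: low 6 bits of MCi_MISC. */
#define MCI_MISC_ADDR_LSB(m) ((m) & 0x3f)

/* Hypothetical type for this sketch: a physical address range. */
struct span {
	uint64_t start;
	uint64_t len;
};

/*
 * Hypothetical helper: the range an MCE report covers is 1 << LSB bytes,
 * aligned down to that same size.
 */
static struct span mce_report_span(uint64_t addr, uint64_t misc)
{
	struct span s;

	s.len = 1ULL << MCI_MISC_ADDR_LSB(misc);
	s.start = addr & ~(s.len - 1);
	return s;
}
```

With an LSB of 12 the report covers an aligned 4K range; with an LSB of 6 it
covers a single 64-byte cache line.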

>> - and short ARS returns to say that's actually (addr, 256bytes),
>> - and then nvdimm_bus_add_badrange() logs the poison in (addr, 512bytes)
>> anyway.
> Right, I am reacting to the fact that the patch is picking 512 as an
> arbitrary blast radius. It's ok to expand the blast radius from
> hardware when, for example, recording a 64-byte MCE in badrange which
> only understands 512 byte records, but it's not ok to take a 4K MCE and
> trim it to 512 bytes without asking hardware for a more precise report.


> Recall that the NFIT driver supports platforms that may not offer ARS.
> In that case the 4K MCE from the CPU is all that the driver gets and
> there is no data source for a more precise answer.
> So the ask is to avoid trimming the blast radius of MCE reports unless
> and until a short-ARS says otherwise.

What happens to short ARS on a platform that doesn't support ARS?
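
To check that I am reading the ask correctly, here is a rough userspace
sketch of the policy as I understand it: keep the full MCE span unless a
short ARS supplies something more precise, and round whatever is recorded
outward to 512-byte records. Everything here is an assumption made for
illustration: struct poison_span, pick_badrange() and BADRANGE_GRANULE are
hypothetical names, and treating "no usable ARS result" the same as "ARS not
supported" is my guess, not something the driver does today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: badrange records are 512 bytes, as discussed in this thread. */
#define BADRANGE_GRANULE 512ULL

/* Hypothetical type for this sketch: a physical address range. */
struct poison_span {
	uint64_t start;
	uint64_t len;
};

/*
 * Hypothetical policy helper, not existing kernel code.
 * mce:      span reported by the MCE
 * have_ars: true only if a short ARS ran and returned a record
 * ars:      span reported by the short ARS (ignored if !have_ars)
 */
static struct poison_span pick_badrange(struct poison_span mce, bool have_ars,
					struct poison_span ars)
{
	uint64_t start = mce.start;
	uint64_t end = mce.start + mce.len;
	struct poison_span out;

	if (have_ars && ars.len) {
		uint64_t ars_end = ars.start + ars.len;
		uint64_t s = ars.start > start ? ars.start : start;
		uint64_t e = ars_end < end ? ars_end : end;

		/* Trim only when ARS overlaps the MCE span. */
		if (s < e) {
			start = s;
			end = e;
		}
	}

	/* Round outward so the record never under-reports poison. */
	start &= ~(BADRANGE_GRANULE - 1);
	end = (end + BADRANGE_GRANULE - 1) & ~(BADRANGE_GRANULE - 1);

	out.start = start;
	out.len = end - start;
	return out;
}
```

In this sketch a 4K MCE with no ARS answer stays 4K, a 4K MCE with an ARS
answer of (addr + 0x40, 256 bytes) becomes one 512-byte record, and a 64-byte
MCE with no ARS answer is expanded to one 512-byte record.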

>> The precise badrange from short ARS is lost in the process. Given the
>> time spent visiting the BIOS, what's the gain?
> Generic support for not under-recording poison on platforms that do not
> support ARS.
>> Could we defer the precise badrange until there is a consumer of the
>> information?
> Ideally the consumer is immediate and this precise information can make
> it to the filesystem which might be able to make a better decision about
> what data got clobbered.
> See dax_notify_failure() infrastructure currently in linux-next that can
> convey poison events to filesystems. That might be a path to start
> tracking and reporting precise failure information to address the
> constraints of the badrange implementation.

Yes, I'm aware of dax_notify_failure(), but I would appreciate it if you
could elaborate on how that code path could be leveraged for a precise
badrange implementation.
My understanding is that dax_notify_failure() is in the path of a
synchronous fault, accompanied by SIGBUS with BUS_MCEERR_AR.
But a badrange could be recorded without the poison being consumed, even
without a DAX filesystem in the picture.