Re: [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag

From: Jane Chu
Date: Fri Oct 22 2021 - 16:53:10 EST


On 10/21/2021 10:36 PM, Christoph Hellwig wrote:
> On Fri, Oct 22, 2021 at 01:37:28AM +0000, Jane Chu wrote:
>> On 10/21/2021 4:31 AM, Christoph Hellwig wrote:
>>> Looking over the series I have serious doubts that overloading the
>>> slow path clear poison operation over the fast path read/write
>>> path is such a great idea.
>>>
>>
>> Understood, sounds like a concern on principle. But it seems to me
>> that the task of recovery overlaps with the normal write operation
>> on the write part. Without overloading some write operation for
>> 'recovery', I guess we'll need to come up with a new userland
>> command coupled with a new dax API ->clear_poison and propagate the
>> new API support to each dm targets that support dax which, again,
>> is an idea that sounds too bulky if I recall Dan's earlier rejection
>> correctly.
>
> When I wrote the above I mostly thought about the in-kernel API, that
> is use a separate method. But reading your mail and thinking about
> this a bit more I'm actually less and less sure that overloading
> pwritev2 and preadv2 with this at the syscall level makes sense either.
> read/write are our I/O fast path. We really should not overload the
> core of the VFS with error recovery for a broken hardware interface.
>

Thanks - I try to be honest. As far as I can tell, the argument
about the flag is a philosophical one between two views.
One view assumes a design based on perfect hardware, where media
errors belong to the category of brokenness. The other view sees
media errors as a built-in property of the hardware and designs the
software to deal with them.

Back when I was fresh out of school, a senior engineer explained
to me that media errors might be caused by cosmic rays hitting
the media at whatever frequency and at whatever time. The point was
that media errors within a certain range are a fact of the product,
and to me that argues for building normal software components with
errors in mind from the start. I guess I'm trying to articulate why
it is acceptable to add the RWF_RECOVERY_DATA flag to the
existing RWF_ flags: this way, pwritev2 remains fast on the fast path,
and its slow path (w/ error clearing) is faster than the alternative.
The alternative is one system call to clear the poison and
another system call to run the fast pwrite for recovery. What
happens if something happens in between the two?
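
To make the comparison concrete, here is a rough userland sketch of
the single-call idea. It is only an illustration under assumptions:
SKETCH_RWF_RECOVERY_DATA is a placeholder bit standing in for the
proposed flag (the value is made up and is not in any mainline
header), write_with_recovery() and roundtrip_demo() are hypothetical
helper names, and treating EIO as "the range is poisoned" is my
assumption about how the failure would surface to the caller.

```c
#define _GNU_SOURCE	/* must precede every #include to expose pwritev2() */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * ASSUMPTION: placeholder bit for the flag proposed in this series.
 * It is not part of the mainline uapi headers; the real value may differ.
 */
#define SKETCH_RWF_RECOVERY_DATA 0x40000000

/*
 * Hypothetical helper. Try a plain write first (fast path). Only if that
 * fails with EIO (assumed here to signal poison) retry with the recovery
 * flag, so that clearing and rewriting happen inside one system call.
 */
ssize_t write_with_recovery(int fd, const void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t n;

	assert(buf != NULL);	/* a NULL buffer is a caller bug, not a media error */

	n = pwritev2(fd, &iov, 1, off, 0);
	if (n < 0 && errno == EIO)
		n = pwritev2(fd, &iov, 1, off, SKETCH_RWF_RECOVERY_DATA);
	return n;
}

/* Round trip on an ordinary temp file; only the fast path is exercised. */
int roundtrip_demo(void)
{
	char path[] = "/tmp/rwf_sketch_XXXXXX";
	const char msg[] = "recovered data";
	char back[sizeof(msg)] = { 0 };
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return -1;
	ok = write_with_recovery(fd, msg, sizeof(msg), 0) == (ssize_t)sizeof(msg) &&
	     pread(fd, back, sizeof(back), 0) == (ssize_t)sizeof(back) &&
	     memcmp(msg, back, sizeof(msg)) == 0;
	close(fd);
	unlink(path);
	return ok ? 0 : -1;
}
```

On a regular file the retry branch is never taken, so the common case
costs exactly one pwritev2 call. On a kernel without this series I
would expect the flagged retry to be rejected as an unsupported flag
rather than to clear anything.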

thanks!
-jane