Re: [PATCH 0/6] use memcpy_mcsafe() for copy_to_iter()

From: Dan Williams
Date: Tue May 01 2018 - 19:03:11 EST


On Tue, May 1, 2018 at 2:05 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, May 1, 2018 at 1:55 PM Dan Williams <dan.j.williams@xxxxxxxxx>
> wrote:
>
>> The result of the bypass is that the kernel treats machine checks during
>> read as system fatal (reboot) when they could simply be flagged as an
>> I/O error, similar to performing reads through the pmem driver. Prevent
>> this fatal condition by deploying memcpy_mcsafe() in the fsdax read
>> path.
>
> How about just changing the rules, and go the old "Don't do that then" way?
>
> IOW, get rid of the whole idea that MCS errors should be fatal. It's wrong
> and pointless anyway.
>
> The whole approach seems fundamentally buggered, if you ever want to mmap
> one of these things. And don't you want that?
>
> So why continue down a fundamentally broken path?

I'm confused. Are you talking about getting rid of the block-layer
bypass or changing how MCS errors are handled? If it's the former I've
gotten pushback in the past trying to remove the bypass, but I feel
better about my chances to slay that beast wielding the +5 Hammer of
Linus. If it's the latter, MCS error handling, I don't see how to get
around something like copy_to_iter_mcsafe().

You mention mmap. Yes, we want the predominant access model to be
dax-mmap for Persistent Memory, but there's still the question about
what to do with media errors. To date we are trying to mirror the
error handling model for System Memory, i.e. SIGBUS to the process
that consumed the error. Is that error handling model also problematic
in your view?