Re: [PATCH 0/6] use memcpy_mcsafe() for copy_to_iter()
From: Dan Williams
Date: Tue May 01 2018 - 19:31:53 EST
On Tue, May 1, 2018 at 4:28 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> On Tue, May 1, 2018 at 4:02 PM Dan Williams <dan.j.williams@xxxxxxxxx>
> wrote:
>
>> On Tue, May 1, 2018 at 2:05 PM, Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> > On Tue, May 1, 2018 at 1:55 PM Dan Williams <dan.j.williams@xxxxxxxxx>
>> > wrote:
>> >
>> >> The result of the bypass is that the kernel treats machine checks
>> >> during read as system fatal (reboot) when they could simply be
>> >> flagged as an I/O error, similar to performing reads through the
>> >> pmem driver. Prevent this fatal condition by deploying
>> >> memcpy_mcsafe() in the fsdax read path.
>> >
>> > How about just changing the rules, and go the old "Don't do that then"
>> > way?
>> >
>> > IOW, get rid of the whole idea that MCS errors should be fatal. It's
>> > wrong and pointless anyway.
>> >
>> > The whole approach seems fundamentally buggered, if you ever want to
>> > mmap one of these things. And don't you want that?
>> >
>> > So why continue down a fundamentally broken path?
>
>> I'm confused. Are you talking about getting rid of the block-layer
>> bypass or changing how MCS errors are handled? If it's the former I've
>> gotten push back in the past trying to remove the bypass, but I feel
>> better about my chances to slay that beast wielding the +5 Hammer of
>> Linus. If it's the latter, MCS error handling, I don't see how to get
>> around something like copy_to_iter_mcsafe().
>
>> You mention mmap. Yes, we want the predominant access model to be
>> dax-mmap for Persistent Memory, but there's still the question about
>> what to do with media errors. To date we are trying to mirror the
>> error handling model for System Memory, i.e. SIGBUS to the process
>> that consumed the error. Is that error handling model also problematic
>> in your view?
>
> I'm not sure exactly what you mean here, but my understanding of the status
> quo is that memory errors in user code are non-fatal but that memory errors
> in kernel code are fatal unless there's an appropriate extable entry. The
> old iov_iter code assumes that memcpy() on kernel addresses can't fail.
> I'm not sure how else it could work.

Right, I'm trying to clarify the "IOW, get rid of the whole idea that
MCS errors should be fatal" comment. Especially as I am about to go
fix memory_failure() to understand that ZONE_DEVICE pages != typical
"struct page", and do the right thing with respect to un-mapping
userspace dax mapped pages.
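
For illustration only, here is a rough user-space model of the
distinction discussed above: a copy that cannot report failure versus
one that reports how many bytes it could not copy, so the caller can
turn the shortfall into an I/O error. This is a sketch under stated
assumptions, not the kernel implementation. The names
model_copy_mcsafe() and model_read() and the poison_off parameter are
hypothetical, and a real machine check is simulated by simply refusing
to read at or past poison_off.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of a machine-check-safe copy.
 * Bytes at offsets >= poison_off are treated as unreadable; pass
 * poison_off >= n to model a source with no poison in range.
 * Returns the number of bytes NOT copied (0 on full success).
 */
static size_t model_copy_mcsafe(unsigned char *dst, const unsigned char *src,
                                size_t n, size_t poison_off)
{
    /* Only the prefix before the poisoned offset is safe to read. */
    size_t good = poison_off < n ? poison_off : n;

    memcpy(dst, src, good);
    return n - good;
}

/*
 * Hypothetical read-style wrapper: returns the number of bytes copied
 * (possibly short), or -EIO if the very first byte was poisoned.
 */
static long model_read(unsigned char *dst, const unsigned char *src,
                       size_t n, size_t poison_off)
{
    size_t rem = model_copy_mcsafe(dst, src, n, poison_off);

    if (n != 0 && rem == n)
        return -EIO;
    return (long)(n - rem);
}
```

A plain memcpy() has no way to express the second outcome, which is
why a caller that assumes the copy cannot fail has nothing to
propagate when the source turns out to be bad.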