Re: [PATCH v13] x86, mce: Add memcpy_trap()
From: Andy Lutomirski
Date: Thu Feb 25 2016 - 20:20:13 EST
On Thu, Feb 25, 2016 at 4:58 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Feb 25, 2016 at 2:11 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>
>> do_machine_check uses IST, the memory failure code can sleep, and you
>> can't sleep in IST context. There's a special escape that lets
>> memory_failure sleep *if* it came from user mode.
>
> So?
>
[...]
Then let's answer the API question instead of the implementation question.
If a user program accesses a bad virtual address directly, it gets
SIGSEGV or SIGBUS depending on the nature of the error. This is
long-established practice. The SIGSEGV case is programmer error and
the SIGBUS case might be an IO error.
If a user program accesses a bad virtual address by passing the
address to a syscall, it gets EFAULT. This may be programmer error or
an underlying IO error, and the program can't tell.
If a user program accesses a bad address on an NVDIMM via mmap, what
should happen? If mmaped NVDIMM (or other DAX space) is the same as
existing poisoned memory, the program gets SIGBUS. This still makes
sense.
The question here: what happens if a program accesses a bad NVDIMM
address by passing a pointer to a syscall? With Tony's patches as
written, I think the program gets SIGBUS via memory_failure. Do we
want that behavior? If we take your suggestion and change only the
error code, then the program will *not* get SIGBUS. Instead it will
get -EFAULT or -ESOMETHINGELSE. Is that okay? If it is, then
everything is straightforward and nothing in my previous email is
relevant.
--Andy