RE: [PATCHV2 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

From: Luck, Tony
Date: Fri Dec 11 2015 - 16:19:22 EST


> I still don't get the BIT(63) thing. Can you explain it?

It will be more obvious when I get around to writing copy_from_user().

Then we will have a function that can take page faults if there are pages
that are not present. If the page faults can't be fixed we have a -EFAULT
condition. We can also take machine checks if we read from a location with an
uncorrected error.

We need to distinguish these two cases because the action we take is
different. For the unresolved page fault we already have the ABI that the
copy_to/from_user() functions return zero for success, and a non-zero
return is the number of not-copied bytes.

So for my new case I'm setting bit 63 ... this is never going to be set for
a failed page fault, because a count of not-copied bytes can never be that
large.
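
A rough user-space sketch of that encoding follows. The macro bodies are an assumption about how COPY_HAD_MCHECK() and COPY_MCHECK_PADDR() could be defined, and COPY_MCHECK_ERRBIT, mcheck_ret() and fault_ret() are hypothetical names used only for illustration; none of this is taken from the patch itself.

```c
#include <stdint.h>

/* Assumed layout: bit 63 flags a machine check, the low bits carry the payload. */
#define COPY_MCHECK_ERRBIT	(UINT64_C(1) << 63)
#define COPY_HAD_MCHECK(ret)	(((ret) & COPY_MCHECK_ERRBIT) != 0)
#define COPY_MCHECK_PADDR(ret)	((ret) & ~COPY_MCHECK_ERRBIT)

/* Hypothetical helper: return value for a machine check at physical address paddr. */
static inline uint64_t mcheck_ret(uint64_t paddr)
{
	return COPY_MCHECK_ERRBIT | paddr;
}

/* Hypothetical helper: return value for an unresolved page fault. */
static inline uint64_t fault_ret(uint64_t bytes_not_copied)
{
	/* A byte count stays far below 2^63, so bit 63 remains clear. */
	return bytes_not_copied;
}
```

With this layout a caller can test a single bit to tell the two failures apart, and a plain zero still means success.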

copy_from_user() conceptually will look like this:

int copy_from_user(void *to, void *from, unsigned long n)
{
	u64 ret = mcsafe_memcpy(to, from, n);

	if (COPY_HAD_MCHECK(ret)) {
		/* machine check: low bits of ret hold the physical address */
		if (memory_failure(COPY_MCHECK_PADDR(ret) >> PAGE_SHIFT, ...))
			force_sig(SIGBUS, current);
		return something;
	} else
		/* page fault or success: number of bytes not copied */
		return ret;
}

-Tony