Re: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

From: Linus Torvalds
Date: Fri May 01 2020 - 14:28:59 EST


On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> However now I see that copy_user_generic() works for the wrong reason.
> It works because the exception on the source address due to poison
> looks no different than a write fault on the user address to the
> caller, it's still just a short copy. So it makes copy_to_user() work
> for the wrong reason relative to the name.

Right.

And it won't work that way on other architectures. On x86, we have a
generic function that can take faults on either side, and we use it
for both cases (and for the "in_user" case too), but that's an
artifact of the architecture oddity.

In fact, it's probably wrong even on x86 - because it can hide bugs -
but writing those things is painful enough that everybody prefers
having just one function.

That's particularly true since that "one function" is actually a whole
family of functions - just for different optimizations.

Plus on x86 you can't reasonably even have different code sequences
for that case, because CLAC/STAC don't have a "enable users read
accesses" vs "write accesses" case. It's an all-or-nothing "enable
user faults".

We _used_ to have a difference on x86, back when we did the whole "fs
segment points to user space".

But on other architectures, there really is a difference between
"copy_to_user()" and "copy_from_user()", and the functions won't do
fault handling for the kernel side accesses.

> How about, following your suggestion, introduce copy_mc_to_user() (can
> just use copy_user_generic() internally) and copy_mc_to_kernel() for
> the other the helpers that the copy_to_iter() implementation needs?

Yes. That at least solves my naming worries, and is conceptually
something that works on other architectures.

Those other architectures may not have nvdimm support yet, but I think
everybody is at least looking at it.

And I really do think it will make the users more readable too, when
you see on a source level that "oh, this code is expecting that it
could take a poison fault/machine check on the source/destination".

> Following Jann's ex_handler_uaccess() example I could arrange for
> copy_mc_to_kernel() to use a new _ASM_EXTABLE_MC() to validate that
> the only type of exception meant to be handled is MC and warn
> otherwise?

That may be a good idea, but won't work for any shared case.

IOW, it would be lovely to have a "copy_mc_to_user()" check that if
it's a write fault, it's because it's a user address (and if it's a
read fault it's because it's a poisoned page or mc or whatever, but a
valid kernel address).

But it's exactly the kind of thing that we currently don't do even for
the bog-standard "copy_to_user()", because we share all the code
because we're lazy.

And as DavidL pointed out - if you ever have "iomem" as a source or
destination, you need yet another case. Not because they can take
another kind of fault (although on some platforms you have the machine
checks for that too), but because they have *very* different
performance profiles (and the ERMS "rep movsb" sucks baby donkeys
through a straw).

Linus