Re: [PATCH] [iov_iter] use memmove() when copying to/from user page
From: Dmitry Vyukov
Date: Tue May 16 2017 - 16:11:22 EST
On Tue, May 16, 2017 at 12:37 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>> >> It's possible that calling sendfile() to copy the data from a memfd to
>> >> itself may result in doing a memcpy() with overlapping arguments.
>> >> To avoid undefined behavior here, replace memcpy() with memmove() and
>> >> rename memcpy_to_page()/memcpy_from_page() accordingly.
>> >
>> > Er... And what semantics would you assign to such sendfile()? I really
>> > want to see details, because it sounds like memmove() here will not be
>> > any more useful than memcpy() - you still can easily get odd behaviour.
>>
>>
>> What odd behavior can we get with memmove?
>>
>> The case I am thinking of is when you want to delete part of a file
>> in the middle. To do that you move the tail of the file and then truncate.
>> Memmove will do the intended thing, while memcpy can lose some data and
>> duplicate other data.
>
> Oh, lovely. While we are trading idiotic use cases - what about inserting
> something in the middle of a file? No? Why is it any different?
I have never used this myself, but moving part of a file does not look
completely idiotic.
What do you mean by inserting into the middle? How is it related?
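To make the "delete from the middle" case concrete, here is a minimal
user-space sketch. It assumes an in-memory buffer standing in for the file
contents, and delete_range() is a hypothetical helper invented for this
illustration, not anything in the kernel or in the patch. The point is only
that source and destination overlap, so memmove() is the call with defined
behaviour:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): drop len bytes starting at
 * off from a buffer holding size bytes, and return the new logical size.
 * The caller is expected to "truncate" to the returned size afterwards.
 */
static size_t delete_range(char *buf, size_t size, size_t off, size_t len)
{
	/* The range to delete must lie inside the buffer. */
	assert(off <= size && len <= size - off);

	/*
	 * The tail is moved down over the hole. The source [off + len, size)
	 * and the destination [off, size - len) overlap whenever the tail is
	 * longer than the hole, so memcpy() would be undefined here.
	 */
	memmove(buf + off, buf + off + len, size - off - len);
	return size - len;
}
```

With buf holding "0123456789", delete_range(buf, 10, 2, 3) moves "56789"
down by three bytes and returns 7, leaving "0156789" in the first 7 bytes.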
> There are two sides to it:
> * real nasal demons resulting from that memcpy() with overlapping
> source and destination - as in, "it not only trashed the page contents,
> it has led to memory corruption/leaked data/etc". Any such would be a real
> problem.
Let's put this aside.
> * behaviour of sendfile() in such a case. And there I've no problem
> with saying "contents after operation is undefined". If you wish to change
> that, by all means start with documenting the semantics you want to promise
> to userland.
I would say it's already documented.
sendfile says that it "copies data". memmove says that it "copies
data". memcpy says that it "copies data, but data must not overlap".
sendfile does not say that "data must not overlap".
But in the end I don't understand why you are so opposed to this change.
Arguing about saving a single branch can make sense at a very low
level. But when we are talking about syscalls and syscall semantics, a
single branch does not matter, while providing a higher quality of
implementation and following the principle of least surprise do
matter.
Let's imagine we had memmove there now. Would you argue strongly that
we need to change it to memcpy?