Re: [PATCH v1] mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW
From: Linus Torvalds
Date: Tue Aug 09 2022 - 16:03:03 EST
On Mon, Aug 8, 2022 at 12:32 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
So I've read through the patch several times, and it seems fine, but
this function (and the pmd version of it) just read oddly to me.
> +static inline bool can_follow_write_pte(pte_t pte, struct page *page,
> + struct vm_area_struct *vma,
> + unsigned int flags)
> +{
> + if (pte_write(pte))
> + return true;
> + if (!(flags & FOLL_FORCE))
> + return false;
> +
> + /*
> + * See check_vma_flags(): only COW mappings need that special
> + * "force" handling when they lack VM_WRITE.
> + */
> + if (vma->vm_flags & VM_WRITE)
> + return false;
> + VM_BUG_ON(!is_cow_mapping(vma->vm_flags));
So apart from the VM_BUG_ON(), this code just looks really strange -
even despite the comment. Just conceptually, the whole "if it's
writable, return that you cannot follow it for a write" just looks so
very very strange.
That doesn't make the code _wrong_, but considering how many times
this has had subtle bugs, let's not write code that looks strange.
So I would suggest that to protect against future bugs, we try to make
it be fairly clear and straightforward, and maybe even a bit overly
protective.
For example, let's kill the "shared mapping that you don't have write
permissions to" very explicitly and without any subtle code at all.
The vm_flags tests are cheap and easy, and we could very easily just
add some core ones to make any mistakes much less critical.
Now, making that 'is_cow_mapping()' check explicit at the very top of
this would already go a long way:
/* FOLL_FORCE for writability only affects COW mappings */
if (!is_cow_mapping(vma->vm_flags))
return false;
but I'd actually go even further: in this case that "is_cow_mapping()"
helper to some degree actually hides what is going on.
So I'd actually prefer for that function to be written something like
/* If the pte is writable, we can write to the page */
if (pte_write(pte))
return true;
/* Maybe FOLL_FORCE is set to override it? */
if (flags & FOLL_FORCE)
return false;
/* But FOLL_FORCE has no effect on shared mappings */
if (vma->vm_flags & MAP_SHARED)
return false;
/* .. or read-only private ones */
if (!(vma->vm_flags & MAP_MAYWRITE))
return false;
/* .. or already writable ones that just need to take a write fault */
if (vma->vm_flags & MAP_WRITE)
return false;
and the two first vm_flags tests above are basically doing tat
"is_cow_mapping()", and maybe we could even have a comment to that
effect, but wouldn't it be nice to just write it out that way?
And after you've written it out like the above, now that
if (!page || !PageAnon(page) || !PageAnonExclusive(page))
return false;
makes you pretty safe from a data sharing perspective: it's most
definitely not a shared page at that point.
So if you write it that way, the only remaining issues are the magic
special soft-dirty and uffd ones, but at that point it's purely about
the semantics of those features, no longer about any possible "oh, we
fooled some shared page to be writable".
And I think the above is fairly legible without any subtle cases, and
the one-liner comments make it all fairly clear that it's testing.
Is any of this in any _technical_ way different from what your patch
did? No. It's literally just rewriting it to be a bit more explicit in
what it is doing, I think, and it makes that odd "it's not writable if
VM_WRITE is set" case a bit more explicit.
Hmm?
Linus