Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

From: Yu Zhao
Date: Tue Dec 22 2020 - 19:25:57 EST


On Tue, Dec 22, 2020 at 04:01:45PM -0800, Linus Torvalds wrote:
> On Tue, Dec 22, 2020 at 3:50 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > See zap_pte_range() for an example of doing it right, even in the
> > presence of complexities (ie that has an example of both flushing the
> > TLB, and doing the actual "free the pages after flush", and it does
> > the two cases separately).
>
> The more I look at the mprotect code, the less I like it. We seem to
> be much better about the TLB flushes in other places (looking at
> mremap, for example). The mprotect code seems to be very laissez-faire
> about the TLB flushing.
>
> Does adding a TLB flush to before that
>
> pte_unmap_unlock(pte - 1, ptl);
>
> fix things for you?

It definitely does. But if I had to choose, I'd go with holding
mmap_lock for write because 1) it's less likely to storm other CPUs by
IPI and would only have performance impact on processes that use ufd,
which I guess already have high tolerance for not-so-good performance,
and 2) people are spearheading multiple efforts to reduce the mmap_lock
contention, which hopefully would make ufd users suffer less soon.

> That's not the right fix - leaving a stale TLB entry around is fine if
> the TLB entry is more strict wrt protections - but it might be worth
> testing as a "does it at least close the problem" patch.

Well, things get trick if we do this. I'm not sure if I could vouch
such a fix for stable as confident as I do holding mmap_lock for
write.