Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?

From: Jann Horn
Date: Tue Mar 10 2020 - 15:12:14 EST


On Tue, Mar 10, 2020 at 7:48 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> On Tue 10-03-20 19:08:28, Jann Horn wrote:
> > Hi!
> >
> > From looking at the source code, it looks to me as if using
> > MADV_PAGEOUT on a CoW anonymous mapping will page out the page if
> > possible, even if other processes still have the same page mapped. Is
> > that correct?
> >
> > If so, that's probably bad in environments where many processes (with
> > different privileges) are forked from a single zygote process (like
> > Android and Chrome), I think? If you accidentally call it on a CoW
> > anonymous mapping with shared pages, you'll degrade the performance of
> > other processes. And if an attacker does it intentionally, they could
> > use that to aid with exploiting race conditions or weird
> > microarchitectural stuff (e.g. the new https://lviattack.eu/lvi.pdf
> > talks about "the assumption that attackers can provoke page faults or
> > microcode assists for (arbitrary) load operations in the victim
> > domain").
> >
> > Should madvise_cold_or_pageout_pte_range() maybe refuse to operate on
> > pages with mapcount>1, or something like that? Or does it already do
> > that, and I just missed the check?
>
> I have brought up side channel attacks earlier [1] but only in the
> context of shared page cache pages. I didn't really consider shared
> anonymous pages to be a real problem. I was under the impression that CoW
> pages shouldn't be a real problem because any security-sensitive
> applications shouldn't allow untrusted code to be forked and CoW
> anything really important. I believe we have made this assumption
> in other places - IIRC on gup with FOLL_FORCE but I admit I have
> very happily forgotten most details.

Android has a "zygote" process that starts up the whole Java
environment with a bunch of libraries before entering a loop that
fork()s off a child every time the user wants to launch an app. So all
the apps, and even browser renderer processes, on the device share
many CoW VMAs. See
<https://developer.android.com/topic/performance/memory-overview#SharingRAM>.

I think Chrome on Linux desktop systems also forks off renderers from
a common zygote process after initializing libraries and so on. See
<https://chromium.googlesource.com/chromium/src.git/+/master/docs/linux/zygote.md>.
(But they use a relatively strict seccomp sandbox that e.g. doesn't
permit MADV_PAGEOUT.)
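
For reference, here is a rough sketch of the scenario I have in mind: a
zygote-like parent touches an anonymous private page, fork()s, and the
child then calls MADV_PAGEOUT on the still-shared CoW page while the
parent checks residency with mincore(). The helper names
(page_resident(), cow_pageout_probe()) are made up for illustration,
and the fallback value 21 for MADV_PAGEOUT is an assumption based on
asm-generic/mman-common.h for the case where the libc headers don't
define it yet. Whether the page actually goes away depends on the
kernel, on swap being available, and on reclaim succeeding, so treat
this as a probe, not a reliable reproducer.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* assumption: value from asm-generic/mman-common.h */
#endif

/* Returns 1 if the page at addr is resident, 0 if not, -1 on error. */
static int page_resident(void *addr, size_t page_size)
{
	unsigned char vec = 0;

	if (mincore(addr, page_size, &vec) != 0)
		return -1;
	return vec & 1;
}

/*
 * Hypothetical probe: the parent plays the zygote, the child plays an
 * app process that calls MADV_PAGEOUT on a page it still shares CoW
 * with the parent.
 *
 * *before and *after receive the parent's view of the page's residency
 * (1, 0, or -1 on mincore() failure); *child_ok is 1 if the child's
 * madvise() call succeeded. Returns 0 on success, -1 on setup failure.
 */
static int cow_pageout_probe(int *before, int *after, int *child_ok)
{
	long ps = sysconf(_SC_PAGESIZE);
	size_t page_size;
	unsigned char *p;
	pid_t pid;
	int status;

	if (ps <= 0)
		return -1;
	page_size = (size_t)ps;

	p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Fault the page in before fork() so both processes map it. */
	memset(p, 0x5a, page_size);
	*before = page_resident(p, page_size);

	pid = fork();
	if (pid < 0) {
		munmap(p, page_size);
		return -1;
	}
	if (pid == 0) {
		/* Child: ask the kernel to reclaim the shared CoW page. */
		int rc = madvise(p, page_size, MADV_PAGEOUT);

		_exit(rc == 0 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) != pid) {
		munmap(p, page_size);
		return -1;
	}
	*child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;

	/* Parent: did the child's madvise() evict our copy of the page? */
	*after = page_resident(p, page_size);

	munmap(p, page_size);
	return 0;
}
```

If *before is 1 and *after is 0 after a successful child madvise(),
then the child managed to page out memory that the parent still had
mapped, which is the behavior I'm asking about.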