Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?

From: Michal Hocko
Date: Tue Mar 17 2020 - 03:12:45 EST


On Mon 16-03-20 18:43:40, Minchan Kim wrote:
> On Mon, Mar 16, 2020 at 10:20:52AM +0100, Michal Hocko wrote:
> > On Fri 13-03-20 13:59:41, Minchan Kim wrote:
> > > On Fri, Mar 13, 2020 at 09:05:46AM +0100, Michal Hocko wrote:
> > > > On Thu 12-03-20 19:08:51, Minchan Kim wrote:
> > > > > On Thu, Mar 12, 2020 at 09:41:55PM +0100, Michal Hocko wrote:
> > > > > > On Thu 12-03-20 13:16:02, Minchan Kim wrote:
> > > > > > > On Thu, Mar 12, 2020 at 09:22:48AM +0100, Michal Hocko wrote:
> > > > > > [...]
> > > > > > > > From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001
> > > > > > > > From: Michal Hocko <mhocko@xxxxxxxx>
> > > > > > > > Date: Thu, 12 Mar 2020 09:04:35 +0100
> > > > > > > > Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages
> > > > > > > >
> > > > > > > > Jann has brought up a very interesting point [1]. While shared pages are
> > > > > > > > excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed
> > > > > > > > that way. This can lead to all sorts of hard to debug problems. E.g.
> > > > > > > > performance problems outlined by Daniel [2]. There are runtime
> > > > > > > > environments where there is substantial memory shared among security
> > > > > > > > domains via CoW memory and an easy way to reclaim that memory, which
> > > > > > > > MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation
> > > > > > > > for the parent process, which might be more privileged, or even open
> > > > > > > > side channel attacks. The feasibility of the latter is not really clear
> > > > > > >
> > > > > > > I am not sure it's a good idea to mention the performance angle because
> > > > > > > it's rather arguable. You and Johannes already pointed that out when I submitted
> > > > > > > an early draft which had logic to filter out shared pages for performance
> > > > > > > reasons. You suggested that shared pages have a higher chance of being touched,
> > > > > > > so if they are really hot pages, they would stay in memory. I agree.
> > > > > >
> > > > > > Yes, the hot memory is likely to be referenced but the point was an
> > > > > > unexpected latency because of the major fault. I have to say that I have
> > > > >
> > > > > I don't understand your point here. If it's likely to be referenced
> > > > > among several processes, it doesn't have the major fault latency.
> > > > > What's your point here?
> > > >
> > > > a) the particular CoW page might be cold enough to be reclaimed and b)
> > >
> > > If it is, that means it's *cold*, so it's really worth reclaiming.
> > >
> > > > nothing really prevents MADV_PAGEOUT from being called faster than the
> > > > reference bit is set again.
> > >
> > > Yep, that's undesirable. I should admit it was not intended when I implemented
> > > PAGEOUT. The thing is that page_check_references clears the access bit of the pte
> > > for every process sharing the page, so calling MADV_PAGEOUT twice from one process
> > > could evict the page. That's the real bug.
> >
> > I do not really think this is a bug. This is a side effect of the
> > reclaim process and we do not really want MADV_{PAGEOUT,COLD} to behave
>
> No, that's the bug since we didn't consider the side effect.
>
> > differently here because then the behavior would be even harder to
>
> No, I do want a difference because it's a per-process hint. IOW,
> what the caller knows applies only to its own context, not others', so it
> shouldn't clear others' ptes. That is the difference between LRU aging and the hint.

Just to be clear, are you really suggesting special-casing
page_check_references for the madvise path?
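
For reference, the scenario under discussion can be sketched from userspace
roughly as follows. This is only an illustration, not code from this thread:
the helper name resident_after_child_pageout() is made up, the fallback value
for MADV_PAGEOUT assumes the Linux 5.4+ uapi headers, and what mincore()
reports afterwards depends on the kernel version, on whether swap is
configured, and on how CoW-shared pages are treated by the madvise path.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* assumed value from the Linux 5.4+ uapi headers */
#endif

/*
 * Hypothetical helper: the parent faults in npages of private anonymous
 * memory, a forked child (which shares those pages via CoW) calls
 * MADV_PAGEOUT on the range twice, and the parent then counts how many
 * pages mincore() still reports as resident.
 * Returns that count, or -1 on any error (including an unsupported hint).
 */
long resident_after_child_pageout(size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    if (npages == 0 || page <= 0)
        return -1;

    size_t len = npages * (size_t)page;
    unsigned char *vec = malloc(npages);
    if (!vec)
        return -1;

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        free(vec);
        return -1;
    }
    memset(buf, 0xaa, len); /* fault the pages in before forking */

    long resident = -1;
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the range is CoW-shared with the parent at this point. */
        int rc = 0;
        for (int i = 0; i < 2; i++)
            if (madvise(buf, len, MADV_PAGEOUT) != 0)
                rc = 1;
        _exit(rc);
    }
    if (pid > 0) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
            mincore(buf, len, vec) == 0) {
            resident = 0;
            for (size_t i = 0; i < npages; i++)
                resident += vec[i] & 1; /* bit 0: page reported resident */
            assert(resident <= (long)npages);
        }
    }

    munmap(buf, len);
    free(vec);
    return resident;
}
```

Calling resident_after_child_pageout(16) and comparing the result with 16
gives a rough idea of whether the child's hint affected pages the parent
still maps. A result of -1 only means the experiment could not be run.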

--
Michal Hocko
SUSE Labs