Re: [PATCH 0/7] mm: Improve swap path scalability with batched operations

From: Minchan Kim
Date: Wed May 04 2016 - 20:09:11 EST

On Wed, May 04, 2016 at 05:25:06PM -0400, Johannes Weiner wrote:
> On Wed, May 04, 2016 at 09:49:02PM +0200, Michal Hocko wrote:
> > On Wed 04-05-16 10:13:06, Tim Chen wrote:
> > In order this to work other quite intrusive changes to the current
> > reclaim decisions would have to be made though. This is what I tried to
> > say. Look at get_scan_count() on how we are making many steps to ignore
> > swappiness or prefer the page cache. Even when we make swapout scale it
> > won't help much if we do not swap out that often. That's why I claim
> > that we really should think more long term and maybe reconsider these
> > decisions which were based on the rotating rust for the swap devices.
> While I agree that such balancing rework is necessary to make swap
> perform optimally, I don't see why this would be a dependency for
> making the mechanical swapout paths a lot leaner.

I agree.

> I'm actually working on improving the LRU balancing decisions for fast
> random IO swap devices, and hope to have something to submit soon.

Good to hear! I really have an interest about that because we already
have used such fast random IO swap device. zRAM although I'm not
sure it's really fast as much as such NVRAM. Anyway, it would be very
benefit for zram.

> > > I understand that the patch set is a little large. Any better
> > > ideas for achieving similar ends will be appreciated.  I put
> > > out these patches in the hope that it will spur solutions
> > > to improve swap.
> > >
> > > Perhaps the first two patches to make shrink_page_list into
> > > smaller components can be considered first, as a first step 
> > > to make any changes to the reclaim code easier.
> It makes sense that we need to batch swap allocation and swap cache
> operations. Unfortunately, the patches as they stand turn
> shrink_page_list() into an unreadable mess. This would need better
> refactoring before considering them for upstream merging. The swap
> allocation batching should not obfuscate the main sequence of events
> that is happening for both file-backed and anonymous pages.
> It'd also be great if the remove_mapping() batching could be done
> universally for all pages, given that in many cases file pages from
> the same inode also cluster together on the LRU.
> I realize this is fairly vague feedback; I'll try to take a closer
> look at the patches. But I do think this work is going in the right
> direction and there is plenty of justification for making these paths
> more efficient.