Re: [PATCH 0/5] mm: vmscan: fix kswapd writeback regression
From: Hillf Danton
Date: Thu Jan 26 2017 - 00:44:55 EST
On January 24, 2017 2:17 AM Johannes Weiner wrote:
> We noticed a regression on multiple hadoop workloads when moving from
> 3.10 to 4.0 and 4.6, which involves kswapd getting tangled up in page
> writeout, causing direct reclaim herds that also don't make progress.
> I tracked it down to the thrash avoidance efforts after 3.10 that make
> the kernel better at keeping use-once cache and use-many cache sorted
> on the inactive and active list, with more aggressive protection of
> the active list as long as there is inactive cache. Unfortunately, our
> workload's use-once cache is mostly from streaming writes. Waiting for
> writes to avoid potential reloads in the future is not a good tradeoff.
>
> These patches do the following:
>
> 1. Wake the flushers when kswapd sees a lump of dirty pages. It's
>    possible to be below the dirty background limit and still have
>    cache velocity push them through the LRU. So start a-flushin'.
>
> 2. Let kswapd only write pages that have been rotated twice. This
>    makes sure we really tried to get all the clean pages on the
>    inactive list before resorting to horrible LRU-order writeback.
>
> 3. Move rotating dirty pages off the inactive list. Instead of
>    churning or waiting on page writeback, we'll go after clean active
>    cache. This might lead to thrashing, but in this state memory
>    demand outstrips IO speed anyway, and reads are faster than writes.
>
> More details in the individual changelogs.
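
For readers skimming the thread, the three points above can be sketched as a
toy decision function. This is only an illustration of the policy as the
cover letter words it, not code from the series: struct toy_page,
toy_scan_page(), toy_should_wake_flushers() and the rotation counter are all
hypothetical names, and treating "a lump of dirty pages" as "every page in
the scanned batch is dirty" is an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical, simplified page state; this is NOT the kernel's struct page. */
struct toy_page {
	bool dirty;     /* page holds data that has not been written back */
	int rotations;  /* how often reclaim has already rotated this page */
};

/* What the toy reclaimer decides for one page at the inactive list tail. */
enum toy_action {
	TOY_RECLAIM,    /* clean page: free it right away */
	TOY_ROTATE,     /* dirty page: move it off the inactive list (point 3) */
	TOY_WRITE       /* dirty page rotated twice: kswapd writes it (point 2) */
};

/* Threshold taken from the wording "rotated twice" in the cover letter. */
#define TOY_ROTATIONS_BEFORE_WRITE 2

static enum toy_action toy_scan_page(struct toy_page *page)
{
	if (!page->dirty)
		return TOY_RECLAIM;

	if (page->rotations < TOY_ROTATIONS_BEFORE_WRITE) {
		/* Leave the write to the flushers and go look for clean cache. */
		page->rotations++;
		return TOY_ROTATE;
	}

	/* Clean pages were tried first; fall back to LRU-order writeback. */
	return TOY_WRITE;
}

/*
 * Point 1: decide whether to wake the flushers. Assumption for this sketch:
 * a "lump" means that every page in a non-empty batch is dirty.
 */
static bool toy_should_wake_flushers(const struct toy_page *batch, int nr)
{
	int i, nr_dirty = 0;

	for (i = 0; i < nr; i++)
		if (batch[i].dirty)
			nr_dirty++;

	return nr > 0 && nr_dirty == nr;
}
```

In this model a dirty page is rotated on its first two encounters and only
written on the third, while a clean page is reclaimed immediately. The real
series may well track "seen before" differently (for example with a page
flag rather than a counter); the individual changelogs are the authority.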
>
>  include/linux/mm_inline.h        |  7 ++++
>  include/linux/mmzone.h           |  2 --
>  include/linux/writeback.h        |  2 +-
>  include/trace/events/writeback.h |  2 +-
>  mm/swap.c                        |  9 ++---
>  mm/vmscan.c                      | 68 +++++++++++++++-----------------------
>  6 files changed, 41 insertions(+), 49 deletions(-)
>
Acked-by: Hillf Danton <hillf.zj@xxxxxxxxxxxxxxx>