[PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup

From: Jens Axboe
Date: Wed Sep 20 2017 - 11:33:21 EST


We've had some issues with writeback in the presence of memory reclaim
at Facebook, and this patch set attempts to fix them. The real
functional change is the last patch in the series; the earlier ones
are prep and cleanup patches.

The basic idea is that we have callers that call
wakeup_flusher_threads() with nr_pages == 0. This means 'write back
everything'. In memory reclaim situations, we can end up queuing
a TON of these writeback work items. This can cause softlockups
and further memory issues, since we allocate a huge number of
struct wb_writeback_work items to handle this writeback. Handle this
situation more gracefully.
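
To make the idea concrete, here is a minimal user-space sketch of one
way to keep repeated 'write back everything' requests from piling up:
a single flag marks a full flush as pending, and further requests are
dropped while it is set. This is an illustration only, not the code
from the patches. struct wb_state, struct wb_work, wb_init(),
wb_start_all() and wb_finish_all() are hypothetical names, and the
atomic_bool merely models the WB_start_all bit mentioned in the
changelog below.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the kernel's actual structures. */
struct wb_work {
	long nr_pages;		/* 0 means 'write back everything' */
};

struct wb_state {
	atomic_bool start_all;	/* models the WB_start_all bit */
	atomic_int queued;	/* full flushes queued, for illustration */
};

static void wb_init(struct wb_state *wb)
{
	atomic_init(&wb->start_all, false);
	atomic_init(&wb->queued, 0);
}

/* Returns true if a full flush was queued, false if it was skipped. */
static bool wb_start_all(struct wb_state *wb)
{
	struct wb_work *work;

	/* A full flush is already pending: do not queue another one. */
	if (atomic_exchange(&wb->start_all, true))
		return false;

	work = malloc(sizeof(*work));
	if (!work) {
		/* Clear the flag, or later requests would be blocked. */
		atomic_store(&wb->start_all, false);
		return false;
	}

	work->nr_pages = 0;
	atomic_fetch_add(&wb->queued, 1);
	free(work);	/* real code would hand this to the flusher */
	return true;
}

/* Called when the pending full flush has been processed. */
static void wb_finish_all(struct wb_state *wb)
{
	bool was_set = atomic_exchange(&wb->start_all, false);

	/* The flag is always expected to be set at this point. */
	assert(was_set);
	(void)was_set;
}
```

With this shape, any number of back-to-back wb_start_all() calls
allocates at most one work item until wb_finish_all() runs, so the
allocation count stays bounded no matter how often reclaim asks.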

Changes since v1:

- Rename WB_zero_pages to WB_start_all (Amir).
- Remove a test_bit() for a condition where we always expect the bit
to be set.
- Remove 'nr_pages' from the wakeup flusher threads helpers, since
everybody now passes in zero. Enables further cleanups in later
patches too (Jan).
- Fix a case where I forgot to clear WB_start_all if 'work' allocation
failed.
- Get rid of cond_resched() in the wb_do_writeback() loop.

--
Jens Axboe