Re: OOM: Better, but still there on 4.9

From: Chris Mason
Date: Fri Dec 16 2016 - 17:47:46 EST


On 12/16/2016 05:14 PM, Michal Hocko wrote:
> On Fri 16-12-16 13:15:18, Chris Mason wrote:
>> On 12/16/2016 02:39 AM, Michal Hocko wrote:
> [...]
>>> I believe the right way to go around this is to pursue what I've started
>>> in [1]. I will try to prepare something for testing today for you. Stay
>>> tuned. But I would be really happy if somebody from the btrfs camp could
>>> check the NOFS aspect of this allocation. We have already seen
>>> allocation stalls from this path quite recently
>>
>> Just double checking, are you asking why we're using GFP_NOFS to avoid going
>> into btrfs from the btrfs writepages call, or are you asking why we aren't
>> allowing highmem?
>
> I am more interested in the NOFS part. Why cannot this be a full
> GFP_KERNEL context? What kind of locks we would lock up when recursing
> to the fs via slab shrinkers?


Since this is our writepages call, any jump into direct reclaim would go to writepage. That would end up calling the same code to read metadata blocks, which would do a GFP_KERNEL allocation and end up back in writepage again.
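
A rough userspace sketch of that recursion, as a toy model only. Every name here (toy_writepage, toy_direct_reclaim, the TOY_* flags, the depth cap) is hypothetical and made up for illustration; none of it is btrfs or mm code, and the flag values are not the kernel's. It only shows why an allocation that lets reclaim re-enter the filesystem keeps nesting, while a NOFS-style allocation stops after one level.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; NOT the kernel's gfp_t values. */
#define TOY_ALLOW_FS   0x1u
#define TOY_GFP_KERNEL TOY_ALLOW_FS   /* reclaim may call into the fs */
#define TOY_GFP_NOFS   0x0u           /* reclaim must stay out of the fs */

/* Artificial cap so the toy model terminates instead of recursing forever. */
#define TOY_MAX_DEPTH 8

struct toy_state {
    int depth;          /* current writepage nesting level */
    int max_depth;      /* deepest nesting level seen so far */
    bool memory_tight;  /* if true, every allocation enters direct reclaim */
};

static void toy_writepage(struct toy_state *s, unsigned int gfp);

/* Direct reclaim re-enters the filesystem only if the flags permit it. */
static void toy_direct_reclaim(struct toy_state *s, unsigned int gfp)
{
    if (gfp & TOY_ALLOW_FS)
        toy_writepage(s, gfp);
}

/* Stand-in for an allocation made under memory pressure. */
static void toy_alloc(struct toy_state *s, unsigned int gfp)
{
    if (s->memory_tight)
        toy_direct_reclaim(s, gfp);
}

/* Stand-in for writepage: reading metadata blocks needs an allocation. */
static void toy_writepage(struct toy_state *s, unsigned int gfp)
{
    if (s->depth >= TOY_MAX_DEPTH)
        return;
    s->depth++;
    if (s->depth > s->max_depth)
        s->max_depth = s->depth;
    toy_alloc(s, gfp);
    s->depth--;
}

/* Run one top-level writepage and report how deep the nesting went. */
int toy_max_nesting(unsigned int gfp)
{
    struct toy_state s = { 0, 0, true };
    toy_writepage(&s, gfp);
    return s.max_depth;
}
```

In this model toy_max_nesting(TOY_GFP_NOFS) stays at a single level, while toy_max_nesting(TOY_GFP_KERNEL) only stops at the artificial cap.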

We'd also have issues with blowing through transaction reservations since the writepage recursion would have to nest into the running transaction.

-chris