Re: [PATCH v2 4/4] mm/vmscan: Don't mess with pgdat->flags in memcg reclaim.

From: Shakeel Butt
Date: Fri Apr 06 2018 - 10:15:50 EST


On Fri, Apr 6, 2018 at 4:44 AM, Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx> wrote:
> On 04/06/2018 05:13 AM, Shakeel Butt wrote:
>> On Fri, Mar 23, 2018 at 8:20 AM, Andrey Ryabinin
>> <aryabinin@xxxxxxxxxxxxx> wrote:
>>> memcg reclaim may alter pgdat->flags based on the state of LRU lists
>>> in cgroup and its children. PGDAT_WRITEBACK may force kswapd to sleep
>>> in congested_wait(), PGDAT_DIRTY may force kswapd to write back filesystem
>>> pages. But the worst here is PGDAT_CONGESTED, since it may force all
>>> direct reclaims to stall in wait_iff_congested(). Note that only kswapd
>>> has the power to clear any of these bits. This might just never happen if
>>> cgroup limits are configured that way. So all direct reclaims will stall
>>> as long as we have some congested bdi in the system.
>>>
>>> Leave all pgdat->flags manipulations to kswapd. kswapd scans the whole
>>> pgdat, only kswapd can clear pgdat->flags once the node is balanced, thus
>>> it's reasonable to leave all decisions about node state to kswapd.
>>
>> What about global reclaimers? Is the assumption that when global
>> reclaimers hit such condition, kswapd will be running and correctly
>> set PGDAT_CONGESTED?
>>
>
> The reason I moved this under if(current_is_kswapd()) is because only kswapd
> can clear these flags. I'm less worried about the case when PGDAT_CONGESTED is falsely
> not set, and more worried about the case when it is falsely set. If a direct reclaimer sets
> PGDAT_CONGESTED, do we have a guarantee that, after the congestion problem is sorted, kswapd
> will be woken up and clear the flag? It seems like there is no such guarantee.
> E.g. direct reclaimers may eventually balance pgdat and kswapd simply won't wake up
> (see wakeup_kswapd()).
>
>
Thanks for the explanation, I think it should be in the commit message.
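
To make the scenario above concrete, here is a minimal userspace sketch of
the stuck-flag case. It is only a toy model, not kernel code: struct
node_model, the gate_on_kswapd switch and all helper names are hypothetical,
and the early return in try_wake_kswapd() is an assumption based on the
wakeup_kswapd() behaviour described in the quoted mail.

```c
#include <stdbool.h>

/* Toy userspace model. Names echo the kernel's, but nothing here is kernel code. */
enum { PGDAT_CONGESTED = 1u << 0 };

struct node_model {
	unsigned int flags;	/* stands in for pgdat->flags */
	bool balanced;		/* true once the node no longer needs reclaim */
};

/*
 * A reclaim pass that observed congestion. With gate_on_kswapd set, only
 * kswapd may set the flag (the behaviour the patch proposes); without it,
 * any reclaimer may set it (the behaviour being replaced).
 */
static void reclaim_saw_congestion(struct node_model *n, bool is_kswapd,
				   bool gate_on_kswapd)
{
	if (!gate_on_kswapd || is_kswapd)
		n->flags |= PGDAT_CONGESTED;
}

/*
 * Assumed wakeup rule: kswapd is not woken for a balanced node, and the
 * flags are cleared only when kswapd actually runs and balances the node.
 */
static bool try_wake_kswapd(struct node_model *n)
{
	if (n->balanced)
		return false;	/* kswapd stays asleep, flags are left untouched */
	n->balanced = true;
	n->flags = 0;
	return true;
}

/* Direct reclaimers stall while the congested flag is set. */
static bool direct_reclaim_would_stall(const struct node_model *n)
{
	return (n->flags & PGDAT_CONGESTED) != 0;
}

/*
 * Scenario from the mail: a direct reclaimer sees congestion, direct
 * reclaim then balances the node on its own, and kswapd is never woken.
 * Returns true if later direct reclaimers would still stall.
 */
static bool flag_stuck_after_direct_reclaim(bool gate_on_kswapd)
{
	struct node_model n = { .flags = 0, .balanced = false };

	reclaim_saw_congestion(&n, false, gate_on_kswapd);
	n.balanced = true;	/* balanced by direct reclaim, not by kswapd */
	try_wake_kswapd(&n);	/* no-op: the node is already balanced */
	return direct_reclaim_would_stall(&n);
}
```

In this model, flag_stuck_after_direct_reclaim(false) reports a stale
PGDAT_CONGESTED that nobody clears, while flag_stuck_after_direct_reclaim(true)
does not, because the direct reclaimer never sets the flag in the first place.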