Re: [PATCH] mm,vmscan: Use accurate values for zone_reclaimable() checks

From: Tejun Heo
Date: Sun Oct 25 2015 - 18:47:58 EST


Hello,

On Sun, Oct 25, 2015 at 07:52:59PM +0900, Tetsuo Handa wrote:
...
> This means that any kernel code which invokes a __GFP_WAIT allocation
> might fail to do (4) when invoked via workqueue, regardless of flags
> passed to alloc_workqueue()?

Sounds that way and yeah (3) should technically be okay and that's why
HIGHPRI was implemented the way it was at the beginning; however, in
practice, this is the first time it's noticeable in all the years. I
think it comes down to the fact that there just aren't many places
which need such looping behavior and even in those places it's often
very undesirable to busy-loop while not making forward-progress (and
if forward-progress is being made, it won't be indefinite).

> I think that inserting a short sleep into page allocator is better
> because the vmstat_update fix will not require workqueue tweaks if
> we sleep inside page allocator. Also, from the point of view of
> protecting page allocator from going unresponsive when hundreds of tasks
> started busy-waiting at __alloc_pages_slowpath() because we can observe
> that XXX value in the "MemAlloc-Info: XXX stalling task," line grows
> when we are unable to make forward progress.

This looks good to me too; however, it still needs a dedicated
workqueue with WQ_MEM_RECLAIM set. That deadlock probably is very
unlikely as the side effect of vmstat failing to execute due to worker
exhaustion is more memory reclaim but it still is theoretically
possible and it could just be that it happens at low enough frequency
that it hasn't been reported yet.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/