Re: [Patch] Call cond_resched() at bottom of main look in balance_pgdat()

From: Minchan Kim
Date: Mon Jun 21 2010 - 22:46:06 EST


On Tue, Jun 22, 2010 at 11:24 AM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> > =============================================================
>> > Subject: [PATCH] Call cond_resched() at bottom of main look in balance_pgdat()
>> > From: Larry Woodman <lwoodman@xxxxxxxxxx>
>> >
>> > We are seeing a problem where kswapd gets stuck and hogs the CPU on a
>> > small single CPU system when an OOM kill should occur.  When this
>> > happens swap space has been exhausted and the pagecache has been shrunk
>> > to zero.  Once kswapd gets the CPU it never gives it up because at least
>> > one zone is below high.  Adding a single cond_resched() at the end of
>> > the main loop in balance_pgdat() fixes the problem by allowing the
>> > watchdog and tasks to run and eventually do an OOM kill which frees up
>> > the resources.
>> >
>> > kosaki note: This seems to be a regression caused by commit bb3ab59683
>> > (vmscan: stop kswapd waiting on congestion when the min watermark is
>> >  not being met)
>> >
>> > Signed-off-by: Larry Woodman <lwoodman@xxxxxxxxxx>
>> > Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
>> > ---
>> >  mm/vmscan.c |    1 +
>> >  1 files changed, 1 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 9c7e57c..c5c46b7 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -2182,6 +2182,7 @@ loop_again:
>> >                   */
>> >                  if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX)
>> >                          break;
>> > +                cond_resched();
>> >          }
>> >  out:
>> >          /*
>> > --
>> > 1.6.5.2
>>
>> The goal of Kosaki's patch is that kswapd doesn't yield the CPU while a zone
>> is below its min watermark, to avoid failing atomic allocations.
>> But this patch could yield kswapd's time slice at any time.
>> Doesn't the patch break your goal in bb3ab59683?
>
> No, it doesn't break it.
>
> Typically, kswapd periodically calls shrink_page_list(), and that calls
> cond_resched() even in the bb3ab59683 case.

Hmm. If so, is bb3ab59683 really effective?

The goal of bb3ab59683 is to prevent kswapd from yielding the CPU when
free < min_watermark. But shrink_page_list() can yield the CPU from kswapd
at any time, so I am not sure what the benefit of bb3ab59683 is.
Do you have any numbers on its effectiveness?
(Of course, I know that's very hard to measure. Just out of curiosity.)
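
To make the question concrete, here is a rough user-space model of the
loop under discussion. It is only a sketch, not kernel code: zone_model,
reclaim_pass() and balance_model() are hypothetical names, sched_yield()
stands in for cond_resched(), and the value of SWAP_CLUSTER_MAX is
illustrative. It also assumes that the scheduling point inside
shrink_page_list() is reached only when there are pages left to reclaim,
which would explain why kswapd never yields once swap and pagecache are
exhausted.

```c
#include <assert.h>
#include <sched.h>    /* sched_yield(), standing in for cond_resched() */
#include <stdbool.h>

#define SWAP_CLUSTER_MAX 32UL  /* illustrative value for this model */

/* Hypothetical, heavily simplified view of one zone. */
struct zone_model {
	unsigned long free_pages;
	unsigned long high_wmark;
	unsigned long reclaimable;  /* pages that reclaim can still free */
};

/* Stand-in for one reclaim pass: frees at most 'batch' pages. */
static unsigned long reclaim_pass(struct zone_model *z, unsigned long batch)
{
	unsigned long n = z->reclaimable < batch ? z->reclaimable : batch;

	z->reclaimable -= n;
	z->free_pages += n;
	return n;
}

/*
 * Model of the balance loop. Returns how many times the loop yielded.
 * max_passes bounds the loop so the model terminates even when the
 * zone can never reach its high watermark.
 */
static unsigned long balance_model(struct zone_model *z, int max_passes,
				   bool yield_at_bottom)
{
	unsigned long yields = 0;

	assert(max_passes > 0);
	for (int pass = 0; pass < max_passes; pass++) {
		if (z->free_pages >= z->high_wmark)
			break;  /* zone is balanced */

		/* Assumed: the inner scheduling point needs pages to scan. */
		if (reclaim_pass(z, SWAP_CLUSTER_MAX) > 0) {
			sched_yield();
			yields++;
		}

		/* What the patch adds: an unconditional scheduling point. */
		if (yield_at_bottom) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

With reclaimable == 0 and free_pages below high_wmark, the model never
yields unless yield_at_bottom is set. That is the stuck case Larry
describes. With reclaimable pages available, it already yields on every
pass, which is the case Kosaki describes.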

As a matter of fact, when I saw Larry's patch, I thought it would
be better to revert bb3ab59683. Then congestion_wait() could yield the CPU
to other processes.

What do you think?

--
Kind regards,
Minchan Kim