Re: [Patch] Call cond_resched() at bottom of main loop in balance_pgdat()

From: Minchan Kim
Date: Tue Jun 22 2010 - 00:29:34 EST


On Tue, Jun 22, 2010 at 12:23 PM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> >> Kosaki's patch's goal is that kswap doesn't yield cpu if the zone doesn't meet its
>> >> min watermark to avoid failing atomic allocation.
>> >> But this patch could yield kswapd's time slice at any time.
>> >> Doesn't the patch break your goal in bb3ab59683?
>> >
>> > No. it don't break.
>> >
>> > Typically, kswapd periodically call shrink_page_list() and it call
>> > cond_resched() even if bb3ab59683 case.
>>
>> Hmm. If it is, bb3ab59683 is effective really?
>>
>> The bb3ab59683's goal is prevent CPU yield in case of free < min_watermark.
>> But shrink_page_list can yield cpu from kswapd at any time.
>> So I am not sure what is bb3ab59683's benefit.
>> Did you have any number about bb3ab59683's effectiveness?
>> (Of course, I know it's very hard. Just out of curiosity)
>>
>> As a matter of fact, when I saw this Larry's patch, I thought it would
>> be better to revert bb3ab59683. Then congestion_wait could yield CPU
>> to other process.
>>
>> What do you think about?
>
> No. The goal is not prevent CPU yield. The goal is avoid unnecessary
> _long_ sleep (i.e. congestion_wait(BLK_RW_ASYNC, HZ/10)).

That is what I meant.

> Anyway we can't refuse CPU yield on UP. it lead to hangup ;)
>
> What do you mean the number? If it mean how much reduce congestion_wait(),
> it was posted a lot of time. If it mean how much reduce page allocation
> failure bug report, I think it has been observable reduced since half
> years ago.

I meant the second.
Hmm. I doubt the reduction can be attributed to it, since at that time
Mel had posted many patches to reduce page allocation failures.
bb3ab59683 was just one of them.

>
> If you have specific worried concern, can you please share it?
>

My concern is that I don't want to add a new band-aid on top of an
uncertain feature to fix a regression caused by that uncertain feature.
(Sorry for calling Larry's patch a band-aid.)
If we revert bb3ab59683, congestion_wait() in balance_pgdat() could
again let kswapd yield the CPU.

If you insist that bb3ab59683 is effective and that this was proven in
the past, I am not against it.

And if it is a regression from bb3ab59683, doesn't the following make sense?
It could restore the old behavior.

---
	/*
	 * OK, kswapd is getting into trouble.  Take a nap, then take
	 * another pass across the zones.
	 */
	if (total_scanned && (priority < DEF_PRIORITY - 2)) {
		if (has_under_min_watermark_zone) {
			count_vm_event(KSWAPD_SKIP_CONGESTION_WAIT);
			/*
			 * Allow a CPU yield so that the watchdog or an
			 * OOMed task can run.
			 */
			cond_resched();
		} else {
			congestion_wait(BLK_RW_ASYNC, HZ/10);
		}
	}
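
For illustration only, the branch above can be modeled as a small
userspace function that reports which action kswapd would take. This is
a rough sketch, not kernel code: the enum and the function name
kswapd_nap_action() are hypothetical, DEF_PRIORITY is assumed to be 12
as in the kernel headers, and nothing here actually sleeps or
reschedules.

```c
#include <stdbool.h>

/* Assumed to match the kernel's definition. */
#define DEF_PRIORITY 12

/* Hypothetical labels for the three possible outcomes. */
enum kswapd_action {
	ACTION_NONE,            /* no nap at all */
	ACTION_COND_RESCHED,    /* skip the long sleep, only offer the CPU */
	ACTION_CONGESTION_WAIT  /* sleep up to HZ/10 waiting on congestion */
};

/*
 * Userspace model of the proposed branch in balance_pgdat().
 * It only classifies the decision; it performs no scheduling.
 */
enum kswapd_action kswapd_nap_action(unsigned long total_scanned,
				     int priority,
				     bool has_under_min_watermark_zone)
{
	/* The nap is considered only after scanning under high pressure. */
	if (!(total_scanned && (priority < DEF_PRIORITY - 2)))
		return ACTION_NONE;

	/* A zone under its min watermark: avoid the long sleep. */
	if (has_under_min_watermark_zone)
		return ACTION_COND_RESCHED;

	return ACTION_CONGESTION_WAIT;
}
```

With these assumptions, a call such as kswapd_nap_action(100, 9, true)
yields ACTION_COND_RESCHED, while the same call with false yields
ACTION_CONGESTION_WAIT, which is the old behavior.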


--
Kind regards,
Minchan Kim