Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs)

From: Colin Cross
Date: Mon Apr 09 2012 - 21:22:00 EST


On Mon, Apr 9, 2012 at 5:32 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> On Mon, 9 Apr 2012, Colin Cross wrote:
>
>> The point of the lowmem_deathpending patch was to avoid a stutter
>> where the cpu would spend its time looping through the tasks due to
>> repeated calls to lowmem_shrink instead of processing the kill signal
>> to the selected thread.
>
> What did you do to avoid this without CONFIG_PROFILING?
>
>> With this patch, it will still loop through
>> tasks until it finds the one that was previously killed and then
>> abort.  It's possible that the improvements Anton made to the task
>> loop reduce the performance impact enough that this whole mess could
>> just be dropped (by reverting 1eda516, e5d7965, and 4755b72).
>>
>
> I don't understand how calling shrink_slab() from direct reclaim or using
> drop_caches manually taking slightly longer because it has to iterate the
> tasklist to the point of the killed thread will significantly stall the
> thread from exiting.

Before Anton's fix, iterating the tasklist involved taking every task
lock, which probably made it very expensive. I tried a quick test
where I deliberately limited memory to the point that it was
triggering lowmemorykiller during boot, and it triggered about 5000
times taking on the order of 50ms total for all 5000 calls. It was
about the same with your patch applied.
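
To make the behaviour under discussion concrete, here is a rough
userspace sketch of a scan that aborts as soon as it reaches a task that
was already killed. It is only an illustration under simplifying
assumptions, not the lowmemorykiller code: struct fake_task, its fields,
and pick_victim() are hypothetical names, and the real driver also weighs
task size when choosing a victim.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct fake_task {
    int pid;
    int oom_adj;   /* higher value = better kill candidate (assumption) */
    bool dying;    /* already selected and signalled by an earlier call */
};

/*
 * Scan tasks[0..n) for the best victim at or above min_adj. If a
 * previously killed task is still present, stop at once and return NULL
 * so the caller does not pick a second victim while the first is still
 * exiting. *visited reports how many entries were examined.
 */
const struct fake_task *pick_victim(const struct fake_task *tasks, size_t n,
                                    int min_adj, size_t *visited)
{
    const struct fake_task *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (tasks[i].dying) {
            *visited = i + 1;   /* cost is bounded by the dying task's position */
            return NULL;
        }
        if (tasks[i].oom_adj < min_adj)
            continue;
        if (best == NULL || tasks[i].oom_adj > best->oom_adj)
            best = &tasks[i];
    }
    *visited = n;
    return best;
}
```

The point of the sketch is that every call made while a kill is pending
still walks the list up to the dying task, so the cost per call depends
on how cheap each iteration is.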

> Much more likely is the killed thread cannot exit because you've killed it
> in a lowmem situation without giving it access to memory reserves so that
> it may exit quickly as my patch does.  That has a higher likelihood of
> stalling the exit than doing for_each_process().