Re: [oom]: [0/4] fix OOM deadlock running OAST

From: Andrew Morton
Date: Wed Jun 23 2004 - 18:37:19 EST


William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:
>
> On Wed, Jun 23, 2004 at 03:37:58PM -0700, Andrew Morton wrote:
> > What about zone->all_unreclaimable?
>
> It's unclear which zones must be checked for this to be of use. In
> general, suppose one determines all possible zones from which a given
> allocation may be satisfied (this still assumes out_of_memory() is
> passed a gfp_mask, and then discovers ->all_unreclaimable. It is still
> not possible to determine whether nr_swap_pages is relevant.

I don't think it _is_ relevant. Wev'e scanned the crap out of all the
eligible zones, found nothing swappable outable or otherwise reclaimable.
That's as good a definition of oom as you're likely to get. It takes care
of mlocked user memory too.

> ...
> There are larger semantic questions also. For instance, the runs that
> deadlocked were with non-overcommit enabled (that is, I left the sysctl
> /proc/sys/vm/overcommit_memory == 0 after boot). The intention was to
> provide best effort unfailability to userspace allocations by detecting
> insufficient resources up-front and returning MAP_FAILED from mmap() and
> so on. There is a current hole in this, which is that pinned kernel
> memory allocations aren't subtracted from the pool of memory considered
> to be available to userspace.

The overcommit logic doesn't need to care about pinned kernel allocations.
All it cares about is reclaimable and free pages. Its calculation of what
is reclaimable is a bit wonky because of ramfs/sysfs/ramdisk pages and
metadata, and because of mlock.


But for this problem, I do suspect that going oom if all the eligible zones
are pegged out should suffice. It is certainly accurate.

It could be that the ->all_unreclaimable logic has rotted over the ages and
may ned some attention - I haven't explicitly checked it in a year or more.
Also there's some problem at present wherein the oomkiller takes 30 or
more seconds to kick in and do its thing, which I need to look at. This
happens when there's no swap online. It might also happen when swapspace
is full - I didn't check.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/