Re: upcoming kerneloops.org item: get_page_from_freelist

From: Theodore Tso
Date: Thu Jun 25 2009 - 15:38:39 EST


On Thu, Jun 25, 2009 at 11:51:40AM -0700, David Rientjes wrote:
>
> There's no way to indicate that the page allocator should "try really
> hard" because the VM implementation should already do that for every
> allocation before failure. A subsequent attempt after the first failure
> could try GFP_ATOMIC, though, which allows allocation beyond the minimum
> watermark and is more likely to succeed than GFP_NOFS. Such an
> allocation should be short-lived and not rely on additional memory to free
> to avoid depleting most of the memory reserves available to atomic
> allocations, direct reclaim, and oom killed tasks.

Hmm, is there a reason to avoid using GFP_ATOMIC on the first
allocation, and to add GFP_ATOMIC only after the first failure?

In the case of ext4, after we finish the commit, we will release quite
a bit of memory to the system, so using GFP_ATOMIC to complete the
commit is a good thing. Of course, preallocating some of these data structures
before the commit would be better, since we can return ENOMEM to
userspace applications when they are calling a system call.
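
To make the preallocation idea concrete, here is a rough sketch. It is
a userspace model and not ext4 or jbd2 code: struct txn and the
txn_prepare/txn_commit/txn_release names are made up for illustration,
and malloc() stands in for whatever allocator the real code would use.
The point is only that the step which may fail runs where an error
can still be returned to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transaction type; not an ext4/jbd2 structure. */
struct txn {
    void  *commit_buf;   /* memory the commit step will need */
    size_t commit_len;
};

/*
 * Prepare step, run from the "system call" path: reserve everything
 * the commit will need. Failure here is easy to report as -ENOMEM.
 */
int txn_prepare(struct txn *t, size_t len)
{
    if (len == 0)
        return -EINVAL;
    t->commit_buf = malloc(len);
    if (t->commit_buf == NULL)
        return -ENOMEM;
    t->commit_len = len;
    return 0;
}

/*
 * Commit step: touches only memory reserved by txn_prepare(), so it
 * has no allocation failure path of its own.
 */
void txn_commit(struct txn *t, const void *data)
{
    assert(t->commit_buf != NULL);
    memcpy(t->commit_buf, data, t->commit_len);
}

/* Give the reserved memory back once the commit is done. */
void txn_release(struct txn *t)
{
    free(t->commit_buf);
    t->commit_buf = NULL;
    t->commit_len = 0;
}
```

A caller would check the return value of txn_prepare() and pass
-ENOMEM back up before any commit work has started.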

> > Hmm.... it may be possible to do the memory allocation in advance,
> > before we get to the commit, and make it be easier to fail and return
> > ENOMEM to userspace --- which I bet most applications won't handle
> > gracefully, either (a) not checking error codes and losing data, or
> > (b) dying on the spot, so it would effectively be an OOM kill.
>
> If this would still be a GFP_NOFS allocation, the oom killer will not be
> triggered (it only gets called when __GFP_FS is set to avoid killing tasks
> when reclaim was not possible).

I didn't mean that it would really be an OOM kill --- just that many
applications don't have very sophisticated error checking themselves,
and will either not do error checking at all, or if they get an ENOMEM
from a system call, will probably just immediately do something like
'perror("Yikes!"); exit(1);' --- so it might as _well_ be an OOM kill.

On the other hand, by returning an ENOMEM to userspace, we at least
allow the competent application writers to try to do something
intelligent (cynical kernel programmers who don't believe there are
many such, let's leave that aside for the bar room discussion :-), and
if you're out of memory, you're out of memory, and whether programs
die from an OOM or an untested NULL dereference in an error path in
the application, or an explicit 'perror("Yikes!"); exit(1);', doesn't
much matter.
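
For what "something intelligent" might look like on the application
side, here is a minimal sketch. write_with_retry() is a hypothetical
helper, and the retry count and one-second pause are arbitrary
assumptions rather than recommendations. It retries a few times on
ENOMEM and otherwise hands the error back to the caller instead of
exiting on the spot.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: call write(), retrying up to max_tries times
 * when it fails with ENOMEM. Returns the byte count on success, or -1
 * with errno set so the caller can decide what to do (save state,
 * free caches, report the error) rather than dying immediately.
 */
ssize_t write_with_retry(int fd, const void *buf, size_t len, int max_tries)
{
    int saved = EINVAL;   /* reported if max_tries <= 0 */

    assert(buf != NULL || len == 0);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        ssize_t n = write(fd, buf, len);
        if (n >= 0)
            return n;
        saved = errno;
        if (saved != ENOMEM)
            break;            /* some other error: do not retry */
        if (attempt + 1 < max_tries)
            sleep(1);         /* give the system a moment to free memory */
    }
    errno = saved;
    return -1;
}
```

Whether retrying helps at all depends on the workload; the useful
part is that the caller gets a chance to react to the error.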

- Ted