Re: 2.6.21-rc3-mm1

From: Mel Gorman
Date: Thu Mar 15 2007 - 06:16:47 EST


On (14/03/07 17:07), Andrew Morton didst pronounce:
> > On Wed, 14 Mar 2007 20:06:02 +0100 Mariusz Kozlowski <m.kozlowski@xxxxxxxxxx> wrote:
> > Hello,
> >
> > Today after +- 24h of uptime I found some more page allocation
> > failures ('eth1: Can't allocate skb for Rx'). You'll find more here:
> >
> > http://tuxland.pl/misc/2.6.21-rc3-mm1-page-allocation-failure.txt
> >
> > System wasn't doing anything unusual, as usual ;-) X, some p2p
> > software, firefox+flash playing music.
> >
>
> Do other kernels do this, or is 2.6.21-rc3-mm1 worse?
>
> It is of course a non-fatal problem and will inevitably happen sometimes,
> but we would like the VM to be able to minimise the occurrence of this
> problem.
>

I'm looking at this now. In the vanilla allocator, min_free_kbytes has the
effect of keeping the largest blocks free for as long as possible. This means
that if high-order allocations are rare, they'll tend to succeed up to a point.
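
As a point of reference, here is a simplified, standalone sketch of the
order-aware watermark check the vanilla allocator performs (modelled loosely
on zone_watermark_ok(); the struct and helper below are made up for
illustration and are not the kernel's code):

#include <stdbool.h>

#define MAX_ORDER 11

/* Illustrative stand-in for a zone's per-order free lists; this is not
 * the kernel's struct zone, just enough to show the calculation. */
struct fake_zone {
	unsigned long nr_free[MAX_ORDER];	/* free blocks at each order */
};

/*
 * Roughly how the vanilla allocator decides whether an order-'order'
 * allocation may go ahead.  Beyond checking total free pages against
 * the watermark, it discounts the pages tied up in blocks smaller than
 * the requested order and halves the requirement at each step, which is
 * what keeps a reasonable number of large blocks free.
 */
static bool watermark_ok(struct fake_zone *z, int order, long min)
{
	long free_pages = 0;
	int o;

	for (o = 0; o < MAX_ORDER; o++)
		free_pages += z->nr_free[o] << o;

	if (free_pages <= min)
		return false;

	for (o = 0; o < order; o++) {
		/* Blocks below the requested order are of no use here */
		free_pages -= z->nr_free[o] << o;
		/* Demand progressively fewer pages at higher orders */
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}

The higher min_free_kbytes (and hence 'min') is, the more free memory has to
remain in blocks at or above the requested order, which is what keeps the
largest blocks free for longer.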

In the case of the earlier report of HID failing an order-5 allocation, the
problem was that we didn't reclaim for long enough: with an order that high,
reclaim would need to run for quite a long time before an order-5 block came
free. That is what Andy's lumpy-reclaim patches address - reclaiming
intelligently when we know it has a chance of succeeding instead of giving up.
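
To illustrate the idea behind those patches (a sketch only; the types and
helpers below are invented for the example, not Andy's actual code):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers and types, invented so the sketch stands alone. */
struct pfn_list {
	unsigned long pfn[1 << 10];
	size_t count;
};

static bool page_is_reclaimable(unsigned long pfn)
{
	(void)pfn;
	return true;	/* a real check would consult the LRU and page state */
}

static void pfn_list_add(struct pfn_list *list, unsigned long pfn)
{
	list->pfn[list->count++] = pfn;
}

/*
 * The gist: 'pfn' has already been chosen for reclaim and the caller
 * wants an order-'order' block, so also pull in the other pages of the
 * same naturally aligned order-'order' area.  Freeing them together
 * yields one contiguous block instead of scattered order-0 pages.
 */
static void isolate_lumpy(struct pfn_list *list, unsigned long pfn, int order)
{
	unsigned long base = pfn & ~((1UL << order) - 1);
	unsigned long cur;

	for (cur = base; cur < base + (1UL << order); cur++) {
		if (cur == pfn)
			continue;	/* already isolated */
		if (page_is_reclaimable(cur))
			pfn_list_add(list, cur);
	}
}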

> I think we were rather hoping that Mel's anti-fragmentation work would
> improve things.

I'm still optimistic it will. This is the widest it has been tested so far,
so there were always going to be cases like this that hadn't been covered yet.

In this case, the allocation is high-order but GFP_ATOMIC, so reclaim is not an
option. These allocations get grouped together, but with a low min_free_kbytes,
as is the case on this machine, the high-order atomic blocks eventually get
used by other allocation types. I had assumed that if high-order atomic
allocations were important (e.g. e1000), the machine would be configured with
a min_free_kbytes of at least 16384, i.e. 1 MAX_ORDER_NR_PAGES per MIGRATE_TYPE.
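
To spell the arithmetic out (assuming 4K pages and MAX_ORDER of 11; the
program below is only a worked example of the numbers):

#include <stdio.h>

int main(void)
{
	/* Assumptions for this sketch: 4K pages, MAX_ORDER of 11 and four
	 * migrate types - which is what the 16384 figure corresponds to. */
	const unsigned long page_size_kb = 4;
	const unsigned long max_order = 11;
	const unsigned long migrate_types = 4;

	/* MAX_ORDER_NR_PAGES = 2^(MAX_ORDER - 1) = 1024 pages = 4096K */
	unsigned long max_order_nr_pages = 1UL << (max_order - 1);
	unsigned long block_kb = max_order_nr_pages * page_size_kb;

	/* One MAX_ORDER block held free per migrate type */
	printf("min_free_kbytes >= %lu\n", block_kb * migrate_types);	/* 16384 */
	return 0;
}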

Mariusz, I would be interested in finding out whether this problem still occurs
when you set min_free_kbytes to 16384 via /proc/sys/vm/min_free_kbytes. I
understand that the problem is not easily reproduced and that requiring
configuration changes is far from ideal, but it would let me find out in
advance whether options 2 or 3 below make sense.

I'm looking at three ways of addressing this:

1. Anti-fragmentation currently favours breaking up larger blocks in order
to keep pages grouped by mobility, as opposed to the vanilla allocator,
which always uses the smallest possible block no matter where it is. I'm
looking at preserving the vanilla allocator's behaviour of keeping larger
blocks free as much as possible, to see how adverse an effect that
subsequently has on fragmentation.

2. Set min_free_kbytes higher automatically at boot time when
CONFIG_PAGE_GROUP_BY_MOBILITY is set.

3. Disable PAGE_GROUP_BY_MOBILITY when min_free_kbytes is set too low
and print KERN_INFO messages when it's disabled or enabled based on
min_free_kbytes (a rough sketch of 2 and 3 follows below).

I'm testing patches for 1 at the moment even though I think it will
lower success rates for superpage allocations because of increased
fragmentation. Benchmarking will tell me for sure.
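
For what it's worth, here is a rough userspace sketch of what 2 and 3 could
look like. The names, default value and threshold are stand-ins for
illustration, not an actual patch:

#include <stdio.h>

/* Hypothetical stand-ins so the sketch is self-contained; in the kernel
 * these would be the real min_free_kbytes sysctl, a grouping-by-mobility
 * flag, and printk(KERN_INFO ...) rather than printf(). */
static int min_free_kbytes = 3816;	/* example default on a small machine */
static int page_group_by_mobility_disabled;

#define MOBILITY_MIN_FREE_KBYTES 16384	/* 1 MAX_ORDER block per migrate type */

/* Option 2: at boot, raise min_free_kbytes if grouping by mobility
 * is configured and the calculated default is too low. */
static void raise_min_free_for_mobility(void)
{
	if (min_free_kbytes < MOBILITY_MIN_FREE_KBYTES)
		min_free_kbytes = MOBILITY_MIN_FREE_KBYTES;
}

/* Option 3: if min_free_kbytes is (or is later set) too low, fall back
 * to vanilla behaviour and log the change either way. */
static void update_mobility_grouping(void)
{
	int disable = min_free_kbytes < MOBILITY_MIN_FREE_KBYTES;

	if (disable != page_group_by_mobility_disabled)
		printf("%s grouping pages by mobility (min_free_kbytes=%d)\n",
		       disable ? "Disabling" : "Enabling", min_free_kbytes);
	page_group_by_mobility_disabled = disable;
}

int main(void)
{
	update_mobility_grouping();	/* would disable: 3816 < 16384 */
	raise_min_free_for_mobility();	/* option 2 applied at boot */
	update_mobility_grouping();	/* re-enabled, with the message a real
					 * patch would print at KERN_INFO */
	return 0;
}

In a real patch the check would hook into the min_free_kbytes sysctl handler
rather than being called by hand like this.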

Thanks Mariusz for the report.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab