Re: [PATCH] Mark the correct zone as full when scanning zonelists

From: Mel Gorman
Date: Fri Sep 12 2008 - 14:58:30 EST


On (11/09/08 14:41), Andrew Morton didst pronounce:
> On Thu, 11 Sep 2008 22:25:51 +0100
> Mel Gorman <mel@xxxxxxxxx> wrote:
>
> > The for_each_zone_zonelist() uses a struct zoneref *z cursor when scanning
> > zonelists to keep track of where in the zonelist it is. The zoneref that
> > is returned corresponds to the the next zone that is to be scanned, not
> > the current one as it originally thought of as an opaque list.
> >
> > When the page allocator is scanning a zonelist, it marks zones that it
> > temporarily full zones to eliminate near-future scanning attempts.
>
> That sentence needs help.
>

I've posted a revised leader below.

> > It uses
> > the zoneref for the marking and consequently the incorrect zone gets marked
> > full. This leads to a suitable zone being skipped in the mistaken belief
> > it is full. This patch corrects the problem by changing zoneref to be the
> > current zone being scanned instead of the next one.
>
> Applicable to 2.6.26 as well, yes?
>

Yes. I was going to get it right for mainline first before posting to
stable.

>
> Someone reported a bug a few weeks ago which I think this patch will fix,
> yes? I don't remember who that was, nor do I recall the precise details
> of what the userspace-visible (mis)behaviour was.
>
> Are you able to fill in the gaps here? Put yourself in the position of
> a poor little -stable maintainer scratching his head wondering ytf he
> was sent this patch.
>

I'm not aware of this bug but I'll go digging for it and see what I
find. Thanks

=== Begin revised changelog ===

The iterator for_each_zone_zonelist() uses a struct zoneref *z cursor when
scanning zonelists to keep track of where in the zonelist it is. The zoneref
that is returned corresponds to the next zone that is to be scanned,
not the current one. It was intended to be treated as an opaque list.

When the page allocator is scanning a zonelist, it marks elements in the
zonelist corresponding to zones that are temporarily full. As the zonelist
is being updated, it uses the cursor here:

	if (NUMA_BUILD)
		zlc_mark_zone_full(zonelist, z);

This is intended to prevent rescanning in the near future, but the zoneref
cursor does not correspond to the zone that has been found to be full. This is
an easy misunderstanding to make, so this patch corrects the problem by changing
the zoneref cursor to be the current zone being scanned instead of the next one.

This issue affects 2.6.26.
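
To illustrate the mismatch, here is a simplified userspace model. It is
only a sketch, not the kernel code: struct zoneref_model, its fields and
both scan functions are hypothetical names invented for this example. The
first function advances the cursor before the loop body runs, so marking
through the cursor hits the neighbouring entry. The second keeps the cursor
on the entry being examined, which is the behaviour the patch aims for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zonelist entry; not the kernel's struct zoneref. */
struct zoneref_model {
	const char *name;	/* NULL terminates the list */
	bool has_free_pages;	/* can this zone satisfy the allocation? */
	bool marked_full;	/* models the "temporarily full" marking */
};

/* Buggy model: the cursor z already refers to the NEXT entry in the body. */
const char *scan_cursor_is_next(struct zoneref_model *list)
{
	struct zoneref_model *z = list;

	assert(list != NULL);
	while (z->name != NULL) {
		struct zoneref_model *zone = z;	/* zone examined now */

		z++;				/* cursor moves on early */
		if (zone->has_free_pages)
			return zone->name;
		z->marked_full = true;		/* wrong entry gets marked */
	}
	return NULL;
}

/* Fixed model: the cursor z refers to the entry currently being examined. */
const char *scan_cursor_is_current(struct zoneref_model *list)
{
	struct zoneref_model *z;

	assert(list != NULL);
	for (z = list; z->name != NULL; z++) {
		if (z->has_free_pages)
			return z->name;
		z->marked_full = true;		/* the full zone gets marked */
	}
	return NULL;
}
```

With a two-entry list where the first zone is full and the second has free
pages, the first function leaves the full zone unmarked and marks the
usable one instead, so a later scan that honours the marking would skip the
suitable zone.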

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab