Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()

From: Dave Hansen
Date: Mon Oct 27 2008 - 13:26:18 EST


On Mon, 2008-10-27 at 17:49 +0100, Gerald Schaefer wrote:
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
> points to the first pfn after end_pfn, which may be in a different zone
> or not present at all. This may lead to an addressing exception in
> page_zone() or spin_lock_irqsave().

I'm not sure I follow. Let's look at the code, pre-patch:

> 	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
> 		page = __first_valid_page(pfn, pageblock_nr_pages);
> 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> 			break;
> 	}
> 	if (pfn < end_pfn)
> 		return -EBUSY;

We have two ways out of the loop:
1. 'page' is valid, and not isolated, so we did a 'break'
2. No page hit (1) in the range and we broke out of the loop because
of the for() condition: (pfn < end_pfn).

So, when the condition happens that you mentioned in your changelog
above: "pfn then points to the first pfn after end_pfn", we jump out at
the 'return -EBUSY;'. We don't ever do pfn_to_page() in that case since
we've returned before.

Either 'page' is valid *OR* you return -EBUSY. I don't think you need
to check both.

> Using the last valid page that was found inside the for() loop, instead
> of pfn_to_page(), should fix this.
> @@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
> 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> 			break;
> 	}
> -	if (pfn < end_pfn)
> +	if ((pfn < end_pfn) || !page)
> 		return -EBUSY;
> 	/* Check all pages are free or Marked as ISOLATED */
> -	zone = page_zone(pfn_to_page(pfn));
> +	zone = page_zone(page);

I think this patch fixes the bug, but for reasons other than what you
said. :)

The trouble here is that the 'pfn' could have been in the middle of a
hole somewhere, which __first_valid_page() worked around. Since you
saved off the result of __first_valid_page(), it ends up being OK with
your patch.

Instead of using pfn_to_page() you could also have just called
__first_valid_page() again. But, that would have duplicated a bit of
work, even though not much in practice because the caches are still hot.

Technically, you wouldn't even need to check the return from
__first_valid_page() since you know it has a valid result because you
made the exact same call a moment before.
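
To make the hole case concrete, here is a rough userspace sketch. It is not
kernel code: every toy_* name, the 16-pfn map and the hole at pfns 4 and 5 are
made up for illustration, and the real pfn_to_page() does not return NULL for
a hole the way this toy version does. It only shows why a pointer saved from
the __first_valid_page()-style lookup is safe to hand to page_zone(), while
the raw pfn may not be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PFNS     16UL
#define BLOCK_PAGES 4UL          /* stand-in for pageblock_nr_pages */

struct page { int zone_id; };    /* toy page: only remembers its zone */

static bool valid[NR_PFNS];      /* false means the pfn sits in a hole */
static struct page pages[NR_PFNS];

/* Toy __first_valid_page(): first valid page in [pfn, pfn + nr), else NULL. */
static struct page *toy_first_valid_page(unsigned long pfn, unsigned long nr)
{
	for (unsigned long i = 0; i < nr && pfn + i < NR_PFNS; i++)
		if (valid[pfn + i])
			return &pages[pfn + i];
	return NULL;
}

/* Toy pfn_to_page(): NULL marks a hole; the real one does no such check. */
static struct page *toy_pfn_to_page(unsigned long pfn)
{
	return (pfn < NR_PFNS && valid[pfn]) ? &pages[pfn] : NULL;
}

/* Toy page_zone(): the assert stands in for the addressing exception. */
static int toy_page_zone(const struct page *page)
{
	assert(page != NULL);
	return page->zone_id;
}

/* Build a map where pfns 4 and 5 are a hole and everything else is zone 1. */
static void toy_setup(void)
{
	for (unsigned long pfn = 0; pfn < NR_PFNS; pfn++) {
		valid[pfn] = !(pfn == 4 || pfn == 5);
		pages[pfn].zone_id = 1;
	}
}
```

With this map, toy_pfn_to_page(4) lands in the hole, while
toy_first_valid_page(4, BLOCK_PAGES) skips ahead to pfn 6, so passing the
saved pointer to toy_page_zone() works and passing the raw lookup would trip
the assert.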

Anyway, can you remove the !page check, fix up the changelog and resend?

-- Dave
