Re: [PATCH 1/2] Avoid putting a bad page back on the LRU

From: Russ Anderson
Date: Wed Apr 08 2009 - 09:31:32 EST


On Wed, Apr 08, 2009 at 05:43:15AM +0200, Ingo Oeser wrote:
> Hi Russ,
>
> On Wednesday 08 April 2009, Russ Anderson wrote:
> > --- linux-next.orig/mm/migrate.c 2009-04-07 18:32:12.781949840 -0500
> > +++ linux-next/mm/migrate.c 2009-04-07 18:34:19.169736260 -0500
> > @@ -693,6 +696,26 @@ unlock:
> > * restored.
> > */
> > list_del(&page->lru);
> > +#ifdef CONFIG_MEMORY_FAILURE
> > + if (PagePoison(page)) {
> > + if (rc == 0)
> > + /*
> > + * A page with a memory error that has
> > + * been migrated will not be moved to
> > + * the LRU.
> > + */
> > + goto move_newpage;
> > + else
> > + /*
> > + * The page failed to migrate and will not
> > + * be added to the bad page list. Clearing
> > + * the error bit will allow another attempt
> > + * to migrate if it gets another correctable
> > + * error.
> > + */
> > + ClearPagePoison(page);
>
> Clearing the flag doesn't change the fact, that this page is representing
> permanently bad RAM.

Yes, but this is intended for corrected memory errors (meaning there is
an underlying RAM error, but it has not yet reached the point of losing data).

After talking with Andi, it is clear the intent of the Poison flag
(uncorrectable memory error) is different from my intent (corrected
memory error). I'll go back to using a different page flag to avoid
confusing the two issues.

> What about removing it from the LRU and adding it to a bad RAM list in every case?

That is what happens when the page migrates (the normal case). The else case
is when the page could not be migrated. My intent was to wait for the next
corrected error on that page and try migrating again.
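
To make the two cases concrete, here is a rough userspace sketch of that
flow. It is only a model under stated assumptions, not the patch itself:
struct model_page, its fields, and handle_corrected_error() are hypothetical
names, and the "back on the LRU" step in the failure path is my reading of
what happens when the code falls through after clearing the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
	bool corrected_error;	/* stand-in for the separate page flag */
	bool on_bad_list;	/* page retired to the bad page list */
	bool on_lru;		/* page is back in normal use */
};

/*
 * Handle one corrected error on a page. migrate_ok models whether the
 * migration attempt succeeded.
 */
static void handle_corrected_error(struct model_page *p, bool migrate_ok)
{
	assert(p != NULL);

	p->corrected_error = true;
	p->on_lru = false;		/* isolated while migration is tried */

	if (migrate_ok) {
		/* Normal case: contents moved, old page stays off the LRU. */
		p->on_bad_list = true;
	} else {
		/*
		 * Migration failed: clear the flag so the next corrected
		 * error on this page triggers another attempt (assumed:
		 * the page then returns to the LRU).
		 */
		p->corrected_error = false;
		p->on_lru = true;
	}
}
```

Calling it once with migrate_ok false and then again with migrate_ok true
models a page that fails to migrate on the first corrected error and is
retired on the second.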

> After hot swapping the physical RAM banks it could be moved back, not before.

As soon as the code is written. :-)

--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@xxxxxxx