Re: How to handle a hugepage with bad physical memory?

From: Robin Holt
Date: Thu Nov 17 2005 - 09:43:21 EST


On Wed, Nov 16, 2005 at 11:58:13AM -0800, Christoph Lameter wrote:
> On Wed, 16 Nov 2005, Robin Holt wrote:
>
> > Russ Anderson recently introduced a patch into ia64 that changes MCA
> > behavior. When the MCA is caused by a user reference to a user's memory,
> > we put an extra reference on the page and kill the user. This leaves
> > the rest of memory available for other jobs, at the cost of leaking
> > the bad page.
> >
> > I don't know if Russ has done any testing with hugetlbfs pages. I preface
> > the remainder of my comments with a huge "I don't know anything"
> > disclaimer.
> >
> > With the new hugepages concept, would it be possible to mark only the
> > default-pagesize portion of a hugepage as bad and then return the
> > remainder of the hugepage for normal use? What would we basically need
> > to do to accomplish this? Are there patches in the community whose
> > progress we should watch before we do any work on this front?
>
> On IA64 we have a single PTE for a huge page, in a different region, so
> we cannot unmap a page-sized section of it. Other architectures may have
> PTEs for each page-sized section of a huge page. For those it may make
> sense (but then the management of the page is done via the first struct
> page, which likely results in some challenging VM issues).

Christoph,

I think you misunderstood me. I was talking about killing the process,
at which point all of its mappings get destroyed. I then want to reclaim
as much of that huge page as possible.
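
For a normal-sized page, the pin-and-kill approach amounts to roughly
the following. This is only a minimal sketch, not Russ's actual patch:
the MCA handler plumbing and error handling are omitted, and
isolate_bad_page() is a name I made up. It assumes we are running in
the context of the faulting task.

	/*
	 * Minimal sketch of the pin-and-kill approach.  The extra
	 * reference guarantees the page's refcount never drops to
	 * zero, so the allocator never hands the bad frame out again.
	 */
	static void isolate_bad_page(struct page *page)
	{
		get_page(page);			/* deliberately leaked reference */
		force_sig(SIGBUS, current);	/* kill the task that hit the MCA */
	}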

Once everything is cleaned up, I would like to break the huge page back
into normal-size pages and free all of them except the one covering the
bad physical memory.
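
Roughly what I have in mind is below, with the caveat that it is only a
sketch: tear_down_compound_page() is a made-up placeholder for the
compound-page metadata teardown a real implementation would need, which
is exactly the machinery I am asking about.

	/*
	 * Hypothetical sketch: after the process is dead and the huge
	 * page has no mappings left, free every base page except the
	 * one covering the bad physical address.
	 */
	static void reclaim_good_portion(struct page *hpage, unsigned long bad_pfn)
	{
		unsigned long pfn = page_to_pfn(hpage);
		int i;

		tear_down_compound_page(hpage);		/* hypothetical */

		for (i = 0; i < HPAGE_SIZE / PAGE_SIZE; i++, pfn++) {
			struct page *page = pfn_to_page(pfn);

			if (pfn == bad_pfn)
				continue;	/* leak only the bad base page */

			set_page_count(page, 1);	/* make it freeable again */
			__free_page(page);
		}
	}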

Thanks,
Robin