Re: [PATCH] [31/31] HWPOISON: Add a madvise() injector for soft page offlining

From: Andi Kleen
Date: Sat Jun 19 2010 - 09:30:11 EST


On Sat, Jun 19, 2010 at 03:25:16PM +0200, Michael Kerrisk wrote:
> Hi Andi,
>
> Thanks for this. Some comments below.
>
> On Sat, Jun 19, 2010 at 3:20 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> > On Sat, Jun 19, 2010 at 02:36:28PM +0200, Michael Kerrisk wrote:
> >> Hi Andi,
> >>
> >> On Tue, Dec 8, 2009 at 11:16 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> >> >
> >> > Process based injection is much easier to handle for test programs,
> >> > who can first bring a page into a specific state and then test.
> >> > So add a new MADV_SOFT_OFFLINE to soft offline a page, similar
> >> > to the existing hard offline injector.
> >>
> >> I see that this made its way into 2.6.33. Could you write a short
> >> piece on it for the madvise.2 man page?
> >
> > Also fixed the previous snippet slightly.
>
> (thanks)
>
> > commit edb43354f0ffc04bf4f23f01261f9ea9f43e0d3d
> > Author: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> > Date:   Sat Jun 19 15:19:28 2010 +0200
> >
> >    MADV_SOFT_OFFLINE
> >
> >    Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> >
> > diff --git a/man2/madvise.2 b/man2/madvise.2
> > index db29feb..9dccd97 100644
> > --- a/man2/madvise.2
> > +++ b/man2/madvise.2
> > @@ -154,7 +154,15 @@ processes.
> >  This operation may result in the calling process receiving a
> >  .B SIGBUS
> >  and the page being unmapped.
> > -This feature is intended for memory testing.
> > +This feature is intended for testing of memory error handling code.
> > +This feature is only available if the kernel was configured with
> > +.BR CONFIG_MEMORY_FAILURE .
> > +.TP
> > +.BR MADV_SOFT_OFFLINE " (Since Linux 2.6.33)
> > +Soft offline a page. This will result in the memory of the page
> > +being copied to a new page and original page be offlined. The operation
>
> Can you explain the term "offlined" please.

The memory is no longer used and is taken out of normal
memory management (until it is unpoisoned),
and the "HardwareCorrupted:" counter in /proc/meminfo increases.

(Don't put the latter in; I'm thinking about changing that.)

>
> > +should be transparent to the calling process.
>
> Does "should be transparent" mean "is normally invisible"?

Yes. It's similar to being swapped out and swapped in again.
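
For illustration, here is a minimal sketch of how a test program might
exercise this: bring a page into a known state, soft offline it, then
check that the contents survived. The function name
soft_offline_roundtrip() is hypothetical, and the fallback value 101 for
MADV_SOFT_OFFLINE is an assumption taken from the generic mman header
(some architectures may differ). The call normally needs CAP_SYS_ADMIN
and a kernel built with CONFIG_MEMORY_FAILURE, so the sketch treats a
refusal as "not tested" rather than as a failure. Note that a successful
call really does take a physical page out of use until it is unpoisoned.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: older libc headers may lack the constant; 101 is the
 * value in the generic kernel mman header. */
#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101
#endif

/*
 * Hypothetical helper.
 * Returns  1 if the page was soft offlined and its contents are intact,
 *          0 if the kernel refused the request (contents still intact),
 *         -1 on a setup error or if the contents changed.
 */
int soft_offline_roundtrip(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = (size_t)page_size;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Bring the page into a specific state (this also faults it in). */
    memset(p, 0xA5, len);

    int result = 1;
    if (madvise(p, len, MADV_SOFT_OFFLINE) != 0) {
        /* Typically EPERM (no CAP_SYS_ADMIN) or EINVAL (not configured). */
        fprintf(stderr, "madvise(MADV_SOFT_OFFLINE): %s\n", strerror(errno));
        result = 0;
    }

    /* The operation should be transparent: the data must be unchanged. */
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0xA5) {
            result = -1;
            break;
        }
    }

    munmap(p, len);
    return result;
}
```

A caller would treat a return of 1 as "offlined and transparent", 0 as
"kernel declined, nothing tested", and -1 as a real failure.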

-Andi