Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

From: HORIGUCHI NAOYA(堀口 直也)
Date: Thu Feb 25 2021 - 06:32:30 EST


On Thu, Feb 25, 2021 at 11:43:29AM +0800, Aili Yao wrote:
> On Wed, 24 Feb 2021 11:31:55 +0100 Oscar Salvador <osalvador@xxxxxxx> wrote:
...
>
> > >
> > > 3.The kill_me_maybe will check the return:
> > >
> > > 1244 static void kill_me_maybe(struct callback_head *cb)
> > > 1245 {
> > >
> > > 1254 if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
> > > 1255 !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
> > > 1256 set_mce_nospec(p->mce_addr >> PAGE_SHIFT,
> >
> > So, IIUC, in case of a LMCE nested call, the second MCE will reach here.
> > set_mce_nospec() will either mark the underlying page as not mapped/cached.
> >
> This set_mce_nospec() is not proper when the recovery job is on the fly. In my test
> this function failed.

Hi Aili,

I agree that this set_mce_nospec() is not expected to be called for
"already hwpoisoned" page because in the reported case the error
page is already contained and no need to resort changing cache mode.

...

> > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > index e9481632fcd1..06f006174b8c 100644
> > > --- a/mm/memory-failure.c
> > > +++ b/mm/memory-failure.c
> > > @@ -1224,7 +1224,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
> > > if (TestSetPageHWPoison(head)) {
> > > pr_err("Memory failure: %#lx: already hardware poisoned\n",
> > > pfn);
> > > - return 0;
> > > + return -EBUSY;
> >
> > As David said, madvise_inject_error() will start returning -EBUSY now in case
> > we madvise(MADV_HWPOISON) on an already hwpoisoned page.
> >
> > AFAICS, memory_failure() can return 0, -Eerrors, and MF_XXX.
> > Would it make sense to unify that? That way we could declare error codes that
> > make somse sense (like MF_ALREADY_HWPOISONED).

It seems to me that memory_failure() does not return MF_XXX. But yes,
returning some positive value for the reported case could be a solution.

> >
>
> @David:
>
> I checked the code again, and find a few places will care the exact return value, like:
>
> 1: drivers/base/memory.c:483: ret = memory_failure(pfn, 0);
> This is for hard page offline, I see the code in mcelog:
> static void offline_action(struct mempage *mp, u64 addr)
> {
> if (offline <= OFFLINE_ACCOUNT)
> return;
> Lprintf("Offlining page %llx\n", addr);
> if (memory_offline(addr) < 0) {
> Lprintf("Offlining page %llx failed: %s\n", addr, strerror(errno));
> mp->offlined = PAGE_OFFLINE_FAILED;
> } else
> mp->offlined = PAGE_OFFLINE;
> }
> I think return an negative value will be more proper? As the related killing function may not be performed, and we can't say
> it's a success operation?
>
> 2:mm/hwpoison-inject.c:51: return memory_failure(pfn, 0);
> mm/madvise.c:910: ret = memory_failure(pfn, MF_COUNT_INCREASED);
>
> These two cases are mainly for error injections, I checked the test codes, mostly it only care if the value is 0 or < 0;
> I do the related test, normally it work well, but for stress test, sometimes in some case, I do meet some fail cases along with the -EBUSY return.
> I will dig more.
>
> Other place will only care if the return value is 0. or just ignore it.
>
> Hi naoya, what's your opnion for this possible issue, I need your inputs!

We could use some negative value (error code) to report the reported case,
then as you mentioned above, some callers need change to handle the
new case, and the same is true if you use some positive value.
My preference is -EHWPOISON, but other options are fine if justified well.

Thanks,
Naoya Horiguchi