Re: [PATCH v1] mm/hwpoison: set PageHWPoison after taking page lock in memory_failure_hugetlb()

From: HORIGUCHI NAOYA(堀口 直也)
Date: Wed Mar 09 2022 - 20:15:22 EST


On Wed, Mar 09, 2022 at 01:30:10PM -0800, Andrew Morton wrote:
> On Wed, 9 Mar 2022 18:14:49 +0900 Naoya Horiguchi <naoya.horiguchi@xxxxxxxxx> wrote:
>
> > There is a race condition between memory_failure_hugetlb() and hugetlb
> > free/demotion, which causes setting PageHWPoison flag on the wrong page
> > (which was a hugetlb when memory_failure() was called, but was removed
> > or demoted when memory_failure_hugetlb() is called). This results in
> > killing wrong processes. So set PageHWPoison flag with holding page lock,
>
> What are the runtime effects of this? Do we think a -stable backport
> is needed?

The user-visible effect may be hard to notice, because even when
memory_failure() works as expected, some process can be killed seemingly
at random. With this race, the actual error is left unhandled, so nothing
prevents later access to the affected page, which might lead to more
serious results such as consuming corrupted data.
So I think this is worth a -stable backport.
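
As a rough illustration only (a userspace toy model, not the kernel code
and not the patch itself), the ordering problem can be sketched as below.
All names here (struct fake_page, fake_demote(), poison_unlocked(),
poison_locked()) are hypothetical, and the model assumes that the
free/demotion path is serialized by the same lock, which may not match
the real kernel paths.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct fake_page {
	bool is_hugetlb;	/* still part of a huge page? */
	bool hwpoison;		/* models the PageHWPoison flag */
	bool locked;		/* models a lock shared with free/demotion */
};

/* Models a concurrent free/demotion; assumed to respect the lock. */
static void fake_demote(struct fake_page *p)
{
	if (p->locked)
		return;		/* a real path would wait for the lock */
	p->is_hugetlb = false;
}

/*
 * Racy order: check, then set the flag with no lock held.
 * Returns true only if the flag ended up on a page that is still hugetlb.
 */
static bool poison_unlocked(struct fake_page *p,
			    void (*interleave)(struct fake_page *))
{
	if (!p->is_hugetlb)
		return false;
	if (interleave)
		interleave(p);	/* race window between check and set */
	p->hwpoison = true;
	return p->is_hugetlb;
}

/* Locked order: take the lock, check, then set the flag under it. */
static bool poison_locked(struct fake_page *p,
			  void (*interleave)(struct fake_page *))
{
	bool ok = false;

	p->locked = true;
	if (p->is_hugetlb) {
		if (interleave)
			interleave(p);	/* demotion attempt is held off */
		p->hwpoison = true;
		ok = p->is_hugetlb;
	}
	p->locked = false;
	return ok;
}
```

In this model, calling poison_unlocked() with fake_demote as the
interleaving step leaves the flag on a page that is no longer hugetlb,
while poison_locked() with the same step keeps the check and the flag
update consistent.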

Unfortunately, this patch still needs an update. Could you drop it from
mmotm for now?

>
> Are we missing a reported-by here? I'm too lazy to hunt down who it was.

I noticed this from Mike's comment, so I'll add his Reported-by.

Thanks,
Naoya Horiguchi