Re: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find error virtual address

From: Luck, Tony
Date: Tue Apr 20 2021 - 11:42:57 EST


On Mon, Apr 19, 2021 at 06:49:15PM -0700, Jue Wang wrote:
> On Tue, 13 Apr 2021 07:43:20 +0900, Naoya Horiguchi wrote:
> ...
> > + * This function is intended to handle "Action Required" MCEs on already
> > + * hardware poisoned pages. They could happen, for example, when
> > + * memory_failure() failed to unmap the error page at the first call, or
> > + * when multiple Action Optional MCE events races on different CPUs with
> > + * Local MCE enabled.
>
> +Tony Luck
>
> Hey Tony, I thought SRAO MCEs are broadcasted to all cores in the system
> as they come without an execution context, is it correct?
>
> If Yes, Naoya, I think we might want to remove the comments about the
> "multiple Action Optional MCE racing" part.

Jue,

Correct. SRAO machine checks are broadcast. But rather than remove the
second part, just replace with "multiple local machine checks on different
CPUs".

-Tony