Re: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find error virtual address

From: HORIGUCHI NAOYA(堀口 直也)
Date: Tue Apr 20 2021 - 21:04:22 EST


On Tue, Apr 20, 2021 at 08:42:53AM -0700, Luck, Tony wrote:
> On Mon, Apr 19, 2021 at 06:49:15PM -0700, Jue Wang wrote:
> > On Tue, 13 Apr 2021 07:43:20 +0900, Naoya Horiguchi wrote:
> > ...
> > > + * This function is intended to handle "Action Required" MCEs on already
> > > + * hardware poisoned pages. They could happen, for example, when
> > > + * memory_failure() failed to unmap the error page at the first call, or
> > > + * when multiple Action Optional MCE events races on different CPUs with
> > > + * Local MCE enabled.
> >
> > +Tony Luck
> >
> > Hey Tony, I thought SRAO MCEs are broadcasted to all cores in the system
> > as they come without an execution context, is it correct?
> >
> > If Yes, Naoya, I think we might want to remove the comments about the
> > "multiple Action Optional MCE racing" part.
>
> Jue,
>
> Correct. SRAO machine checks are broadcast. But rather than remove the
> second part, just replace with "multiple local machine checks on different
> CPUs".

That wording is more precise, so I have replaced the second part with it in v3.

Thanks,
Naoya Horiguchi