Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

From: Andy Lutomirski
Date: Fri Feb 26 2021 - 22:41:35 EST


On Wed, Feb 24, 2021 at 8:47 PM Aili Yao <yaoaili@xxxxxxxxxxxx> wrote:
>
> On Tue, 23 Feb 2021 08:42:59 -0800
> "Luck, Tony" <tony.luck@xxxxxxxxx> wrote:
>
> > On Tue, Feb 23, 2021 at 07:33:46AM -0800, Andy Lutomirski wrote:
> > >
> > > > On Feb 23, 2021, at 4:44 AM, Aili Yao <yaoaili@xxxxxxxxxxxx> wrote:
> > > >
> > > > On Fri, 5 Feb 2021 17:01:35 +0800
> > > > Aili Yao <yaoaili@xxxxxxxxxxxx> wrote:
> > > >
> > > >> When one page is already hwpoisoned by MCE AO action, processes may not
> > > >> be killed, processes mapping this page may make a syscall include this
> > > >> page and result to trigger a VM_FAULT_HWPOISON fault, as it's in kernel
> > > >> mode it may be fixed by fixup_exception, current code will just return
> > > >> error code to user code.
> > > >>
> > > >> This is not sufficient, we should send a SIGBUS to the process and log
> > > >> the info to console, as we can't trust the process will handle the error
> > > >> correctly.
> > > >>
> > > >> Suggested-by: Feng Yang <yangfeng1@xxxxxxxxxxxx>
> > > >> Signed-off-by: Aili Yao <yaoaili@xxxxxxxxxxxx>
> > > >> ---
> > > >> arch/x86/mm/fault.c | 62 +++++++++++++++++++++++++++++----------------
> > > >> 1 file changed, 40 insertions(+), 22 deletions(-)
> > > >>
> > > > Hi luto;
> > > > Is there any feedback?
> > >
> > > At the very least, this needs a clear explanation of why your proposed behavior is better than the existing behavior.
> >
> > The explanation is buried in that "can't trust the process" line.
> >
> > E.g. user space isn't good about checking for failed write(2) syscalls.
> > So if the poison was in a user buffer passed to write(fd, buffer, count)
> > sending a SIGBUS would be the action if they read the poison directly,
> > so it seems reasonable to send the same signal if the kernel read their
> > poison for them.
> >
> > It would avoid users that didn't check the return value merrily proceeding
> > as if everything was ok.
>
> Hi luto:
> I will add more infomation:
> Even if the process will check return value of syscall like write, I don't think
> process will take proper action for this.
> In test example, the return value will be errno is 14 (Bad Address), the process may not realize
> this is a hw issue, and may take wrong action not as expected.
> And totally, A hw error will rarely happen, and the hw error hitting this branch will be
> more unlikely, the impaction without this patch is quite minor, but this is still not good enough, we should
> make it better, right?

There are a few issues I can imagine:

Some programs may use read(2), write(2), etc as ways to check if
memory is valid without getting a signal. They might not want
signals, which means that this feature might need to be configurable.

It's worth making sure that this doesn't end up sending duplicate
signals. If nothing else, this would impact the vsyscall emulation
code.

Programs that get a signal might expect that the RIP that the signal
frame points to is the instruction that caused the signal and that the
instruction faulted without side effects. For SIGSEGV, I would be
especially nervous about this. Maybe SIGBUS is safer. For SIGSEGV,
it's entirely valid to look at CR2 / si_fault_addr, fix it up, and
return. This would be completely *invalid* with your patch. I'm not
sure what to do about this.

--Andy