Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

From: Aili Yao
Date: Wed Mar 03 2021 - 12:11:42 EST


On Wed, 3 Mar 2021 20:24:02 +0800
Aili Yao <yaoaili@xxxxxxxxxxxx> wrote:

> On Mon, 1 Mar 2021 11:09:36 -0800
> Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
> > > On Mar 1, 2021, at 11:02 AM, Luck, Tony <tony.luck@xxxxxxxxx> wrote:
> > >
> > > 
> > >>
> > >> Some programs may use read(2), write(2), etc as ways to check if
> > >> memory is valid without getting a signal. They might not want
> > >> signals, which means that this feature might need to be configurable.
> > >
> > > That sounds like an appalling hack. If users need such a mechanism
> > > we should create some better way to do that.
> > >
> >
> > Appalling hack or not, it works. So, if we’re going to send a signal to user code that looks like it originated from a bona fide architectural recoverable fault, it needs to be recoverable. A load from a failed NVDIMM page is such a fault. A *kernel* load is not. So we need to distinguish it somehow.
>
> Sorry for my previous misunderstanding. I have some questions:
> If programs use read(2) or write(2) to check whether memory is valid, do they really want to cover the poison case?
> When an error is returned in such a case, can the program tell that it is a hwpoison issue rather than some other software error, and handle it correctly?
>
> If this is the proper action, the existing poison flow for read and write in the current code needs to change too.
>

Sorry, another question:
When programs use read(2) or write(2) to check whether memory is valid, is the intent really to check whether the user page the program provided is valid, rather than whether the destination or disk space is valid?
The patch does not affect that purpose, as it only applies to the user page that the program passes to write(2) or as a similar syscall parameter.
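
For reference, here is a minimal sketch of the kind of probe being discussed, as I understand it. The helper name byte_is_readable() is made up for illustration and is not taken from any real program. It asks the kernel to copy one byte from the address into a pipe, so an ordinary bad address comes back as -1 with errno set to EFAULT instead of a signal. It only shows the plain invalid-address case; what it would see on a hwpoison page is exactly the open question here.

```c
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns true if the kernel could read one byte
 * at addr on our behalf. The user buffer is the *source* of write(2),
 * so this checks the user page, not the destination.
 */
static bool byte_is_readable(const void *addr)
{
	int fds[2];
	bool ok;

	if (pipe(fds) != 0)
		return false;	/* cannot probe; report "not readable" */

	/*
	 * A fresh pipe has room for one byte, so this does not block.
	 * For an unmapped address the copy from user space fails and
	 * write(2) returns -1 with errno == EFAULT; no signal is sent.
	 */
	ok = (write(fds[1], addr, 1) == 1);

	close(fds[0]);
	close(fds[1]);
	return ok;
}
```

A caller would use byte_is_readable(p) before dereferencing p. The same idea with read(2) into the address would test whether the page is writable.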

--
Thanks!
Aili Yao