Re: [PATCH v4 01/10] pagemap: Introduce ->memory_failure()

From: Dan Williams
Date: Tue Jun 15 2021 - 20:18:36 EST


On Thu, Jun 3, 2021 at 6:19 PM Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx> wrote:

Hi Ruan, apologies for the delay in circling back to this.

>
> When a memory failure occurs, we call this function, which is implemented
> by each kind of device. For the fsdax case, the pmem device driver
> implements it. The pmem device driver finds the filesystem in which
> the corrupted page is located and then calls the filesystem handler to
> deal with the error.
>
> The filesystem will try to recover the corrupted data if possible.
>

Let's move this change to the patch that needs it; this patch does not
do anything on its own.

> Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx>
> ---
> include/linux/memremap.h | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 45a79da89c5f..473fe18c516a 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -87,6 +87,14 @@ struct dev_pagemap_ops {
> * the page back to a CPU accessible page.
> */
> vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
> +
> + /*
> + * Handle a memory failure that happens on one page. Notify the processes
> + * that are using this page, and try to recover the data on this page
> + * if necessary.
> + */

I thought we discussed that this needed to be range based here:

https://lore.kernel.org/r/CAPcyv4jhUU3NVD8HLZnJzir+SugB6LnnrgJZ-jP45BZrbJ1dJQ@xxxxxxxxxxxxxx

...but also incorporate Christoph's feedback to not use notifiers.

> + int (*memory_failure)(struct dev_pagemap *pgmap, unsigned long pfn,
> + int flags);

Change this callback to

    int (*notify_memory_failure)(struct dev_pagemap *pgmap,
                                 unsigned long pfn, unsigned long nr_pfns);

...to pass a range and to clarify that this callback is for
memory_failure() to notify the pgmap; the pgmap then notifies the owner
via the holder callbacks.
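
To make the suggested flow concrete, here is a rough userspace sketch of
the two-step notification: memory_failure() hands a pfn range to the
pgmap, and the pgmap forwards it to its owner. Only the
notify_memory_failure() signature comes from this message. Everything
else is an assumption for illustration: the cut-down struct dev_pagemap,
struct holder_ops and its notify_failure() hook, the demo_* helpers and
mock_memory_failure() are hypothetical names, not existing kernel
interfaces.

```c
/* Userspace mock-up; these structs are stand-ins, not the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dev_pagemap;

struct dev_pagemap_ops {
	/* Suggested signature: a pfn range instead of one pfn plus flags. */
	int (*notify_memory_failure)(struct dev_pagemap *pgmap,
				     unsigned long pfn, unsigned long nr_pfns);
};

/* Hypothetical holder hook: how a pgmap could forward to its owner. */
struct holder_ops {
	int (*notify_failure)(void *holder, unsigned long pfn,
			      unsigned long nr_pfns);
};

/* Cut-down stand-in carrying only what this sketch needs. */
struct dev_pagemap {
	const struct dev_pagemap_ops *ops;
	const struct holder_ops *holder_ops;
	void *holder;
};

/* Example owner state: remembers the last range it was told about. */
struct demo_holder {
	unsigned long first_pfn;
	unsigned long nr_pfns;
	int calls;
};

/* Step 2: the owner (e.g. a filesystem) receives the failed range. */
static int demo_holder_notify(void *holder, unsigned long pfn,
			      unsigned long nr_pfns)
{
	struct demo_holder *h = holder;

	h->first_pfn = pfn;
	h->nr_pfns = nr_pfns;
	h->calls++;
	return 0;
}

/* Step 1: the pgmap is notified and passes the range to its holder. */
static int demo_pgmap_notify(struct dev_pagemap *pgmap, unsigned long pfn,
			     unsigned long nr_pfns)
{
	assert(nr_pfns > 0);	/* an empty range makes no sense here */
	if (!pgmap->holder_ops || !pgmap->holder_ops->notify_failure)
		return -EOPNOTSUPP;
	return pgmap->holder_ops->notify_failure(pgmap->holder, pfn, nr_pfns);
}

static const struct dev_pagemap_ops demo_pgmap_ops = {
	.notify_memory_failure = demo_pgmap_notify,
};

static const struct holder_ops demo_holder_ops = {
	.notify_failure = demo_holder_notify,
};

/* Stand-in for the memory_failure() side: call the pgmap op if present. */
int mock_memory_failure(struct dev_pagemap *pgmap, unsigned long pfn,
			unsigned long nr_pfns)
{
	if (!pgmap->ops || !pgmap->ops->notify_memory_failure)
		return -EOPNOTSUPP;
	return pgmap->ops->notify_memory_failure(pgmap, pfn, nr_pfns);
}
```

A caller would fill in a struct dev_pagemap with demo_pgmap_ops,
demo_holder_ops and a struct demo_holder, then call
mock_memory_failure(&pgmap, pfn, nr_pfns). A pgmap without ops gets
-EOPNOTSUPP in this sketch; what the real fallback should be is a
separate question.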