Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio

From: Miaohe Lin
Date: Mon Jan 20 2025 - 01:24:46 EST


On 2025/1/16 14:16, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@xxxxxxxxxx>

Thanks for your patch. Some nits below.

>
> Commit 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to
> TTU_HWPOISON") introduced TTU_HWPOISON to replace TTU_IGNORE_HWPOISON,
> in order to stop sending SIGBUS on later access when a memory error
> hits a clean folio. However, during page migration, anon folios must be
> unmapped with TTU_HWPOISON set in unmap_*(). For pagecache folios we
> need a policy, like the one in hwpoison_user_mappings(), to decide
> whether to set this flag. So move this policy from
> hwpoison_user_mappings() to unmap_poisoned_folio() to handle this
> waring properly.

s/waring/warning/g

>
> Waring will be produced when unmapping a poisoned folio, with the following log:

s/Waring/Warning/g

>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 365 at mm/rmap.c:1847 try_to_unmap_one+0x8fc/0xd3c
> Modules linked in:
> CPU: 1 UID: 0 PID: 365 Comm: bash Tainted: G W 6.13.0-rc1-00018-gacdb4bbda7ab #42
> Tainted: [W]=WARN
> Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : try_to_unmap_one+0x8fc/0xd3c
> lr : try_to_unmap_one+0x3dc/0xd3c
> Call trace:
> try_to_unmap_one+0x8fc/0xd3c (P)
> try_to_unmap_one+0x3dc/0xd3c (L)
> rmap_walk_anon+0xdc/0x1f8
> rmap_walk+0x3c/0x58
> try_to_unmap+0x88/0x90
> unmap_poisoned_folio+0x30/0xa8
> do_migrate_range+0x4a0/0x568
> offline_pages+0x5a4/0x670
> memory_block_action+0x17c/0x374
> memory_subsys_offline+0x3c/0x78
> device_offline+0xa4/0xd0
> state_store+0x8c/0xf0
> dev_attr_store+0x18/0x2c
> sysfs_kf_write+0x44/0x54
> kernfs_fop_write_iter+0x118/0x1a8
> vfs_write+0x3a8/0x4bc
> ksys_write+0x6c/0xf8
> __arm64_sys_write+0x1c/0x28
> invoke_syscall+0x44/0x100
> el0_svc_common.constprop.0+0x40/0xe0
> do_el0_svc+0x1c/0x28
> el0_svc+0x30/0xd0
> el0t_64_sync_handler+0xc8/0xcc
> el0t_64_sync+0x198/0x19c
> ---[ end trace 0000000000000000 ]---
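
If I read the current caller correctly, the warning fires because
do_migrate_range() unmaps the poisoned folio with only TTU_IGNORE_MLOCK,
so a still-mapped hwpoisoned anon folio reaches try_to_unmap_one()
without TTU_HWPOISON, roughly (illustrative, paraphrased from
memory_hotplug.c rather than quoted from this patch):

	if (folio_mapped(folio))
		unmap_poisoned_folio(folio, TTU_IGNORE_MLOCK);

Moving the flag policy into unmap_poisoned_folio() itself avoids having
each caller make that decision.
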
>
> Fixes: 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON")
> Signed-off-by: Ma Wupeng <mawupeng1@xxxxxxxxxx>
> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> ---
> mm/internal.h | 5 ++--
> mm/memory-failure.c | 61 +++++++++++++++++++++++----------------------
> mm/memory_hotplug.c | 3 ++-
> 3 files changed, 36 insertions(+), 33 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 9826f7dce607..3caee67c0abd 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1102,7 +1102,7 @@ static inline int find_next_best_node(int node, nodemask_t *used_node_mask)
> * mm/memory-failure.c
> */
> #ifdef CONFIG_MEMORY_FAILURE
> -void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu);
> +int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
> void shake_folio(struct folio *folio);
> extern int hwpoison_filter(struct page *p);
>
> @@ -1125,8 +1125,9 @@ unsigned long page_mapped_in_vma(const struct page *page,
> struct vm_area_struct *vma);
>
> #else
> -static inline void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> +static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
> {
> +	return -EBUSY;
> }
> #endif
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index a7b8ccd29b6f..b5212b6e330a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1556,8 +1556,34 @@ static int get_hwpoison_page(struct page *p, unsigned long flags)
> return ret;
> }
>
> -void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> +int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
> {
> +	enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
> +	struct address_space *mapping;
> +
> +	if (folio_test_swapcache(folio)) {
> +		pr_err("%#lx: keeping poisoned page in swap cache\n", pfn);
> +		ttu &= ~TTU_HWPOISON;
> +	}
> +
> +	/*
> +	 * Propagate the dirty bit from PTEs to struct page first, because we
> +	 * need this to decide if we should kill or just drop the page.
> +	 * XXX: the dirty test could be racy: set_page_dirty() may not always
> +	 * be called inside page lock (it's recommended but not enforced).
> +	 */
> +	mapping = folio_mapping(folio);
> +	if (!must_kill && !folio_test_dirty(folio) && mapping &&
> +	    mapping_can_writeback(mapping)) {
> +		if (folio_mkclean(folio)) {
> +			folio_set_dirty(folio);
> +		} else {
> +			ttu &= ~TTU_HWPOISON;
> +			pr_info("%#lx: corrupted page was clean: dropped without side effects\n",
> +				pfn);
> +		}
> +	}
> +
> 	if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
> 		struct address_space *mapping;
>
> @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> 		if (!mapping) {
> 			pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
> 				folio_pfn(folio));
> -			return;
> +			return -EBUSY;
> 		}
>
> 		try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> 	} else {
> 		try_to_unmap(folio, ttu);
> 	}
> +
> +	return folio_mapped(folio) ? -EBUSY : 0;

Do we really need this return value? It's unused in do_migrate_range().
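
If the return value is wanted, I would expect do_migrate_range() to act
on it, e.g. (untested, purely illustrative):

	if (folio_mapped(folio)) {
		/* illustrative only: check the new return value */
		if (unmap_poisoned_folio(folio, pfn, false))
			pr_warn("%#lx: failed to unmap poisoned folio\n", pfn);
	}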

Thanks.
.