Re: [RFC PATCH 3/3] mm/memory-failure: handle min_order_for_split() error code properly

From: Zi Yan

Date: Thu Nov 20 2025 - 10:01:05 EST


On 19 Nov 2025, at 23:45, Balbir Singh wrote:

> On 11/20/25 14:59, Zi Yan wrote:
>> min_order_for_split() returns -EBUSY when the folio is truncated and cannot
>> be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
>> split_huge_page*() target order silently"), memory_failure() does not
>> handle it and passes -EBUSY to try_to_split_thp_page() directly.
>> try_to_split_thp_page() returns -EINVAL since -EBUSY becomes 0xfffffff0 as
>> new_order is unsigned int in __folio_split() and this large new_order is
>> rejected as an invalid input. The code does not cause a bug.
>> soft_offline_in_use_page() also uses min_order_for_split() but it always
>> passes 0 as new_order for split.
>>
>> Handle it properly by checking min_order_for_split() return value and not
>> calling try_to_split_thp_page() if the value is negative. Add a comment
>> in soft_offline_in_use_page() to clarify the possible negative new_order
>> value.
>>
>> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
>> ---
>> mm/memory-failure.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 7f908ad795ad..86582f030159 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -2437,8 +2437,11 @@ int memory_failure(unsigned long pfn, int flags)
>> * or unhandlable page. The refcount is bumped iff the
>> * page is a valid handlable page.
>> */
>> - folio_set_has_hwpoisoned(folio);
>> - err = try_to_split_thp_page(p, new_order, /* release= */ false);
>> + if (new_order >= 0) {
>> + folio_set_has_hwpoisoned(folio);
>
> If new_order < 0, do we skip setting the hwpoisoned bit on the folio?

The bit should be set. Anyway, I am going to take David’s approach to
change min_order_for_split().
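
For reference, the conversion described in the commit message can be
reproduced in userspace. This is only a minimal sketch: order_as_unsigned()
and split_to_order() are hypothetical stand-ins, not kernel functions, and
the upper bound of 31 is an assumed value chosen for illustration. It also
assumes a 32-bit unsigned int and EBUSY == 16, as on Linux.

```c
#include <errno.h>

/*
 * Hypothetical helper: shows what a negative errno looks like once it
 * is stored in an unsigned int. With a 32-bit unsigned int and
 * EBUSY == 16, -EBUSY becomes 0xfffffff0.
 */
static unsigned int order_as_unsigned(int new_order)
{
	return (unsigned int)new_order;
}

/*
 * Hypothetical stand-in for a callee that takes the target order as
 * unsigned int and rejects out-of-range values. The bound below is an
 * assumption for this sketch, not the kernel's actual limit.
 */
static int split_to_order(unsigned int new_order)
{
	const unsigned int assumed_max_order = 31;

	if (new_order > assumed_max_order)
		return -EINVAL;	/* a converted -EBUSY lands here */
	return 0;
}
```

Calling split_to_order(-EBUSY) returns -EINVAL in this sketch, so the
caller loses the original -EBUSY unless it checks for a negative order
before making the call.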

Thanks.

>
>> + err = try_to_split_thp_page(p, new_order, /* release= */ false);
>> + } else
>> + err = new_order;
>> /*
>> * If splitting a folio to order-0 fails, kill the process.
>> * Split the folio regardless to minimize unusable pages.
>> @@ -2779,6 +2782,7 @@ static int soft_offline_in_use_page(struct page *page)
>> /*
>> * If new_order (target split order) is not 0, do not split the
>> * folio at all to retain the still accessible large folio.
>> + * new_order can be -EBUSY, meaning the folio cannot be split.
>> * NOTE: if minimizing the number of soft offline pages is
>> * preferred, split it to non-zero new_order like it is done in
>> * memory_failure().
>
> Balbir


Best Regards,
Yan, Zi