Re: [syzbot] WARNING in follow_hugetlb_page

From: John Hubbard
Date: Fri May 13 2022 - 22:55:16 EST


On 5/13/22 17:56, John Hubbard wrote:
> On 5/13/22 17:26, Minchan Kim wrote:
>> Anything else further can we get insight from the warning?
>>
>> For example, pin_user_pages going on against a hugetlb page
>> which are concurrently running alloc_contig_range(it's
>> exported function so anyone can call randomly) so
>> alloc_contig_range changes pageblock type as MIGRATE_ISOLATE
>> under us so the hit at the warning?
>
> Well, yes. First of all, the comments above the warning that fired have
> gone a little bit stale: they claim that we can only hit the warning if
> the page refcount overflows. However, we almost certainly got here via:
>
> try_grab_folio()
>     /*
>      * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
>      * right zone, so fail and let the caller fall back to the slow
>      * path.
>      */
>     if (unlikely((flags & FOLL_LONGTERM) &&
>              !is_pinnable_page(page))) /* which we just changed */

Specifically, the recent patch effectively acted as an error injection
test, by forcing is_pinnable_page() to return false for nearly every
page (if CONFIG_CMA is defined), which is what sends try_grab_folio()
down the failure branch shown above. The reason:
MIGRATE_CMA|MIGRATE_ISOLATE == 7, which will match any nonzero
MIGRATE_* enum value when checked with bitwise AND.
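
To make the bitwise-AND point concrete, here is a minimal userspace
sketch of the pitfall. It is not the kernel code: the enum, its numeric
values and the demo_* helpers are all hypothetical, and the values are
chosen only so that the OR of the last two comes out to 7 as described
above. It contrasts a mask-style test on a sequentially numbered enum
with the explicit equality test that was presumably intended.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a sequentially numbered enum. The values
 * are illustrative only and are NOT copied from the kernel headers.
 */
enum demo_migratetype {
	DEMO_UNMOVABLE,		/* 0 */
	DEMO_MOVABLE,		/* 1 */
	DEMO_RECLAIMABLE,	/* 2 */
	DEMO_CMA,		/* 3 */
	DEMO_ISOLATE,		/* 4 */
};

/* Buggy test: treats sequential enum values as if they were bit flags. */
static bool demo_blocked_bitwise(enum demo_migratetype mt)
{
	/* The mask is 3 | 4 == 7, so every nonzero value above matches. */
	return (mt & (DEMO_CMA | DEMO_ISOLATE)) != 0;
}

/* Presumably intended test: compare against each value explicitly. */
static bool demo_blocked_equality(enum demo_migratetype mt)
{
	return mt == DEMO_CMA || mt == DEMO_ISOLATE;
}

/* Self-check showing where the two tests agree and disagree. */
static void demo_check(void)
{
	/* Unrelated types are wrongly caught by the mask test... */
	assert(demo_blocked_bitwise(DEMO_MOVABLE));
	assert(demo_blocked_bitwise(DEMO_RECLAIMABLE));
	/* ...but not by the equality test. */
	assert(!demo_blocked_equality(DEMO_MOVABLE));
	assert(!demo_blocked_equality(DEMO_RECLAIMABLE));
	/* Both tests agree on the two types that were meant to match. */
	assert(demo_blocked_bitwise(DEMO_CMA) && demo_blocked_equality(DEMO_CMA));
	assert(demo_blocked_bitwise(DEMO_ISOLATE) && demo_blocked_equality(DEMO_ISOLATE));
	/* Only the zero-valued type slips past the mask test. */
	assert(!demo_blocked_bitwise(DEMO_UNMOVABLE));
}
```

Calling demo_check() from a small test program should pass silently. The
general point is that a mask test only makes sense when each enumerator
occupies its own bit; for a plain counting enum, equality comparisons
are the safer choice.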

I suspect this particular error path has not been exercised much, or if
it has, not reported here anyway. Until now.


thanks,
--
John Hubbard
NVIDIA