Re: [PATCH] mm/page_alloc: fix deadlock on cpu_hotplug_lock in __accept_page()

From: Dave Hansen
Date: Mon Mar 31 2025 - 15:07:19 EST


On 3/29/25 10:10, Kirill A. Shutemov wrote:
> +		if (system_wq)
> +			schedule_work(&zone->unaccepted_cleanup);
> +		else
> +			unaccepted_cleanup_work(&zone->unaccepted_cleanup);
> +	}
> }

The 'system_wq' check seems like an awfully big hack. I can't find any
other schedule_work() user in the tree that does anything similar.

Instead of checking workqueue-internal state, could you use
'system_state', like:

	if (system_state == SYSTEM_BOOTING)
		unaccepted_cleanup_work(&zone->unaccepted_cleanup);
	else
		schedule_work(&zone->unaccepted_cleanup);

The other option would be to make it more opportunistic. Basically,
detect when it might deadlock and back off:

	static bool try_to_dec(void)
	{
		/* Someone may hold cpu_hotplug_lock for write; don't block. */
		if (!cpus_read_trylock())
			return false;

		static_branch_dec_cpuslocked(&zones_with_unaccepted_pages);
		cpus_read_unlock();

		return true;
	}

That still requires a bit in the zone to say whether the
static_branch_dec() was deferred or not, though. It's kinda open-coding
schedule_work().
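
To make the "bit in the zone" idea concrete, here is a rough userspace
model of it, not kernel code. Everything in it is a stand-in I made up
for illustration: 'hotplug_write_held' and model_cpus_read_trylock()
replace cpu_hotplug_lock and cpus_read_trylock(), a plain counter
replaces the static key, and 'dec_deferred' is the hypothetical per-zone
bit. It only shows the bookkeeping, not where the retry would be called
from.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for "somebody holds cpu_hotplug_lock for write". */
static bool hotplug_write_held;

/* Stand-in for the zones_with_unaccepted_pages static key count. */
static int zones_with_unaccepted = 1;

struct zone_model {
	bool dec_deferred;	/* hypothetical: a decrement is still owed */
};

/* Models cpus_read_trylock(): fails instead of blocking on a writer. */
static bool model_cpus_read_trylock(void)
{
	return !hotplug_write_held;
}

/* Try the decrement now; if the lock is busy, remember that it is owed. */
static bool try_to_dec(struct zone_model *zone)
{
	if (!model_cpus_read_trylock()) {
		zone->dec_deferred = true;
		return false;
	}

	assert(zones_with_unaccepted > 0);
	zones_with_unaccepted--;
	zone->dec_deferred = false;
	/* The real code would drop the read lock here. */

	return true;
}

/* Called later from some context where the lock can be taken safely. */
static void retry_deferred_dec(struct zone_model *zone)
{
	if (zone->dec_deferred)
		try_to_dec(zone);
}
```

The retry hook is the part that ends up looking like an open-coded
schedule_work(): something still has to come back later and call it.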