Re: [PATCH 1/1] mm/page_alloc: Leave IRQs enabled for per-cpu page allocations

From: Vlastimil Babka
Date: Mon Oct 10 2022 - 16:45:54 EST


On 10/10/22 16:22, Mel Gorman wrote:
On Wed, Aug 24, 2022 at 10:58:26PM -0600, Yu Zhao wrote:
On Wed, Aug 24, 2022 at 8:18 AM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:

The pcp_spin_lock_irqsave protecting the PCP lists is IRQ-safe as a task
allocating from the PCP must not re-enter the allocator from IRQ context.
In each instance where IRQ-reentrancy is possible, the lock is acquired using
pcp_spin_trylock_irqsave() even though IRQs are disabled and re-entrancy
is impossible.

Demote the lock to pcp_spin_lock avoids an IRQ disable/enable in the common
case at the cost of some IRQ allocations taking a slower path. If the PCP
lists need to be refilled, the zone lock still needs to disable IRQs but
that will only happen on PCP refill and drain. If an IRQ is raised when
a PCP allocation is in progress, the trylock will fail and fallback to
using the buddy lists directly. Note that this may not be a universal win
if an interrupt-intensive workload also allocates heavily from interrupt
context and contends heavily on the zone->lock as a result.

Hi,

This patch caused the following warning. Please take a look.

Thanks.

WARNING: inconsistent lock state
6.0.0-dbg-DEV #1 Tainted: G S W O
--------------------------------

I finally found time to take a closer look at this and I cannot reproduce
it against 6.0. What workload triggered the warning, on what platform and
can you post the kernel config used please? It would also help if you
can remember what git commit the patch was tested upon.

Thanks and sorry for the long delay.

I didn't (try to) reproduce this, but FWIW the report looked legit to me, as after the patch, pcp_spin_trylock() has to be used for both allocation and freeing to be IRQ safe. free_unref_page() uses it, so it's fine. But as the stack trace in the report shows, free_unref_page_list() does pcp_spin_lock() and not _trylock, and that's IMHO the problem.