Re: [PATCH 1/1] mm/page_alloc: Leave IRQs enabled for per-cpu page allocations

From: Yu Zhao
Date: Mon Oct 10 2022 - 18:09:58 EST


On Mon, Oct 10, 2022 at 2:45 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 10/10/22 16:22, Mel Gorman wrote:
> > On Wed, Aug 24, 2022 at 10:58:26PM -0600, Yu Zhao wrote:
> >> On Wed, Aug 24, 2022 at 8:18 AM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>> The pcp_spin_lock_irqsave protecting the PCP lists is IRQ-safe as a task
> >>> allocating from the PCP must not re-enter the allocator from IRQ context.
> >>> In each instance where IRQ-reentrancy is possible, the lock is acquired using
> >>> pcp_spin_trylock_irqsave() even though IRQs are disabled and re-entrancy
> >>> is impossible.
> >>>
> >>> Demote the lock to pcp_spin_lock avoids an IRQ disable/enable in the common
> >>> case at the cost of some IRQ allocations taking a slower path. If the PCP
> >>> lists need to be refilled, the zone lock still needs to disable IRQs but
> >>> that will only happen on PCP refill and drain. If an IRQ is raised when
> >>> a PCP allocation is in progress, the trylock will fail and fallback to
> >>> using the buddy lists directly. Note that this may not be a universal win
> >>> if an interrupt-intensive workload also allocates heavily from interrupt
> >>> context and contends heavily on the zone->lock as a result.
> >>
> >> Hi,
> >>
> >> This patch caused the following warning. Please take a look.
> >>
> >> Thanks.
> >>
> >> WARNING: inconsistent lock state
> >> 6.0.0-dbg-DEV #1 Tainted: G S W O
> >> --------------------------------
> >
> > I finally found time to take a closer look at this and I cannot reproduce
> > it against 6.0. What workload triggered the warning, on what platform and
> > can you post the kernel config used please? It would also help if you
> > can remember what git commit the patch was tested upon.
> >
> > Thanks and sorry for the long delay.
>
> I didn't (try to) reproduce this, but FWIW the report looked legit to
> me, as after the patch, pcp_spin_trylock() has to be used for both
> allocation and freeing to be IRQ safe. free_unref_page() uses it, so
> it's fine. But as the stack trace in the report shows,
> free_unref_page_list() does pcp_spin_lock() and not _trylock, and that's
> IMHO the problem.

If this is not the case, please let me know and I'll try repro again.