Re: [BUG] WARNING in workingset_activation triggered by KVM page fault path on Linux 7.0.0-08391-g1d51b370a0f8

From: Shakeel Butt

Date: Wed Apr 22 2026 - 11:54:30 EST


On Wed, Apr 22, 2026 at 06:01:44AM -0700, Sean Christopherson wrote:
> On Wed, Apr 22, 2026, David Hildenbrand (Arm) wrote:
> > On 4/22/26 04:06, Zw Tang wrote:
> > > Hi David,
> > >
> > > Thanks for pointing this out.
> > >
> > > You are right. The commit id I sent was incorrect. I mistakenly used the
> > > git describe-style suffix g1d51b370a0f8, but the actual git commit is:
> > >
> > > 1d51b370a0f8f642f4fc84c795fbedac0fcdbbd2
> > >
> > > The short commit id is:
> > >
> > > 1d51b370a0f8
> > >
> > > Sorry for the confusion.
> > >
> > > I am also re-checking whether the kernel image was built from a clean tree
> > > and whether there were any local modifications when the crash was reproduced,
> > > so that the reported source line numbers match the exact build.
> >
> > Okay, on that tree include/linux/memcontrol.h:381 points at
> >
> > 	lockdep_assert_once(rcu_read_lock_held() ||
> > 			    lockdep_is_held(&cgroup_mutex));
> >
> > lockdep_is_held() would not trigger a warning like that IIRC, but
> >
> > lockdep_assert_once() does
> >
> > do { WARN_ON_ONCE(debug_locks && !(cond)); } while (0)
> >
> >
> > So likely we are calling obj_cgroup_memcg() without the RCU read lock held?
> >
> >
> > kvm_release_page_clean()->kvm_set_page_accessed()->mark_page_accessed()->folio_mark_accessed()->workingset_activation()
> >
> > ... grabs the RCU lock, though, before calling
> >
> > rcu_read_lock();
> > workingset_age_nonresident(folio_lruvec(folio), folio_nr_pages(folio));
> > rcu_read_unlock();
>
> No? Since commit 906c38ff52e9 ("memcg: workingset: remove folio_memcg_rcu usage"),
> I see:
>
> void workingset_activation(struct folio *folio)
> {
> 	/*
> 	 * Filter non-memcg pages here, e.g. unmap can call
> 	 * mark_page_accessed() on VDSO pages.
> 	 */
> 	if (mem_cgroup_disabled() || folio_memcg_charged(folio))
> 		workingset_age_nonresident(folio_lruvec(folio), folio_nr_pages(folio));
> }
>
> But for the life of me, I can't figure out how obj_cgroup_memcg() is being reached,
> and I haven't been able to reproduce the splat to add instrumentation (though I
> haven't tried very hard).

folio_lruvec() -> folio_memcg() -> obj_cgroup_memcg() if folio_memcg_kmem()
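
To make that chain concrete, here is a rough userspace model of it. This is
only a sketch and not the kernel code: every model_* name, the struct layouts,
and the rcu_depth / warn_count counters are made up for illustration, and only
the rcu_read_lock_held() half of the assertion is modelled (the cgroup_mutex
half is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (assumptions); none of these match the kernel's layouts. */
struct mem_cgroup { int id; };
struct obj_cgroup { struct mem_cgroup *memcg; };

struct folio {
	bool kmem;			/* models folio_memcg_kmem() */
	struct mem_cgroup *memcg;	/* used for non-kmem folios */
	struct obj_cgroup *objcg;	/* used for kmem folios */
};

static int rcu_depth;	/* models rcu_read_lock() nesting */
static int warn_count;	/* counts WARN_ON_ONCE-style hits */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }

/* Models obj_cgroup_memcg(): warns once if called outside the RCU section. */
static struct mem_cgroup *model_obj_cgroup_memcg(struct obj_cgroup *objcg)
{
	static bool warned;

	assert(objcg != NULL);
	if (rcu_depth == 0 && !warned) {
		warned = true;
		warn_count++;
	}
	return objcg->memcg;
}

/* Models folio_memcg(): only kmem folios take the obj_cgroup path. */
static struct mem_cgroup *model_folio_memcg(struct folio *folio)
{
	if (folio->kmem)
		return model_obj_cgroup_memcg(folio->objcg);
	return folio->memcg;
}
```

In this model, calling model_folio_memcg() on a folio with kmem set between
model_rcu_read_lock() and model_rcu_read_unlock() leaves warn_count at 0; the
same call outside that section bumps it to 1, and a folio with kmem clear never
reaches the check at all.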

How is the given folio (page) allocated?