Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

From: Marcelo Tosatti
Date: Thu Apr 06 2023 - 09:17:44 EST


On Wed, Apr 05, 2023 at 09:52:26PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 05, 2023 at 04:45:32PM -0300, Marcelo Tosatti wrote:
> > On Wed, Apr 05, 2023 at 01:10:07PM +0200, Frederic Weisbecker wrote:
> > > On Wed, Apr 05, 2023 at 12:44:04PM +0200, Frederic Weisbecker wrote:
> > > > On Tue, Apr 04, 2023 at 04:42:24PM +0300, Yair Podemsky wrote:
> > > > > + int state = atomic_read(&ct->state);
> > > > > + /* will return true only for cpus in kernel space */
> > > > > + return (state & CT_STATE_MASK) == CONTEXT_KERNEL;
> > > > > +}
> > > >
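
A note on the mask test in the hunk above: in C, `==` binds tighter than `&`, so a check of this shape needs parentheses around the `&` operand. A minimal userspace sketch of the difference follows. The enum values are illustrative assumptions, not the kernel's definitions, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the kernel's real definitions may differ. */
enum {
	CONTEXT_KERNEL = 0,
	CONTEXT_USER   = 2,
	CT_STATE_MASK  = 3,
};

/* How C parses "state & CT_STATE_MASK == CONTEXT_KERNEL". */
static bool in_kernel_unparenthesized(int state)
{
	return state & (CT_STATE_MASK == CONTEXT_KERNEL);
}

/* Mask first, then compare against the context value. */
static bool in_kernel_parenthesized(int state)
{
	return (state & CT_STATE_MASK) == CONTEXT_KERNEL;
}

/* Small self-check showing where the two forms disagree. */
static void check_precedence(void)
{
	assert(!in_kernel_unparenthesized(CONTEXT_KERNEL));
	assert(in_kernel_parenthesized(CONTEXT_KERNEL));
	assert(!in_kernel_parenthesized(CONTEXT_USER));
}
```

With these assumed values the unparenthesized form evaluates to false for every state, because the comparison yields 0 before the `&` is applied.
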
> > > > Also note that this doesn't strictly prevent userspace from being interrupted.
> > > > You may well observe the CPU in kernel but it may receive the IPI later after
> > > > switching to userspace.
> > > >
> > > > We could avoid that by marking ct->state with a pending work bit to flush
> > > > upon user entry/exit, but that's a bit more overhead, so I first need to
> > > > know your expectations here, i.e.: can you tolerate such an occasional
> > > > interruption or not?
> > >
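
For readers following along, the "pending work bit" idea could look roughly like the userspace sketch below, written with C11 atomics. Everything in it is hypothetical: the struct, the bit layout and the function names are assumptions for illustration and are not the kernel's context tracking API. It also leaves out how the requester would learn that the deferred work has actually run.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical encoding: low two bits = context, bit 2 = deferred work. */
enum {
	CTX_KERNEL       = 0,
	CTX_USER         = 2,
	CTX_MASK         = 3,
	CTX_WORK_PENDING = 4,
};

struct cpu_ctx {
	atomic_int state;
	int flushes;	/* counts how often the deferred work ran */
};

static void cpu_ctx_init(struct cpu_ctx *c, int ctx)
{
	atomic_init(&c->state, ctx);
	c->flushes = 0;
}

/* Remote side: flag the work instead of sending an IPI. */
static void request_sync(struct cpu_ctx *c)
{
	atomic_fetch_or(&c->state, CTX_WORK_PENDING);
}

/* Local side: called on every user entry/exit of the target CPU. */
static void switch_context(struct cpu_ctx *c, int new_ctx)
{
	/* Storing new_ctx also clears the pending bit in one step. */
	int old = atomic_exchange(&c->state, new_ctx);

	if (old & CTX_WORK_PENDING)
		c->flushes++;	/* stands in for the deferred flush */
}

static int current_context(struct cpu_ctx *c)
{
	return atomic_load(&c->state) & CTX_MASK;
}
```

The atomic exchange on every transition is where the extra overhead mentioned above would come from in a design like this.
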
> > > Bah, actually what can we do to prevent that racy IPI? Not much I fear...
> >
> > Use a mechanism other than an IPI to ensure that any in-progress
> > __get_free_pages_fast() has finished execution.
> >
> > Isn't this codepath enough of a slow path that it can use
> > synchronize_rcu_expedited()?
>
> To actually hit this path you're doing something really dodgy.

Apparently khugepaged is using the same infrastructure:

$ grep tlb_remove_table khugepaged.c
tlb_remove_table_sync_one();
tlb_remove_table_sync_one();

So just enabling khugepaged will hit that path.