Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

From: Marcelo Tosatti
Date: Wed Apr 05 2023 - 15:46:55 EST


On Wed, Apr 05, 2023 at 12:43:58PM +0200, Frederic Weisbecker wrote:
> On Tue, Apr 04, 2023 at 04:42:24PM +0300, Yair Podemsky wrote:
> > @@ -191,6 +192,20 @@ static void tlb_remove_table_smp_sync(void *arg)
> > /* Simply deliver the interrupt */
> > }
> >
> > +
> > +#ifdef CONFIG_CONTEXT_TRACKING
> > +static bool cpu_in_kernel(int cpu, void *info)
> > +{
> > + struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>
> Like Peter said, an smp_mb() is required here before the read (unless there is
> already one between the page table modification and that ct->state read?).
>
> So that you have this pairing:
>
>
> WRITE page_table WRITE ct->state
> smp_mb() smp_mb() // implied by atomic_fetch_or
> READ ct->state READ page_table
>
> > + int state = atomic_read(&ct->state);
> > + /* will return true only for cpus in kernel space */
> > + return state & CT_STATE_MASK == CONTEXT_KERNEL;
> > +}
>
> Also note that this doesn't stricly prevent userspace from being interrupted.
> You may well observe the CPU in kernel but it may receive the IPI later after
> switching to userspace.
>
> We could arrange for avoiding that with marking ct->state with a pending work bit
> to flush upon user entry/exit but that's a bit more overhead so I first need to
> know about your expectations here, ie: can you tolerate such an occasional
> interruption or not?

Two points:

1) For a virtualized system, the overhead is not only of executing the
IPI but:

VM-exit
run VM-exit code in host
handle IPI
run VM-entry code in host
VM-entry

2) Depends on the application and the definition of "occasional".

For certain types of applications (for example PLC software or
RAN processing), upon occurrence of an event, it is necessary to
complete a certain task in a maximum amount of time (deadline).

One way to express this requirement is with a pair of numbers,
deadline time and execution time, where:

* deadline time: length of time between event and deadline.
* execution time: length of time it takes for processing of event
to occur on a particular hardware platform
(uninterrupted).