Re: [RFC PATCH v8 09/10] context_tracking,x86: Defer kernel text patching IPIs when tracking CR3 switches

From: Valentin Schneider

Date: Wed Apr 22 2026 - 10:59:55 EST


On 15/04/26 14:11, Frederic Weisbecker wrote:
> On Tue, Mar 24, 2026 at 10:48:00AM +0100, Valentin Schneider wrote:
>> @@ -2706,11 +2708,29 @@ static void do_sync_core(void *info)
>>  	sync_core();
>>  }
>>
>> +static void __smp_text_poke_sync_each_cpu(smp_cond_func_t cond_func)
>> +{
>> +	on_each_cpu_cond(cond_func, do_sync_core, NULL, 1);
>> +}
>> +
>>  void smp_text_poke_sync_each_cpu(void)
>>  {
>> -	on_each_cpu(do_sync_core, NULL, 1);
>> +	__smp_text_poke_sync_each_cpu(NULL);
>> +}
>> +
>> +#ifdef CONFIG_TRACK_CR3
>> +static bool do_sync_core_defer_cond(int cpu, void *info)
>> +{
>> +	return housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE) ||
>> +	       per_cpu(kernel_cr3_loaded, cpu);
>
> || should be && ?
>

You almost got me there, but I think not :-)

cond_func() must return true for CPUs that need the IPI. In this case:
- the IPI is always sent to housekeeping CPUs
- the IPI is sent to isolated CPUs only if they have the kernel CR3 loaded
  (IOW they're accessing kernel stuff, no longer pure userspace faff).
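
To make the || vs && question concrete, here is a minimal userspace sketch
of the predicate. struct cpu_state and the needs_ipi_*() helpers are
hypothetical stand-ins for housekeeping_cpu() and the per-CPU
kernel_cr3_loaded flag, not the actual kernel code. With &&, the first
operand is false for every isolated CPU, so an isolated CPU would never get
the IPI, even while it is running kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two inputs the predicate reads. */
struct cpu_state {
	bool housekeeping;      /* models housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE) */
	bool kernel_cr3_loaded; /* models per_cpu(kernel_cr3_loaded, cpu) */
};

/* As in the patch: returning true means "send the IPI to this CPU". */
static bool needs_ipi_or(struct cpu_state s)
{
	return s.housekeeping || s.kernel_cr3_loaded;
}

/* The && variant from the review question, for comparison only. */
static bool needs_ipi_and(struct cpu_state s)
{
	return s.housekeeping && s.kernel_cr3_loaded;
}

/* Walk the cases discussed above. */
static void check_predicate(void)
{
	struct cpu_state hk         = { .housekeeping = true,  .kernel_cr3_loaded = false };
	struct cpu_state iso_kernel = { .housekeeping = false, .kernel_cr3_loaded = true  };
	struct cpu_state iso_user   = { .housekeeping = false, .kernel_cr3_loaded = false };

	assert(needs_ipi_or(hk));         /* housekeeping: IPI regardless of the flag */
	assert(needs_ipi_or(iso_kernel)); /* isolated, kernel CR3 loaded: IPI */
	assert(!needs_ipi_or(iso_user));  /* isolated, in userspace: IPI deferred */

	/* && would skip an isolated CPU that is executing kernel code. */
	assert(!needs_ipi_and(iso_kernel));
}
```
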

> Also I would again expect full ordering here with an smp_mb() before the
> check. So that:
>
> CPU 0                        CPU 1
> -----                        -----
> //enter_kernel               //do_sync_core_defer_cond
> kernel_cr3_loaded = 1        WRITE page table
> smp_mb()                     smp_mb()
> WRITE cr3                    READ kernel_cr3_loaded
>
> But I'm not sure if that ordering is enough to imply that if CPU 1 observes
> kernel_cr3_loaded == 0, then subsequent CPU 0 entering the kernel is guaranteed
> to flush the TLB with the latest page table write.
>

For the TLB flush case that'll be flush_tlb_kernel_cond(), but the same
logic applies. And now that you point it out, I think the smp_mb() (or a
LOCK prefix on the asm part) you're suggesting should be enough, but let me
ponder on this some more.
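
For reference while pondering, the pattern in the diagram is the classic
store-buffering shape, which can be sketched with C11 atomics as below. This
is a model under loud assumptions, not kernel code: page_table and the
function names are made up for illustration, the seq_cst fences stand in for
smp_mb(), and "WRITE cr3" is modeled as a plain load of page_table. Whether
the hardware page walker (or instruction fetch) after a real CR3 write
behaves like that load is exactly the part still in question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int kernel_cr3_loaded; /* models the per-CPU flag */
static atomic_int page_table;        /* models the page table (or text) update */

/* CPU 0, kernel entry: publish the flag, full barrier, then "switch CR3".
 * ASSUMPTION: the CR3 switch is modeled as a read of page_table. */
static int cpu0_enter_kernel(void)
{
	atomic_store_explicit(&kernel_cr3_loaded, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
	return atomic_load_explicit(&page_table, memory_order_relaxed);
}

/* CPU 1: make the update, full barrier, then decide whether to send the IPI. */
static bool cpu1_update_and_check(void)
{
	atomic_store_explicit(&page_table, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
	return atomic_load_explicit(&kernel_cr3_loaded, memory_order_relaxed) != 0;
}

static void reset_model(void)
{
	atomic_store(&kernel_cr3_loaded, 0);
	atomic_store(&page_table, 0);
}

/* Single-threaded replay of the two sequential orders. With both fences in
 * place, the outcome "CPU 1 skips the IPI AND CPU 0 sees the stale value"
 * is the one the barriers are meant to rule out. */
static void replay_both_orders(void)
{
	reset_model();
	bool ipi = cpu1_update_and_check(); /* CPU 1 runs first */
	int seen = cpu0_enter_kernel();
	assert(ipi || seen == 1);

	reset_model();
	seen = cpu0_enter_kernel();         /* CPU 0 runs first */
	ipi = cpu1_update_and_check();
	assert(ipi || seen == 1);
}
```

The replay only exercises the two non-overlapping orders; the concurrent
interleavings are what the fences are there for, and a litmus tool would be
the proper way to check those.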

> Thoughts?
>
> Thanks.
>
> --
> Frederic Weisbecker
> SUSE Labs