Re: [RESEND PATCH v5 12/12] x86/mm: Enable preemption during flush_tlb_kernel_range
From: Chuyi Zhou
Date: Mon May 25 2026 - 23:25:51 EST
On 2026-05-22 6:48 p.m., Sebastian Andrzej Siewior wrote:
> On 2026-05-13 20:45:24 [+0800], Chuyi Zhou wrote:
>> flush_tlb_kernel_range() is invoked when kernel memory mapping changes.
>> On x86 platforms without the INVLPGB feature enabled, we need to send IPIs
>> to every online CPU and synchronously wait for them to complete
>> do_kernel_range_flush(). This process can be time-consuming due to factors
>> such as a large number of CPUs or other issues (like interrupts being
>> disabled). flush_tlb_kernel_range() always disables preemption, this may
>> affect the scheduling latency of other tasks on the current CPU.
>>
>> Previous patch converted flush_tlb_info from per-cpu variable to on-stack
>> variable. Additionally, it's no longer necessary to explicitly disable
>> preemption before calling smp_call*() since they internally handles the
>> preemption logic. Now it's safe to enable preemption during
>> flush_tlb_kernel_range(). Additionally, in get_flush_tlb_info() use
>> raw_smp_processor_id() to avoid warnings from check_preemption_disabled().
>
> This is a bit odd. That smp_processor_id() is there to catch users with
> enabled CPU migration. The only reason is the accounting done in
> flush_tlb_func(). This raw_smp_processor_id() is only needed in
> flush_tlb_kernel_range() which does not call flush_tlb_func(). This is
> only statistics.
>
> kernel_tlb_flush_all() does not need info at all.
> kernel_tlb_flush_range() needs only start and end.
>
Agreed, I see the concern about changing get_flush_tlb_info() to use
raw_smp_processor_id(). The smp_processor_id() check is useful for the
mm TLB flush paths.

For flush_tlb_kernel_range(), however, the kernel range path only needs
start/end, and the full kernel flush case does not need flush_tlb_info
at all. One possible way to address this is:

- keep get_flush_tlb_info() using smp_processor_id();
- factor the range-to-full-flush decision out of get_flush_tlb_info(),
  so flush_tlb_kernel_range() can reuse that logic without building a
  full flush_tlb_info;
- make flush_tlb_kernel_range() use a small kernel-only range
  descriptor containing only start/end for the range case;
- make kernel_tlb_flush_all() take no flush_tlb_info argument.

That would avoid weakening the smp_processor_id() debug check and make
the kernel range path use only the data it actually needs.
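
To make the shape of that split concrete, here is a rough user-space
sketch, not kernel code and not a proposed patch. struct
kernel_flush_range, range_needs_full_flush(), the ceiling value and the
last_flush/last_range bookkeeping are all hypothetical names invented
for illustration; kernel_tlb_flush_all() and kernel_tlb_flush_range()
are stubbed out here and only record which path was taken.

```c
#include <stdbool.h>

#define PAGE_SHIFT	12
#define TLB_FLUSH_ALL	(~0UL)

/* Stand-in for the tunable flush ceiling; the value is an assumption. */
static unsigned long single_page_flush_ceiling = 33;

/* Hypothetical kernel-only descriptor: just start/end, no CPU or mm state. */
struct kernel_flush_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Hypothetical helper holding the range-to-full-flush decision, so both
 * the mm paths and the kernel range path could share it.
 */
static bool range_needs_full_flush(unsigned long start, unsigned long end,
				   unsigned int stride_shift)
{
	if (end == TLB_FLUSH_ALL)
		return true;
	return ((end - start) >> stride_shift) > single_page_flush_ceiling;
}

/* Bookkeeping for the stubs below, only so the sketch can be checked. */
enum flush_kind { FLUSH_NONE, FLUSH_ALL, FLUSH_RANGE };
static enum flush_kind last_flush = FLUSH_NONE;
static struct kernel_flush_range last_range;

/* Stub: the full flush takes no descriptor at all. */
static void kernel_tlb_flush_all(void)
{
	last_flush = FLUSH_ALL;
}

/* Stub: the range flush sees only start/end. */
static void kernel_tlb_flush_range(const struct kernel_flush_range *range)
{
	last_flush = FLUSH_RANGE;
	last_range = *range;
}

/* No smp_processor_id() is involved anywhere on this path. */
static void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	if (range_needs_full_flush(start, end, PAGE_SHIFT)) {
		kernel_tlb_flush_all();
		return;
	}

	struct kernel_flush_range range = { .start = start, .end = end };
	kernel_tlb_flush_range(&range);
}
```

In this sketch the descriptor lives on the caller's stack and carries
nothing CPU-specific, so the kernel range path never has to touch
get_flush_tlb_info() or its smp_processor_id() check.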

That said, this does add some extra churn to this patch. Since the main
goal of the series is to enable preemption during
flush_tlb_kernel_range(), I am also fine with keeping this series
smaller if the x86 maintainers prefer, and doing the
kernel-range/flush_tlb_info cleanup as a separate follow-up patch.

> Oh well.
>
>> Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
>
> Sebastian