Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

From: Andy Lutomirski
Date: Mon Jul 30 2018 - 18:00:47 EST

> On Jul 30, 2018, at 2:46 PM, Rik van Riel <riel@xxxxxxxxxxx> wrote:
>
>> On Mon, 2018-07-30 at 12:49 -0700, Andy Lutomirski wrote:
>>
>>
>> I think it's a big step in the right direction, but it still makes me
>> nervous. I'd be more comfortable with it if you at least had a
>> functional set of patches that result in active_mm being gone, because
>> that will mean that you actually audited the whole mess and fixed
>> anything that might rely on active_mm pointing somewhere or that might
>> be putting a value you didn't take into account into active_mm. IOW
>> I'm not totally thrilled by applying the patches as is if we're still
>> a bit unsure as to what might have gotten missed.
>>
>> I don't think it's at all necessary to redo the patches.
>>
>> Does that seem reasonable?
>
> Absolutely. I tried to keep ->active_mm very similar
> to before for exactly that reason.
>
> Let's go through all the places where it is used, in
> x86 and architecture independent code. I have not
> checked other architectures.
>
> It looks like we should be able to get rid of
> ->active_mm at some point, but a lot of it depends
> on other architecture maintainers.
>
>
> arch/x86/events/core.c:
> - get_segment_base: get current->active_mm->context.ldt,
> this appears to be for TIF_IA32 user programs only, so
> we should be able to use current->mm here

->mm sounds more correct anyway

>
> arch/x86/kernel/cpu/common.c:
> - current task's ->active_mm assigned in two places,
> never read
>
> arch/x86/lib/insn-eval.c:
> - get_desc() gets current->active_mm->context.ldt, this
> appears to be only for user space programs

Same as above

>
> arch/x86/mm/tlb.c:
> - this series adds two places where current->active_mm is
> written, it is never read
>
> arch/x86/platform/efi/efi_64.c:
> - current->active_mm is set to efi_mm for a little bit,
> with irqs disabled, and then changed back, with irqs still
> disabled; we should be able to get rid of ->active_mm here
> - in the init code, ->active_mm is set to efi_mm as well,
> presumably the kernel automatically switches that back on
> the next context switch; this may be buggy, since preemption
> is enabled and a GFP_KERNEL allocation is just a few lines
> below

Ick. This should mostly go away soon; most EFI code will move to a real thread.

>
> arch/x86/power/cpu.c:
> - fix_processor_context() calls load_mm_ldt(current->active_mm);,
> we should be able to use cpu_tlbstate.loaded_mm instead

Agreed

>
> drivers/cpufreq/pmac32-cpufreq.c:
> - pmu_set_cpu_speed() restores current->active_mm - don't know if
> anyone still cares about 32 bit PPC :)
>

I think we should only remove active_mm when the new #define is set. So this doesn't need to change.

> drivers/firmware/efi/arm-runtime.c:
> - efi_virtmap_unload switches back the pgd to current->active_mm
> from &efi_mm; that mm could be stored elsewhere if we excised
> ->active_mm everywhere

Ditto

IOW the active_mm refcounting may be genuinely useful for architectures that are not able to efficiently shoot down remote lazy mm references in exit_mmap(). I suspect that ARM64 may be in that category.
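
To make the distinction concrete, here is a rough userspace model of the idea, not kernel code and not what the patches actually do. Every name in it (struct mm, struct task, model_switch_to, arch_shoots_down_lazy_mm) is made up for illustration; the flag merely stands in for the new #define. It only shows that a kernel thread borrowing the previous mm needs a reference when the architecture cannot shoot down lazy references at exit time, and can skip it otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel's structures. */
struct mm {
    int refcount;
};

struct task {
    struct mm *mm;        /* NULL for kernel threads */
    struct mm *active_mm; /* mm whose page tables are currently in use */
};

/*
 * Stand-in for the per-architecture #define discussed above (assumption):
 * true means the arch can shoot down remote lazy mm references when the
 * mm is torn down, so lazy users need not pin the mm with a reference.
 */
static bool arch_shoots_down_lazy_mm = true;

/* Model of the mm bookkeeping done when switching from prev to next. */
static void model_switch_to(struct task *prev, struct task *next)
{
    /* In this model the running task always has an active mm. */
    assert(prev->active_mm != NULL);

    if (next->mm) {
        /* User task: it simply runs on its own mm. */
        next->active_mm = next->mm;
    } else {
        /* Kernel thread: lazily keep using whatever mm prev had loaded. */
        next->active_mm = prev->active_mm;
        if (!arch_shoots_down_lazy_mm)
            next->active_mm->refcount++; /* pin the borrowed mm */
    }

    if (!prev->mm) {
        /* prev was a kernel thread: give back the borrowed mm. */
        if (!arch_shoots_down_lazy_mm)
            prev->active_mm->refcount--;
        prev->active_mm = NULL;
    }
}
```

With the flag false, switching from a user task to a kernel thread raises the mm's count by one and switching back drops it again; with the flag true, the count never moves, which is the case where active_mm itself becomes removable.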