Re: [PATCH v4 17/21] mm/mmap: Drop arch_unmap() call from all archs

From: LEROY Christophe
Date: Thu Jul 11 2024 - 04:30:09 EST




Le 11/07/2024 à 01:26, Liam R. Howlett a écrit :
> * LEROY Christophe <christophe.leroy2@xxxxxxxxxxxxxxxxxx> [240710 17:02]:
>>
>>
>> Le 10/07/2024 à 21:22, Liam R. Howlett a écrit :
>>> From: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
>>>
>>> The arch_unmap call was previously moved above the rbtree modifications
>>> in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
>>> corruption"). The move was motivated by an issue with calling
>>> arch_unmap() after the rbtree was modified.
>>>
>>> Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
>>> ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
>>> arch_unmap() prior to modifying the vma tree no longer exists
>>> (regardless of rbtree or maple tree implementations).
>>>
>>> Furthermore, the powerpc implementation is also no longer needed as per
>>> [1] and [2]. So the arch_unmap() function can be completely removed.
>>
>> I'm not sure to understand. Is it replaced by something else ?
>> We wanted to get rid of arch_unmap() but it was supposed to be replaced
>> by some core function because the functionnality itself is still
>> required and indeed all the discussion around [2] demonstrated that not
>> only powerpc but at least arm and probably others needed to properly
>> clean-up reference to VDSO mappings on unmapping.
>>
>> So as mentioned by Michael you can't just drop that without replacing it
>> by something else. We need the VDSO signal handling to properly fallback
>> on stack-based trampoline when the VDSO trampoline gets mapped out.
>
> I'll address this after the part I missed..

After ? What do you mean ? It needs to be addressed _before_ removing
arch_unmap()

>
>>
>> Or did I miss something ?
>>
>
> I think I missed something in regards to what you need in ppc.

It is not only powerpc. Powerpc is the only one doing it at the moment
but investigation has demonstrated that other architectures are affected.

>
> From what I understand, other platforms still map and use the vdso
> (context.vdso is set), but unmap_arch() does nothing. It is only the
> powerpc version that clears the vdso pointer if it is unmapped.

Yes on powerpc it works. On other platforms like arm it segfaults so it
should be fixed
(https://lore.kernel.org/lkml/87imd5h5kb.fsf@xxxxxxxxxxxxxxxxxx/)

Could be fixed by properly implementing arch_unmap() on every arch, or
carry-on with Dmitry's series.

>
> git grep -w arch_unmap shows:
> arch/powerpc/include/asm/mmu_context.h
> arch/x86/include/asm/mmu_context.h
> include/asm-generic/mm_hooks.h
> mm/mmap.c
>
> The generic and x86 versions are empty.
>
> From the patch set you referenced, we see changes related to the files
> modified, but I don't think any of them did anything with unmap_arch().

In the v3 series from Dmitry, [PATCH v3 16/23] mm: Add vdso_base in
mm_struct
(https://lore.kernel.org/all/20210611180242.711399-17-dima@xxxxxxxxxx/)
it is done via special_mapping_close()

>
> arm: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all architectures")
> arm64: d0fba04847ae ("arm64: vdso: Use generic union vdso_data_store")
> mips: d697a9997a0d ("MIPS: vdso: Use generic union vdso_data_store")
> s390: cb3444cfdb48 ("s390/vdso: Use generic union vdso_data_store")
> riscv: eba755314fa7 ("riscv: vdso: Use generic union vdso_data_store")
>
> ia64 is dead
> nds32 is dead
> hexagon has a bunch of vdso work in the logs as well.
>
> There is also a6c19dfe3994 ("arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate area")
>
> I do not see sparc changing away from what the patches were doing, but
> again, the arch_unmap() seems to do nothing there as well.
>
> So, what I was looking to do is to avoid a call to arch specific
> functions that does nothing but set the vdso pointer to NULL for
> powerpc.

That's what is doing Dmitry's series, removing arch_unmap() and replace
it with core handling. The advantage being that it addresses it for all
affected architectures, improving the current situation.

>
> The thread referenced in the git bug [1] seems to indicate this is for
> CRIU unmapping/restoring a task, but CRIU now just moves the vdso
> mapping (or just works on ppc at this point?). Since [2] hasn't landed,
> isn't this still broken for CRIU on powerpc as it is?
>
> So, are we keeping the unmap_arch() function around, which has errors
> that were never fixed, for a single application that utilizes a newer
> method of moving the vdso anyways?

Again, we want to remove arch_unmap() but we want the core-mm to handle
it instead.

>
> On the note of CRIU, it seems it cannot handle tasks which don't have
> the vdso mapped anymore [3], so setting it to NULL is probably a bad
> plan even for that one application?

But as mentioned by Dmitry it is not only CRIU. There has also been
issues with Valgrind.

>
>
> So, I think this just leaves the fallback when the VDSO is unmapped..
> Well, it seems what people have been doing is unmap the vdso to stop
> these functions from working [4]. At least this is what some users are
> doing. The ability to replace this vma with a guard vma leads me to
> believe that other archs don't fall back at all - please correct me if
> I'm wrong!
>
> I also cannot find any reference to other archs clearing the
> context.vdso (aside from failures in __setup_additional_pages).
>
> But maybe I don't fully understand how this works?

I think you fully understand that it doesn't work as it is except on
powerpc. Again the goal should be to make it work on all architectures.

Thanks
Christophe

>
> Thanks,
> Liam
>
>
> [1] https://lore.kernel.org/lkml/87d0lht1c0.fsf@xxxxxxxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/lkml/9c2b2826-4083-fc9c-5a4d-c101858dd560@xxxxxxxxxxxxxxxxxx/
> [3] https://github.com/checkpoint-restore/criu/issues/488
> [4] https://github.com/insanitybit/void-ship
>
> Thanks,
> Liam
>
>