Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

From: Michael Ellerman
Date: Wed Aug 26 2020 - 10:07:41 EST


Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes:
> Le 04/08/2020 à 13:17, Christophe Leroy a écrit :
>> On 07/16/2020 02:59 AM, Michael Ellerman wrote:
>>> Christophe Leroy <christophe.leroy@xxxxxx> writes:
>>>> The VDSO datapage and the text pages are always located immediately
>>>> next to each other, so it can be hardcoded without an indirection
>>>> through __kernel_datapage_offset
>>>>
>>>> In order to ease things, move the data page in front like other
>>>> arches, that way there is no need to know the size of the library
>>>> to locate the data page.
>>>>
> [...]
>
>>>
>>> I merged this but then realised it breaks the display of the vdso in
>>> /proc/self/maps.
>>>
>>> ie. the vdso vma gets no name:
>>>
>>>    # cat /proc/self/maps
>
> [...]
>
>>> And it's also going to break the logic in arch_unmap() to detect if
>>> we're unmapping (part of) the VDSO. And it will break arch_remap() too.
>>>
>>> And the logic to recognise the signal trampoline in
>>> arch/powerpc/perf/callchain_*.c as well.
>>
>> I don't think it breaks that one, because ->vdsobase is still the start
>> of text.
>>
>>>
>>> So I'm going to rebase and drop this for now.
>>>
>>> Basically we have a bunch of places that assume that vdso_base is == the
>>> start of the VDSO vma, and also that the code starts there. So that will
>>> need some work to tease out all those assumptions and make them work
>>> with this change.
>>
>> Ok, one day I need to look at it in more details and see how other
>> architectures handle it etc ...
>>
>
> I just sent out a series which switches powerpc to the new
> _install_special_mapping() API, the one powerpc uses being deprecated
> since commit a62c34bd2a8a ("x86, mm: Improve _install_special_mapping
> and fix x86 vdso naming")
>
> arch_remap() gets replaced by vdso_remap()
>
> For arch_unmap(), I'm wondering how/what other architectures do, because
> powerpc seems to be the only one to erase the vdso context pointer when
> unmapping the vdso.

Yeah. The original unmap/remap stuff was added for CRIU, which I thought
people tested on other architectures (more than powerpc even).

Possibly no one really cares about vdso unmap though, vs just moving the
vdso.

We added a test for vdso unmap recently because it happened to trigger a
KAUP failure, and someone actually hit it & reported it.

Running that test on arm64 segfaults:

# ./sigreturn_vdso
VDSO is at 0xffff8191f000-0xffff8191ffff (4096 bytes)
Signal delivered OK with VDSO mapped
VDSO moved to 0xffff8191a000-0xffff8191afff (4096 bytes)
Signal delivered OK with VDSO moved
Unmapped VDSO
Remapped the stack executable
[ 48.556191] potentially unexpected fatal signal 11.
[ 48.556752] CPU: 0 PID: 140 Comm: sigreturn_vdso Not tainted 5.9.0-rc2-00057-g2ac69819ba9e #190
[ 48.556990] Hardware name: linux,dummy-virt (DT)
[ 48.557336] pstate: 60001000 (nZCv daif -PAN -UAO BTYPE=--)
[ 48.557475] pc : 0000ffff8191a7bc
[ 48.557603] lr : 0000ffff8191a7bc
[ 48.557697] sp : 0000ffffc13c9e90
[ 48.557873] x29: 0000ffffc13cb0e0 x28: 0000000000000000
[ 48.558201] x27: 0000000000000000 x26: 0000000000000000
[ 48.558337] x25: 0000000000000000 x24: 0000000000000000
[ 48.558754] x23: 0000000000000000 x22: 0000000000000000
[ 48.558893] x21: 00000000004009b0 x20: 0000000000000000
[ 48.559046] x19: 0000000000400ff0 x18: 0000000000000000
[ 48.559180] x17: 0000ffff817da300 x16: 0000000000412010
[ 48.559312] x15: 0000000000000000 x14: 000000000000001c
[ 48.559443] x13: 656c626174756365 x12: 7865206b63617473
[ 48.559625] x11: 0000000000000003 x10: 0101010101010101
[ 48.559828] x9 : 0000ffff818afda8 x8 : 0000000000000081
[ 48.559973] x7 : 6174732065687420 x6 : 64657070616d6552
[ 48.560115] x5 : 000000000e0388bd x4 : 000000000040135d
[ 48.560270] x3 : 0000000000000000 x2 : 0000000000000001
[ 48.560412] x1 : 0000000000000003 x0 : 00000000004120b8
Segmentation fault
#

So I think we need to keep the unmap hook. Maybe it should be handled by
the special_mapping stuff generically.

cheers