Re: [PATCH v4 1/2] arm64: Define Documentation/arm64/tagged-address-abi.txt

From: Szabolcs Nagy
Date: Thu Jun 13 2019 - 11:28:00 EST


On 13/06/2019 12:16, Vincenzo Frascino wrote:
> Hi Szabolcs,
>
> thank you for your review.
>
> On 13/06/2019 11:14, Szabolcs Nagy wrote:
>> On 13/06/2019 10:20, Catalin Marinas wrote:
>>> Hi Szabolcs,
>>>
>>> On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
>>>> On 12/06/2019 15:21, Vincenzo Frascino wrote:
>>>>> +2. ARM64 Tagged Address ABI
>>>>> +---------------------------
>>>>> +
>>>>> +From the kernel syscall interface prospective, we define, for the purposes
>>>> ^^^^^^^^^^^
>>>> perspective
>>>>
>>>>> +of this document, a "valid tagged pointer" as a pointer that either it has
>>>>> +a zero value set in the top byte or it has a non-zero value, it is in memory
>>>>> +ranges privately owned by a userspace process and it is obtained in one of
>>>>> +the following ways:
>>>>> + - mmap() done by the process itself, where either:
>>>>> + * flags = MAP_PRIVATE | MAP_ANONYMOUS
>>>>> + * flags = MAP_PRIVATE and the file descriptor refers to a regular
>>>>> + file or "/dev/zero"
>>>>
>>>> this does not make it clear if MAP_FIXED or other flags are valid
>>>> (there are many map flags i don't know, but at least fixed should work
>>>> and stack/growsdown. i'd expect anything that's not incompatible with
>>>> private|anon to work).
>>>
>>> Just to clarify, this document tries to define the memory ranges from
>>> where tagged addresses can be passed into the kernel in the context
>>> of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN
>>> should not affect this.
>>
>> yes, so either the text should list MAP_* flags that don't affect
>> the pointer tagging semantics or specify private|anon mapping
>> with different wording.
>>
>
> Good point. Could you please propose a wording that would be suitable for this case?

i don't know all the MAP_ magic, but i think it's enough to change
the "flags =" to

* flags have MAP_PRIVATE and MAP_ANONYMOUS set or
* flags have MAP_PRIVATE set and the file descriptor refers to...
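
to illustrate the difference between the two wordings for the anonymous case, here is a minimal sketch. the helper names are hypothetical and this is not kernel code; it only contrasts "flags =" read literally as an exact comparison with "flags have ... set" as a bit test:

```c
#define _DEFAULT_SOURCE /* assumption: glibc, exposes MAP_ANONYMOUS */
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical helper: "flags have MAP_PRIVATE and MAP_ANONYMOUS set".
 * Extra bits such as MAP_FIXED do not change the result. */
static bool flags_have_private_anon(int flags)
{
    int required = MAP_PRIVATE | MAP_ANONYMOUS;
    return (flags & required) == required;
}

/* Hypothetical helper: "flags = MAP_PRIVATE | MAP_ANONYMOUS" read literally.
 * Any additional bit makes the comparison fail. */
static bool flags_equal_private_anon(int flags)
{
    return flags == (MAP_PRIVATE | MAP_ANONYMOUS);
}
```

with these, MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED passes the first check but not the second, which is the ambiguity the "flags =" wording leaves open.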


>>>>> + - a mapping below sbrk(0) done by the process itself
>>>>
>>>> doesn't the mmap rule cover this?
>>>
>>> IIUC it doesn't cover it as that's memory mapped by the kernel
>>> automatically on access vs a pointer returned by mmap(). The statement
>>> above talks about how the address is obtained by the user.
>>
>> ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED)
>> that happens to be below the heap area.
>>
>> i think "below sbrk(0)" is not the best term to use: there
>> may be address range below the heap area that can be mmapped
>> and thus below sbrk(0) and sbrk is a posix api not a linux
>> syscall, the libc can implement it with mmap or whatever.
>>
>> i'm not sure what the right term for 'heap area' is
>> (the address range between syscall(__NR_brk,0) at
>> program startup and its current value?)
>>
>
> I used sbrk(0) with the meaning of "end of the process's data segment" not
> implying that this is a syscall, but just as a useful way to identify the mapping.
> I agree that it is a posix function implemented by libc but when it is used with
> 0 finds the current location of the program break, which can be changed by brk()
> and depending on the new address passed to this syscall can have the effect of
> allocating or deallocating memory.
>
> Will changing sbrk(0) with "end of the process's data segment" make it more clear?

i don't understand the relevance of the *end*
of the data segment.

i'd expect the text to say something about the address
range of the data segment.

i can do

mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);

and it will be below the end of the data segment.
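
a rough sketch of the distinction, under the assumption that the "heap area" is the range between the program break at startup and its current value (that definition is only my guess above, and the helper names are hypothetical):

```c
#define _DEFAULT_SOURCE /* assumption: glibc, exposes sbrk() */
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* "below sbrk(0)" read literally: any address under the current break,
 * including low addresses that were mapped with MAP_FIXED. */
static bool below_current_break(const void *p)
{
    return (uintptr_t)p < (uintptr_t)sbrk(0);
}

/* Address-range reading: p lies in [heap_start, current break).
 * heap_start is assumed to be the value of sbrk(0) saved at startup. */
static bool in_brk_range(const void *p, const void *heap_start)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)heap_start && a < (uintptr_t)sbrk(0);
}
```

the address 65536 from the mmap call above satisfies the first predicate but not the second, so only a range-based wording excludes it.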

>
> I will add what you are suggesting about the heap area.
>