Re: [RFC v2 4/4] vmalloc_exec: share a huge page with kernel text

From: Song Liu
Date: Wed Oct 12 2022 - 15:01:52 EST

> On Oct 12, 2022, at 11:38 AM, Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx> wrote:
>
> On Wed, 2022-10-12 at 05:37 +0000, Song Liu wrote:
>>> Then you have code that operates on module text like:
>>> if (is_vmalloc_or_module_addr(addr))
>>> pfn = vmalloc_to_pfn(addr);
>>>
>>> It looks like it would work (on x86 at least). Should it be
>>> expected
>>> to?
>>>
>>> Especially after this patch, where there is memory that isn't even
>>> tracked by the original vmap_area trees, it is pretty much a
>>> separate
>>> allocator. So I think it might be nice to spell out which other
>>> vmalloc
>>> APIs work with these new functions since they are named "vmalloc".
>>> Maybe just say none of them do.
>>
>> I guess it is fair to call this a separate allocator. Maybe
>> vmalloc_exec is not the right name? I do think this is the best
>> way to build an allocator with vmap tree logic.
>
> Yea, I don't know about the name. I think someone else suggested it
> specifically, right?

I think Luis suggested renaming module_alloc to vmalloc_exec. But I
guess we still need module_alloc for module data allocations.

>
> I had called mine perm_alloc() so it could also handle read-only and
> other permissions.

What other permissions do we use? We can probably duplicate
the free_text_area_* tree logic for other cases.


> If you keep vmalloc_exec() it needs some big
> comments about which APIs can work with it, and an audit of the
> existing code that works on module and JIT text.
>
>>
>>>
>>>
>>> Separate from that, I guess you are planning to make this limited
>>> to
>>> certain architectures? It might be better to put logic with
>>> assumptions
>>> about x86 boot time page table details inside arch/x86 somewhere.
>>
>> Yes, the architecture needs some text_poke mechanism to use this.
>
> It also depends on the space between _etext and the PMD aligned _etext
> to be present and not get used by anything else. For other
> architectures, there might be rodata there or other things.

Good point! We need to make sure this part is not used by other things.
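
To make the size of that region concrete, here is a rough user-space
sketch of the arithmetic. It assumes the x86_64 PMD size of 2 MiB; the
helper names (pmd_align_up, text_tail_gap) and the local PMD_* macros
are hypothetical stand-ins for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86_64 with 4 KiB base pages, so one PMD maps 2 MiB. */
#define PMD_SHIFT 21
#define PMD_SIZE  (1ULL << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))

/* Round addr up to the next PMD boundary (hypothetical helper). */
static uint64_t pmd_align_up(uint64_t addr)
{
	/* Guard against wrap-around near the top of the address space. */
	assert(addr <= UINT64_MAX - (PMD_SIZE - 1));
	return (addr + PMD_SIZE - 1) & PMD_MASK;
}

/*
 * Bytes between the end of kernel text and the end of the huge page
 * that contains it, i.e. the space that must stay unused by anything
 * else (hypothetical helper).
 */
static uint64_t text_tail_gap(uint64_t etext)
{
	return pmd_align_up(etext) - etext;
}
```

For a made-up _etext of 0xffffffff82345000, text_tail_gap() returns
0xbb000 (748 KiB). If _etext is already PMD aligned the gap is zero,
so there is nothing to share in that case.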

>
>> On BPF side, x86_64 calls this directly from arch code (jit engine),
>> so it is mostly covered. For modules, we need to handle this better.
>
> That old RFC has some ideas around this. I kind of like your
> incremental approach though. To me it seems to be moving in the right
> direction.

Thanks!
Song