Re: Prevent inconsistent CPU state after sequence of dlclose/dlopen
From: Florian Weimer
Date: Fri Jan 10 2025 - 12:46:29 EST
* Mathieu Desnoyers:
> On 2025-01-10 12:10, Florian Weimer wrote:
>> * Mathieu Desnoyers:
>>
>>> On 2025-01-10 11:54, Peter Zijlstra wrote:
>>>> On Fri, Jan 10, 2025 at 10:55:36AM -0500, Mathieu Desnoyers wrote:
>>>>> Hi,
>>>>>
>>>>> I was discussing with Mark Rutland recently, and he pointed out that a
>>>>> sequence of dlclose/dlopen mapping new code at the same addresses in
>>>>> multithreaded environments is an issue on ARM, and possibly on Intel/AMD
>>>>> with the newer TLB broadcast maintenance.
>>>> What is the exact race? Should not munmap() invalidate the TLBs before
>>>> it allows overlapping mmap() to complete?
>>>
>>> The race Mark mentioned (on ARM) is AFAIU the following scenario:
>>>
>>> CPU 0                      CPU 1
>>>
>>> - dlopen()
>>>   - mmap PROT_EXEC @addr
>>>                            - fetch insn @addr, CPU state expects unchanged insn.
>>>                            - execute unrelated code
>>> - dlclose(addr)
>>>   - munmap @addr
>>> - dlopen()
>>>   - mmap PROT_EXEC @addr
>>>                            - fetch new insn @addr. Incoherent CPU state.
>> Unmapping an object while code is executing in it is undefined.
>
> That's not the scenario though. In this scenario, CPU 1 executes
> _unrelated code_ while we unmap @addr.
Oh, so CPU 1 initially executes some code in the mapped object, returns
to some safe, persistent code (the “execute unrelated code” part), and
this code then synchronizes with the dlclose and the dlopen that execute
on CPU 0, obtains a pointer to some supposedly safely published function
in the newly mapped object, and calls it. And that fails because
previously cached information about the code at that address is stale?

Additional awkwardness may result if the initial execution is
speculative, and the code on CPU 1 only synchronizes with the dlopen,
and not with the preceding dlclose, because it does not know about it
at all?
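
To make sure we are talking about the same sequence, here is a minimal
sketch of it in C. Everything named here is an assumption for
illustration only: run_scenario(), cpu1(), load_fn() and the
generation/calls_done handshake are hypothetical, and libm.so.6 with its
cos symbol merely stands in for an arbitrary unloadable object. The
sketch only shows the ordering at the software level; it does not force
the second dlopen to reuse the same address and is not expected to
reproduce any hardware incoherence.

```c
#include <dlfcn.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define LIB "libm.so.6" /* assumption: stand-in for any unloadable object */
#define SYM "cos"       /* assumption: any double(double) symbol in LIB */

typedef double (*fn_t)(double);

static fn_t published_fn;     /* written before each generation bump */
static atomic_int generation; /* release/acquire publishes published_fn */
static atomic_int calls_done; /* tells "CPU 0" that "CPU 1" left the object */

static double fallback(double x) { return x; }

/* "CPU 1": persistent code that calls into each published mapping once. */
static void *cpu1(void *arg)
{
    (void)arg;
    for (int round = 1; round <= 2; round++) {
        /* Synchronize with the dlopen that published this generation. */
        while (atomic_load_explicit(&generation, memory_order_acquire) < round)
            sched_yield();
        (void)published_fn(0.0); /* fetch insns in the mapped object */
        /* Back in persistent code; no frame of the object is live. */
        atomic_fetch_add_explicit(&calls_done, 1, memory_order_release);
    }
    return NULL;
}

/* dlopen LIB and look up SYM; returns NULL (and no handle) on failure. */
static fn_t load_fn(void **handle)
{
    *handle = dlopen(LIB, RTLD_NOW | RTLD_LOCAL);
    if (*handle == NULL)
        return NULL;
    fn_t fn = (fn_t)dlsym(*handle, SYM);
    if (fn == NULL) {
        dlclose(*handle);
        *handle = NULL;
    }
    return fn;
}

/* "CPU 0": dlopen, publish, dlclose, dlopen again, publish again. */
int run_scenario(void)
{
    void *handle;
    pthread_t thread;

    atomic_store(&generation, 0);
    atomic_store(&calls_done, 0);

    published_fn = load_fn(&handle);
    if (published_fn == NULL)
        return -1;
    if (pthread_create(&thread, NULL, cpu1, NULL) != 0) {
        dlclose(handle);
        return -1;
    }
    atomic_store_explicit(&generation, 1, memory_order_release);

    /* Wait until CPU 1 has returned from the first mapping. */
    while (atomic_load_explicit(&calls_done, memory_order_acquire) < 1)
        sched_yield();

    dlclose(handle);            /* munmap, if this was the last reference */
    fn_t fn = load_fn(&handle); /* new mapping, possibly at the same address */
    int ok = (fn != NULL);
    published_fn = ok ? fn : fallback; /* never leave CPU 1 waiting */
    atomic_store_explicit(&generation, 2, memory_order_release);

    pthread_join(thread, NULL);
    if (handle != NULL)
        dlclose(handle);
    return (ok && atomic_load(&calls_done) == 2) ? 0 : -1;
}
```

At the language level, both calls on CPU 1 are ordered after the
corresponding dlopen by the acquire load, and the dlclose is ordered
after the first call by calls_done. The question is whether that
ordering is enough for the instruction fetch on CPU 1 once the address
is reused.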
Thanks,
Florian