Re: Prevent inconsistent CPU state after sequence of dlclose/dlopen

From: Mathieu Desnoyers
Date: Wed Jan 15 2025 - 15:17:33 EST


On 2025-01-10 11:47, Adhemerval Zanella Netto wrote:


On 10/01/25 12:55, Mathieu Desnoyers wrote:
Hi,

I was discussing this with Mark Rutland recently, and he pointed out that a
sequence of dlclose/dlopen mapping new code at the same addresses in
multithreaded environments is an issue on ARM, and possibly on Intel/AMD
with the newer TLB broadcast maintenance.

I maintain the membarrier(2) system call, which provides a
MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE command for this
purpose. It's been there since Linux 4.16. It can be configured
out (CONFIG_MEMBARRIER=n), but it's enabled by default.

Calling this after dlclose() in glibc would prevent this issue.
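
A minimal userspace sketch of that idea, under stated assumptions: the
helper names sync_core_register() and dlclose_sync_core() are hypothetical
and are not glibc interfaces, and the error handling is illustrative only.
The process has to register with
MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE before the expedited
command can be used; without registration the call fails with EPERM.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <linux/membarrier.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc does not export a membarrier() wrapper, so go through syscall(2). */
static int membarrier(int cmd, unsigned int flags, int cpu_id)
{
    return (int)syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/*
 * Hypothetical helper: call once at startup.
 * Returns 0 on success, -1 if the kernel lacks the sync-core command
 * (for example CONFIG_MEMBARRIER=n or a kernel older than 4.16).
 */
static int sync_core_register(void)
{
    int mask = membarrier(MEMBARRIER_CMD_QUERY, 0, 0);

    if (mask < 0 || !(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE))
        return -1;
    return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE,
                      0, 0);
}

/*
 * Hypothetical helper: unload a library, then ask the kernel to issue a
 * core-serializing operation on every thread of this process, so that no
 * CPU keeps executing stale instructions if the address range is reused.
 * Returns the result of dlclose().
 */
static int dlclose_sync_core(void *handle)
{
    int ret = dlclose(handle);

    if (membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0, 0) != 0)
        perror("membarrier");
    return ret;
}
```

An application would call sync_core_register() once early on, then use
dlclose_sync_core() wherever it currently calls dlclose(). Doing this
inside glibc instead would cover all callers without source changes.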

Is it handled in some other way, or should we open a bugzilla
entry to track this?

Yes please. It would be helpful if you could add some information on
which hardware and kernel versions are affected by this issue.

Done:

https://sourceware.org/bugzilla/show_bug.cgi?id=32563


Also, could you add some detail on the issue and on why the kernel itself
does not or cannot guarantee memory consistency after the mmap call?

I've added a comment detailing this.


Is it because this would be extra, non-required overhead on
mmap that userland should handle?

Yes, overhead is the culprit there, although it could be manageable
if we limit these extra sync-core operations to specific
sequences of mmap/munmap/mprotect with the PROT_EXEC flag.

I've documented a possible approach in the bugzilla entry.

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com