Re: [GIT PULL] locking/urgent for v6.17-rc1
From: Sean Christopherson
Date: Thu Aug 21 2025 - 15:46:03 EST

On Thu, Aug 21, 2025, Linus Torvalds wrote:
> On Sat, 9 Aug 2025 at 14:02, Borislav Petkov <bp@xxxxxxxxx> wrote:
> >
> > please pull a locking/urgent fix for v6.17-rc1.
>
> Ok, so this clearly wasn't a fix.
>
> > Thomas Gleixner (1):
> > futex: Move futex cleanup to __mmdrop()
>
> So this causes problems, because __mmdrop is not done in thread
> context, and the kvfree() calls then cause issues:
>
> https://lore.kernel.org/all/20250821102721.6deae493@xxxxxxxxxx/
> https://lore.kernel.org/all/20250818131902.5039-1-hdanton@xxxxxxxx/
>
> Hillf Danton sent out a patch, but honestly, that patch looks like pure
> bandaid, and will make the exit path horribly much slower by moving
> things into workqueues. It might not be visible in profiles exactly
> *because* it's then hidden in workqueues, but it's not great.
>
> I think it's a mistake to allow vmalloc'ing those hashes in the first
> place, and I suggest the local hash be size-limited to the point where
> it's just a kmalloc() and thus works in all contexts.
>
> Or maybe the mistake was the mm-private hashing in the first place.
> Maybe that hash shouldn't be allocated at mm_alloc() ->
> futex_mm_init() at all. Only initialized by the futex code when
> needed, and then dropped in exit_mmap().
>
> So the problems seem deeper than just "free'd in the wrong context".

Piggybacking on the attention futex private hashing is getting: the new
fanciness is causing crashes in my testing. The crashes are 100% reproducible,
but my reproducer is simply running a variety of tests in parallel, i.e. it
isn't very debug-friendly, and the code itself is black magic to me, so all
I've done is bisect.

I reported the issue on the original thread, but haven't seen any follow-up:

https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@xxxxxxxxxx