Re: [GIT PULL] kmalloc_obj treewide refactor for v7.0-rc1
From: Linus Torvalds
Date: Sat Feb 21 2026 - 17:33:46 EST
On Sat, 21 Feb 2026 at 12:16, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I'm not sure this is a fatal issue, but it makes me *very* nervous.
This is definitely a fatal issue for this patch series.
I'm looking at the code generation for this thing, and this is what
register_netdevice_notifier() compiles into for me:
    register_netdevice_notifier:        # @register_netdevice_notifier
        callq   __fentry__
        pushq   %rbx
        movq    %rdi, %rbx
        movq    $pernet_ops_rwsem, %rdi
        callq   down_write
        callq   rtnl_lock
        movq    $netdev_chain, %rdi
        movq    %rbx, %rsi
        callq   raw_notifier_chain_register
        movl    %eax, %ebx
        callq   rtnl_unlock
        movq    $pernet_ops_rwsem, %rdi
        callq   up_write
        movl    %ebx, %eax
        popq    %rbx
        retq
that looks nice and simple, but it's *way* too simple. The source code has this
        /* Close race with setup_net() and cleanup_net() */
        down_write(&pernet_ops_rwsem);
        /* When RTNL is removed, we need protection for netdev_chain. */
        rtnl_lock();
        err = raw_notifier_chain_register(&netdev_chain, nb);
        if (err)
                goto unlock;
        if (dev_boot_phase)
                goto unlock;
        for_each_net(net) {
                __rtnl_net_lock(net);
                ...
and it appears that clang has decided that "dev_boot_phase" is always
1, so it just took that "goto unlock" path unconditionally, and
basically removed most of that function in the process.
And the *reason* it decided that seems to be that we have
    static int dev_boot_phase = 1;
elsewhere in the same file, and it is only cleared in net_dev_init().
And in net_dev_init(), clang has decided that this code:
        if (register_pernet_subsys(&netdev_net_ops))
                goto out;
        /*
         *      Initialise the packet receive queues.
         */
        flush_backlogs_fallback = flush_backlogs_alloc();
        if (!flush_backlogs_fallback)
                goto out;
should result in this:
        movq    $netdev_net_ops, %rdi
        callq   register_pernet_subsys
        testl   %eax, %eax
        jne     .LBB277_6
    # %bb.5:
        #APP
    .Ltmp1557:
        ud2
        .section        __bug_table,"aw",@progbits
i.e. it has decided that the flush_backlogs_alloc() call should result
in an unconditional WARN_ON(). The string associated with that
WARN_ON() is
    overflows_flex_counter_type(typeof(struct flush_backlogs), w, __count)
which I guess is not surprising: that's exactly what that
    kmalloc_flex(struct flush_backlogs, w, nr_cpu_ids, GFP_KERNEL);
change would do.
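To make the kind of check named in that string concrete, here is a minimal user-space sketch of a "does this count fit in the counter field" test. It is only an illustration: struct demo_flex and count_overflows_member() are hypothetical names, the helper takes the counter member directly (the kernel macro is passed the flexible array member instead), and the actual definition of overflows_flex_counter_type() is not shown in this mail. It relies on the GNU __typeof__ extension.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical struct: 'nr' is the element counter for the flexible array 'w'. */
struct demo_flex {
	unsigned char nr;
	int w[];
};

/*
 * Hypothetical helper, NOT the kernel's overflows_flex_counter_type():
 * evaluates to true when 'count' changes value after being converted to
 * the type of the counter member, i.e. when it does not fit.
 */
#define count_overflows_member(type, member, count) \
	((unsigned long long)(__typeof__(((type *)0)->member))(count) != \
	 (unsigned long long)(count))
```

With an unsigned char counter, a count of 255 fits, while 300 or -1 does not. A check like this is only useful if it is false for every count that can actually occur; if it trips for a normal allocation, the allocation fails every time.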
So in clang's world, the code that follows - that sets
'dev_boot_phase' to zero among other things - never happens, because
that kmalloc_flex() unconditionally returns NULL after the warning.
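The chain of reasoning can be reproduced in a small user-space sketch. All names here (boot_phase, alloc_always_fails(), init_demo(), register_demo()) are hypothetical stand-ins, not the kernel code. The point is only that when the sole store to a static flag sits behind an allocation that always fails, the flag keeps its initial value, and a compiler that can see the whole file is entitled to treat reads of it as constant. Whether a particular compiler actually folds it depends on the compiler and optimization level.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
static int boot_phase = 1;	/* plays the role of dev_boot_phase */
static int walked_nets;		/* counts how often the "real work" ran */

/* Models an allocator whose size check always trips, so it always fails. */
static void *alloc_always_fails(size_t n)
{
	(void)n;
	return NULL;
}

/* Models net_dev_init(): the only store to boot_phase follows the alloc. */
static int init_demo(void)
{
	void *p = alloc_always_fails(64);

	if (!p)
		return -1;	/* always taken, so the store below is dead */
	boot_phase = 0;
	free(p);
	return 0;
}

/* Models register_netdevice_notifier(): the loop is skipped while booting. */
static int register_demo(void)
{
	if (boot_phase)
		return 0;	/* the "goto unlock" path */
	walked_nets++;		/* stands in for the for_each_net() loop */
	return 0;
}
```

Calling init_demo() and then register_demo() leaves boot_phase at 1 and walked_nets at 0, which is the same behavior the generated code above shows for the real function.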
Which in turn is why objtool then complains about that
    "register_netdevice() missing __noreturn"
because clang sees that
    BUG_ON(dev_boot_phase);
and for all the same reasons thinks that that is always true and the
code always BUG()s out.
I do not think this is fixable. I complained about the overly
complicated macros earlier, and suggested you only do the minimal and
obvious kmalloc_obj() without any of the flex crap.
You decided that you need to do the complicated case too, and now that
case is too complicated for the compiler.
So I suspect the only option is to revert this all.
I'll look at this some more to see if there's something particular
about that *one* allocation, but I suspect the whole kmalloc_flex()
just has to die.
And next time I ask you to do only the simple thing, JUST DO IT.
Because right now your "hardening" is plain buggy garbage.
It's "hardening" only in the strict sense that a completely
non-bootable machine sure as hell is secure.
Linus