Re: [GIT PULL] kmalloc_obj treewide refactor for v7.0-rc1

From: Linus Torvalds

Date: Sat Feb 21 2026 - 17:33:46 EST


On Sat, 21 Feb 2026 at 12:16, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I'm not sure this is a fatal issue, but it makes me *very* nervous.

This is definitely a fatal issue for this patch series.

I'm looking at the code generation for this thing, and this is what
register_netdevice_notifier() compiles into for me:

    register_netdevice_notifier:            # @register_netdevice_notifier
            callq   __fentry__
            pushq   %rbx
            movq    %rdi, %rbx
            movq    $pernet_ops_rwsem, %rdi
            callq   down_write
            callq   rtnl_lock
            movq    $netdev_chain, %rdi
            movq    %rbx, %rsi
            callq   raw_notifier_chain_register
            movl    %eax, %ebx
            callq   rtnl_unlock
            movq    $pernet_ops_rwsem, %rdi
            callq   up_write
            movl    %ebx, %eax
            popq    %rbx
            retq

That looks nice and simple, but it's *way* too simple. The source code has this:

        /* Close race with setup_net() and cleanup_net() */
        down_write(&pernet_ops_rwsem);

        /* When RTNL is removed, we need protection for netdev_chain. */
        rtnl_lock();

        err = raw_notifier_chain_register(&netdev_chain, nb);
        if (err)
                goto unlock;
        if (dev_boot_phase)
                goto unlock;
        for_each_net(net) {
                __rtnl_net_lock(net);
                ...

and it appears that clang has decided that "dev_boot_phase" is always
1, so it just took that "goto unlock" path unconditionally, and
basically removed most of that function in the process.

And the *reason* it decided that seems to be that we have

static int dev_boot_phase = 1;

elsewhere in the same file, and it is only cleared in net_dev_init().

And in net_dev_init(), clang has decided that this code:

        if (register_pernet_subsys(&netdev_net_ops))
                goto out;

        /*
         *      Initialise the packet receive queues.
         */

        flush_backlogs_fallback = flush_backlogs_alloc();
        if (!flush_backlogs_fallback)
                goto out;

should result in this:

            movq    $netdev_net_ops, %rdi
            callq   register_pernet_subsys
            testl   %eax, %eax
            jne     .LBB277_6
    # %bb.5:
            #APP
    .Ltmp1557:
            ud2
            .section        __bug_table,"aw",@progbits

i.e. it has decided that the flush_backlogs_alloc() call should result
in an unconditional WARN_ON(). The string associated with that
WARN_ON() is

overflows_flex_counter_type(typeof(struct flush_backlogs), w, __count)

which I guess is not surprising: that's exactly what that

kmalloc_flex(struct flush_backlogs, w, nr_cpu_ids, GFP_KERNEL);

change would do.

So in clang's world, the code that follows - that sets
'dev_boot_phase' to zero among other things - never happens, because
that kmalloc_flex() unconditionally returns NULL after the warning.

Which in turn is why objtool then complains about that

"register_netdevice() missing __noreturn"

because clang sees that

BUG_ON(dev_boot_phase);

and for all the same reasons thinks that that is always true and the
code always BUG()s out.
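
The chain of reasoning above can be reduced to a small sketch. This is not
the kernel code: boot_phase, alloc_always_fails(), init_sketch() and
register_sketch() are hypothetical stand-ins for dev_boot_phase, the
kmalloc_flex() call, net_dev_init() and register_netdevice_notifier(). It
only assumes that the compiler can prove the allocation always returns NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for dev_boot_phase: file-local, starts out as 1. */
static int boot_phase = 1;

/*
 * Hypothetical allocator that the compiler can prove always returns
 * NULL, standing in for the kmalloc_flex() call described above.
 */
static void *alloc_always_fails(void)
{
	return NULL;
}

/* Stand-in for net_dev_init(): the only place boot_phase is cleared. */
static int init_sketch(void)
{
	void *p = alloc_always_fails();

	if (!p)
		return -1;	/* always taken */

	boot_phase = 0;		/* unreachable: the only store of 0 */
	return 0;
}

/* Stand-in for register_netdevice_notifier(). */
static int register_sketch(void)
{
	if (boot_phase)
		return 0;	/* early exit, like the "goto unlock" path */
	return 1;		/* the real work would go here */
}
```

Because the only store that clears boot_phase is unreachable, an optimizing
compiler may treat the static as the constant 1 and reduce register_sketch()
to just the early exit. The same deduction would make a BUG_ON() on that
variable look unconditional.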

I do not think this is fixable. I complained about the overly
complicated macros earlier, and suggested you only do the minimal and
obvious kmalloc_obj() without any of the flex crap.

You decided that you need to do the complicated case too, and now that
case is too complicated for the compiler.

So I suspect the only option is to revert this all.

I'll look at this some more to see if there's something particular
about that *one* allocation, but I suspect the whole kmalloc_flex()
just has to die.

And next time I ask you to do only the simple thing, JUST DO IT.

Because right now your "hardening" is plain buggy garbage.

It's "hardening" only in the strict sense that a completely
non-bootable machine sure as hell is secure.

Linus