Re: BUG: unable to handle kernel NULL pointer dereference in fdb_find_rcu

From: Nikolay Aleksandrov
Date: Sat Dec 16 2017 - 05:41:33 EST


On 16/12/17 11:29, Nikolay Aleksandrov wrote:
> On 16/12/17 11:17, Nikolay Aleksandrov wrote:
>> On 16/12/17 02:37, Andrei Vagin wrote:
>>> Hi,
>>>
>>> We run criu tests for linux-next and today we get this bug:
>>>
>>> The kernel version is 4.15.0-rc3-next-20171215
>>>
>>> [ 235.397328] BUG: unable to handle kernel NULL pointer dereference
>>> at 000000000000000c
>>> [ 235.398624] IP: fdb_find_rcu+0x3c/0x130
>> [snip]
>>
>> Hi,
>> Thanks for the report, I've missed the changelink before dev creation case when I did
>
> err, s/changelink/br_stp_change_bridge_id/
> the other options are set after register_netdevice, this is the only one changed before
>
>> the rhashtable conversion, some of the options do fdb lookups as part of their routine
>> but we don't have the table initialized yet at that point.
>> I'll send a fix after some testing.
>>
>> Thanks,
>> Nik
>>
>>
>

This needs to be fixed in -net. There is a memory leak that has existed since
br_stp_change_bridge_id() started being called before register_netdevice(): the
call adds an fdb entry which never gets deleted if an error happens afterwards.
Also, the notifications for that fdb entry come with ifindex = 0 because the
bridge netdev doesn't exist yet. All of that looks wrong, so I'll send a fix for
-net that moves the bridge id change after the netdev register and cleans up
any bridge fdbs on error.

The commit with that change is:
30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
Before it became possible to do a changelink as part of the bridge newlink, this
would happen only if the netdev register failed, but now it is much easier to
trigger (as below) since changelink can fail if called with wrong arguments.
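
To make the ordering problem concrete, here is a rough user-space model of it.
This is not kernel code and not the actual fix: every name in it (model_bridge,
model_change_bridge_id, model_changelink, live_fdb_entries and so on) is made up
for illustration, and the model assumes the only things that matter are when the
fdb entry gets added and whether the error path flushes it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

/* Hypothetical user-space stand-ins; none of these are kernel structures. */
struct model_fdb_entry {
	unsigned char addr[MODEL_ETH_ALEN];
	struct model_fdb_entry *next;
};

struct model_bridge {
	bool registered;
	struct model_fdb_entry *fdb;
};

/* Plays the role of the object count in bridge_fdb_cache. */
static int live_fdb_entries;

/* Models the local fdb entry that a bridge id change adds. */
static int model_change_bridge_id(struct model_bridge *br,
				  const unsigned char *addr)
{
	struct model_fdb_entry *e = malloc(sizeof(*e));

	if (!e)
		return -ENOMEM;
	memcpy(e->addr, addr, MODEL_ETH_ALEN);
	e->next = br->fdb;
	br->fdb = e;
	live_fdb_entries++;
	return 0;
}

/* Frees every fdb entry of the model bridge. */
static void model_fdb_flush(struct model_bridge *br)
{
	while (br->fdb) {
		struct model_fdb_entry *e = br->fdb;

		br->fdb = e->next;
		free(e);
		assert(live_fdb_entries > 0);
		live_fdb_entries--;
	}
}

/* Models option handling that rejects wrong arguments. */
static int model_changelink(bool bad_options)
{
	return bad_options ? -EINVAL : 0;
}

/* Old order: the id change runs first and no error path undoes it. */
static int model_newlink_old(struct model_bridge *br,
			     const unsigned char *addr, bool bad_options)
{
	int err;

	if (addr) {
		err = model_change_bridge_id(br, addr);
		if (err)
			return err;
	}
	br->registered = true;
	err = model_changelink(bad_options);
	if (err)
		br->registered = false;	/* device goes away, fdb entry stays */
	return err;
}

/* Proposed order: register first, then flush the fdb on any later error. */
static int model_newlink_fixed(struct model_bridge *br,
			       const unsigned char *addr, bool bad_options)
{
	int err;

	br->registered = true;
	if (addr) {
		err = model_change_bridge_id(br, addr);
		if (err)
			goto err_cleanup;
	}
	err = model_changelink(bad_options);
	if (err)
		goto err_cleanup;
	return 0;

err_cleanup:
	model_fdb_flush(br);
	br->registered = false;
	return err;
}
```

Calling model_newlink_old() with an address and bad_options set returns -EINVAL
and leaves live_fdb_entries at 1, which is the model's version of the leftover
object in the trace below. model_newlink_fixed() with the same arguments returns
-EINVAL and leaves it at 0.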

Here's the trace of rmmod bridge after a failed bridge newlink with mac address set
(this kernel is before my rhashtable change):
$ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
RTNETLINK answers: Invalid argument
$ rmmod bridge
[ 1822.142525] =============================================================================
[ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
[ 1822.144821] -----------------------------------------------------------------------------

[ 1822.145990] Disabling lock debugging due to kernel taint
[ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
[ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87
[ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 1822.150008] Call Trace:
[ 1822.150510] dump_stack+0x78/0xa9
[ 1822.151156] slab_err+0xb1/0xd3
[ 1822.151834] ? __kmalloc+0x1bb/0x1ce
[ 1822.152546] __kmem_cache_shutdown+0x151/0x28b
[ 1822.153395] shutdown_cache+0x13/0x144
[ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb
[ 1822.154669] SyS_delete_module+0x194/0x244
[ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a
[ 1822.156343] RIP: 0033:0x7f929bd38b17
[ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
[ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17
[ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0
[ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11
[ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80
[ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090
[ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
[ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128