Re: [PATCH net v2] net: dsa: Check existence of .port_mdb_add callback before calling it

From: Vivien Didelot
Date: Tue Aug 06 2019 - 13:15:18 EST


Hi Chen-Yu,

On Tue, 6 Aug 2019 15:53:25 +0800, Chen-Yu Tsai <wens@xxxxxxxxxx> wrote:
> From: Chen-Yu Tsai <wens@xxxxxxxx>
>
> With the recent addition of commit 75dad2520fc3 ("net: dsa: b53: Disable
> all ports on setup"), users of b53 (BCM53125 on Lamobo R1 in my case)
> are forced to use the dsa subsystem to enable the switch, instead of
> having it in the default transparent "forward-to-all" mode.
>
> The b53 driver does not support mdb bitmap functions. However the dsa
> layer does not check for the existence of the .port_mdb_add callback
> before actually using it. This results in a NULL pointer dereference,
> as shown in the kernel oops below.
>
> The other functions seem to be properly guarded. Do the same for
> .port_mdb_add in dsa_switch_mdb_add_bitmap() as well.
>
> b53 is not the only driver that doesn't support mdb bitmap functions.
> Others include bcm_sf2, dsa_loop, lantiq_gswip, mt7530, mv88e6060,
> qca8k, realtek-smi, and vitesse-vsc73xx.

I don't know what you mean by that, there's no "mdb bitmap function"
support for drivers, only the port_mdb_{prepare,add,del} callbacks...
>
> 8<--- cut here ---
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> pgd = (ptrval)
> [00000000] *pgd=00000000
> Internal error: Oops: 80000005 [#1] SMP ARM
> Modules linked in: rtl8xxxu rtl8192cu rtl_usb rtl8192c_common rtlwifi mac80211 cfg80211
> CPU: 1 PID: 134 Comm: kworker/1:2 Not tainted 5.3.0-rc1-00247-gd3519030752a #1
> Hardware name: Allwinner sun7i (A20) Family
> Workqueue: events switchdev_deferred_process_work
> PC is at 0x0
> LR is at dsa_switch_event+0x570/0x620
> pc : [<00000000>] lr : [<c08533ec>] psr: 80070013
> sp : ee871db8 ip : 00000000 fp : ee98d0a4
> r10: 0000000c r9 : 00000008 r8 : ee89f710
> r7 : ee98d040 r6 : ee98d088 r5 : c0f04c48 r4 : ee98d04c
> r3 : 00000000 r2 : ee89f710 r1 : 00000008 r0 : ee98d040
> Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> Control: 10c5387d Table: 6deb406a DAC: 00000051
> Process kworker/1:2 (pid: 134, stack limit = 0x(ptrval))
> Stack: (0xee871db8 to 0xee872000)
> 1da0: ee871e14 103ace2d
> 1dc0: 00000000 ffffffff 00000000 ee871e14 00000005 00000000 c08524a0 00000000
> 1de0: ffffe000 c014bdfc c0f04c48 ee871e98 c0f04c48 ee9e5000 c0851120 c014bef0
> 1e00: 00000000 b643aea2 ee9b4068 c08509a8 ee2bf940 ee89f710 ee871ecb 00000000
> 1e20: 00000008 103ace2d 00000000 c087e248 ee29c868 103ace2d 00000001 ffffffff
> 1e40: 00000000 ee871e98 00000006 00000000 c0fb2a50 c087e2d0 ffffffff c08523c4
> 1e60: ffffffff c014bdfc 00000006 c0fad2d0 ee871e98 ee89f710 00000000 c014c500
> 1e80: 00000000 ee89f3c0 c0f04c48 00000000 ee9e5000 c087dfb4 ee9e5000 00000000
> 1ea0: ee89f710 ee871ecb 00000001 103ace2d 00000000 c0f04c48 00000000 c087e0a8
> 1ec0: 00000000 efd9a3e0 0089f3c0 103ace2d ee89f700 ee89f710 ee9e5000 00000122
> 1ee0: 00000100 c087e130 ee89f700 c0fad2c8 c1003ef0 c087de4c 2e928000 c0fad2ec
> 1f00: c0fad2ec ee839580 ef7a62c0 ef7a9400 00000000 c087def8 c0fad2ec c01447dc
> 1f20: ef315640 ef7a62c0 00000008 ee839580 ee839594 ef7a62c0 00000008 c0f03d00
> 1f40: ef7a62d8 ef7a62c0 ffffe000 c0145b84 ffffe000 c0fb2420 c0bfaa8c 00000000
> 1f60: ffffe000 ee84b600 ee84b5c0 00000000 ee870000 ee839580 c0145b40 ef0e5ea4
> 1f80: ee84b61c c014a6f8 00000001 ee84b5c0 c014a5b0 00000000 00000000 00000000
> 1fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
> 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
> [<c014bdfc>] (notifier_call_chain) from [<c014bef0>] (raw_notifier_call_chain+0x18/0x20)
> [<c014bef0>] (raw_notifier_call_chain) from [<c08509a8>] (dsa_port_mdb_add+0x48/0x74)
> [<c08509a8>] (dsa_port_mdb_add) from [<c087e248>] (__switchdev_handle_port_obj_add+0x54/0xd4)
> [<c087e248>] (__switchdev_handle_port_obj_add) from [<c087e2d0>] (switchdev_handle_port_obj_add+0x8/0x14)
> [<c087e2d0>] (switchdev_handle_port_obj_add) from [<c08523c4>] (dsa_slave_switchdev_blocking_event+0x94/0xa4)
> [<c08523c4>] (dsa_slave_switchdev_blocking_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
> [<c014bdfc>] (notifier_call_chain) from [<c014c500>] (blocking_notifier_call_chain+0x50/0x68)
> [<c014c500>] (blocking_notifier_call_chain) from [<c087dfb4>] (switchdev_port_obj_notify+0x44/0xa8)
> [<c087dfb4>] (switchdev_port_obj_notify) from [<c087e0a8>] (switchdev_port_obj_add_now+0x90/0x104)
> [<c087e0a8>] (switchdev_port_obj_add_now) from [<c087e130>] (switchdev_port_obj_add_deferred+0x14/0x5c)
> [<c087e130>] (switchdev_port_obj_add_deferred) from [<c087de4c>] (switchdev_deferred_process+0x64/0x104)
> [<c087de4c>] (switchdev_deferred_process) from [<c087def8>] (switchdev_deferred_process_work+0xc/0x14)
> [<c087def8>] (switchdev_deferred_process_work) from [<c01447dc>] (process_one_work+0x218/0x50c)
> [<c01447dc>] (process_one_work) from [<c0145b84>] (worker_thread+0x44/0x5bc)
> [<c0145b84>] (worker_thread) from [<c014a6f8>] (kthread+0x148/0x150)
> [<c014a6f8>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> Exception stack(0xee871fb0 to 0xee871ff8)
> 1fa0: 00000000 00000000 00000000 00000000
> 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> Code: bad PC value
> ---[ end trace 1292c61abd17b130 ]---
>
> [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
> corresponds to
>
> $ arm-linux-gnueabihf-addr2line -C -i -e vmlinux c08533ec
>
> linux/net/dsa/switch.c:156
> linux/net/dsa/switch.c:178
> linux/net/dsa/switch.c:328
>
> Fixes: e6db98db8a95 ("net: dsa: add switch mdb bitmap functions")
> Signed-off-by: Chen-Yu Tsai <wens@xxxxxxxx>
> ---
> Changes since v1:
>
> - Moved the check to the beginning of dsa_switch_mdb_add()
>
> Looks like we could also move the ops check out of
> dsa_switch_mdb_prepare_bitmap(), though I suppose keeping the code the
> way it is now is clearer.
>
> ---
> net/dsa/switch.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index 4ec5b7f85d51..231af5268656 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -164,6 +164,9 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
> struct switchdev_trans *trans = info->trans;
> int port;
>
> + if (!ds->ops->port_mdb_add)
> + return -EOPNOTSUPP;
> +
> /* Build a mask of Multicast group members */
> bitmap_zero(ds->bitmap, ds->num_ports);
> if (ds->index == info->sw_index)
> --
> 2.20.1
>

I don't understand the crash here, nor the fix. dsa_switch_mdb_add()
is supposed to be called through switchdev with a prepare phase,
which checks for ds->ops->port_mdb_add. Do you mean that a switchdev
MDB object is added somewhere without a prepare phase? If that's
the case, this is what the commit message must say. Then the
ds->ops->port_mdb_add check must go where it is used, that is to say
at the beginning of dsa_switch_mdb_add_bitmap() (similarly to what
dsa_switch_mdb_prepare_bitmap() does), not in dsa_switch_mdb_add.


Thanks,

Vivien