Re: [PATCH v2 net 6/6] net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting

From: Peilin Ye
Date: Tue May 23 2023 - 16:06:35 EST


On Tue, May 23, 2023 at 08:36:35AM -0300, Pedro Tammela wrote:
> > Thanks for testing this, but the syzbot reproducer creates ingress Qdiscs
> > under TC_H_ROOT, which isn't covered by [6/6] i.e. it exercises the
> > "!ingress" path in qdisc_graft(). I think that's why you are still seeing
> > the oops. Adding sch_{ingress,clsact} to TC_H_ROOT is no longer possible
> > after [1,2/6], and I think we'll need a different reproducer for [5,6/6].
>
> I was still able to trigger an oops with the full patchset:
>
> [ 104.944353][ T6588] ------------[ cut here ]------------
> [ 104.944896][ T6588] jump label: negative count!
> [ 104.945780][ T6588] WARNING: CPU: 0 PID: 6588 at kernel/jump_label.c:263
> static_key_slow_try_dec+0xf2/0x110
> [ 104.946795][ T6588] Modules linked in:
> [ 104.947111][ T6588] CPU: 0 PID: 6588 Comm: repro Not tainted
> 6.4.0-rc2-00191-g4a3f9100193d #3
> [ 104.947765][ T6588] Hardware name: QEMU Standard PC (i440FX + PIIX,
> 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
> [ 104.948557][ T6588] RIP: 0010:static_key_slow_try_dec+0xf2/0x110
> [ 104.949064][ T6588] Code: d5 ff e8 c1 33 d5 ff 44 89 e8 5b 5d 41 5c 41 5d
> c3 44 89 e5 e9 66 ff ff ff e8 aa 33 d5 ff 48 c7 c7 00 9c 56 8a e8 4e ce 9c
> ff <0f> 0b eb ae 48 89 df e8 02 4b 28 00 e9 42 ff ff ff 66 66 2e 0f 1f
> [ 104.951134][ T6588] RSP: 0018:ffffc900079cf2c0 EFLAGS: 00010286
> [ 104.951646][ T6588] RAX: 0000000000000000 RBX: ffffffff9213a160 RCX:
> 0000000000000000
> [ 104.952269][ T6588] RDX: ffff888112f83b80 RSI: ffffffff814c7747 RDI:
> 0000000000000001
> [ 104.952901][ T6588] RBP: 00000000ffffffff R08: 0000000000000001 R09:
> 0000000000000000
> [ 104.953523][ T6588] R10: 0000000000000001 R11: 0000000000000001 R12:
> 00000000ffffffff
> [ 104.954133][ T6588] R13: ffff88816a514001 R14: 0000000000000001 R15:
> ffffffff8e7b0680
> [ 104.954746][ T6588] FS: 00007f76c65d56c0(0000) GS:ffff8881f5a00000(0000)
> knlGS:0000000000000000
> [ 104.955430][ T6588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 104.955941][ T6588] CR2: 00007f9a40357a50 CR3: 000000011461e000 CR4:
> 0000000000350ef0
> [ 104.956559][ T6588] Call Trace:
> [ 104.956829][ T6588] <TASK>
> [ 104.957062][ T6588] ? clsact_egress_block_get+0x40/0x40
> [ 104.957507][ T6588] static_key_slow_dec+0x60/0xc0
> [ 104.957906][ T6588] qdisc_create+0xa45/0x1090
> [ 104.958274][ T6588] ? tc_get_qdisc+0xb70/0xb70
> [ 104.958646][ T6588] tc_modify_qdisc+0x491/0x1b70
> [ 104.959031][ T6588] ? qdisc_create+0x1090/0x1090
> [ 104.959420][ T6588] ? bpf_lsm_capable+0x9/0x10
> [ 104.959797][ T6588] ? qdisc_create+0x1090/0x1090

Ah, qdisc_create() calls ->destroy() even "if ops->init() failed". We
should check sch->parent in {ingress,clsact}_destroy() too. I'll update
[1,2/6] in v5.
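
To illustrate the failure mode, here is a minimal user-space sketch. It is a
model, not kernel code: struct model_qdisc, model_needed and every model_*()
function are hypothetical names made up for this example. A plain int stands
in for the static key, and the parent check is assumed to mirror the one that
[1,2/6] adds to ->init().

```c
#include <assert.h>

/* Handle values as defined in include/uapi/linux/pkt_sched.h. */
#define TC_H_ROOT    0xFFFFFFFFU
#define TC_H_INGRESS 0xFFFFFFF1U

/* Hypothetical stand-in for struct Qdisc; only the field discussed here. */
struct model_qdisc {
	unsigned int parent;
};

/* Plain counter standing in for the static key (assumption of this model). */
int model_needed;

/* Models ->init(): refuse any parent other than TC_H_INGRESS. */
int model_init(struct model_qdisc *sch)
{
	if (sch->parent != TC_H_INGRESS)
		return -1;	/* the real code would return an errno */
	model_needed++;
	return 0;
}

/* Models ->destroy() without a parent check: always decrements. */
void model_destroy_unchecked(struct model_qdisc *sch)
{
	(void)sch;
	model_needed--;
}

/* Models ->destroy() with the same parent check as ->init(). */
void model_destroy_checked(struct model_qdisc *sch)
{
	if (sch->parent != TC_H_INGRESS)
		return;
	model_needed--;
}

/* Models the qdisc_create() error path: ->destroy() runs even if ->init() failed. */
int model_create(struct model_qdisc *sch, void (*destroy)(struct model_qdisc *))
{
	int err = model_init(sch);

	if (err)
		destroy(sch);
	return err;
}

/* Attempts a create under TC_H_ROOT and returns the resulting counter. */
int model_demo(void (*destroy)(struct model_qdisc *))
{
	struct model_qdisc sch = { .parent = TC_H_ROOT };

	model_needed = 0;
	assert(model_create(&sch, destroy) != 0);
	return model_needed;
}
```

In this model, model_demo(model_destroy_unchecked) leaves the counter at -1,
the analogue of the "jump label: negative count!" warning above, while
model_demo(model_destroy_checked) leaves it at 0.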

Thanks for reporting this! I should have run the reproducer anyway; I'll
run it before posting v5.

Thanks,
Peilin Ye