Re: [PATCH v5 net-next 0/3] use bulk reads for ocelot statistics

From: Colin Foster
Date: Tue Feb 08 2022 - 08:55:46 EST


On Tue, Feb 08, 2022 at 01:30:16PM +0000, Vladimir Oltean wrote:
> On Mon, Feb 07, 2022 at 08:46:41PM -0800, Colin Foster wrote:
> > Ocelot loops over memory regions to gather stats on different ports.
> > These regions are mostly continuous, and are ordered. This patch set
> > uses that information to break the stats reads into regions that can get
> > read in bulk.
> >
> > The motiviation is for general cleanup, but also for SPI. Performing two
> > back-to-back reads on a SPI bus require toggling the CS line, holding,
> > re-toggling the CS line, sending 3 address bytes, sending N padding
> > bytes, then actually performing the read. Bulk reads could reduce almost
> > all of that overhead, but require that the reads are performed via
> > regmap_bulk_read.
> >
> > v1 > v2: reword commit messages
> > v2 > v3: correctly mark this for net-next when sending
> > v3 > v4: calloc array instead of zalloc per review
> > v4 > v5:
> > Apply CR suggestions for whitespace
> > Fix calloc / zalloc mixup
> > Properly destroy workqueues
> > Add third commit to split long macros
> >
> >
> > Colin Foster (3):
> > net: ocelot: align macros for consistency
> > net: mscc: ocelot: add ability to perform bulk reads
> > net: mscc: ocelot: use bulk reads for stats
> >
> > drivers/net/ethernet/mscc/ocelot.c | 78 ++++++++++++++++++++++-----
> > drivers/net/ethernet/mscc/ocelot_io.c | 13 +++++
> > include/soc/mscc/ocelot.h | 57 ++++++++++++++------
> > 3 files changed, 120 insertions(+), 28 deletions(-)
> >
> > --
> > 2.25.1
> >
>
> Please do not merge these yet. I gave them a run on my board and the
> kernel crashed on boot.
>
> [ 8.043170] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 0
> [ 8.050241] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 1
> [ 8.057142] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 2
> [ 8.064021] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 3
> [ 8.128668] ------------[ cut here ]------------
> [ 8.133315] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
> [ 8.143107] Modules linked in:
> [ 8.146181] CPU: 1 PID: 44 Comm: kworker/u4:2 Not tainted 5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> [ 8.155355] Hardware name: LS1028A RDB Board (DT)
> [ 8.160079] Workqueue: events_unbound deferred_probe_work_func
> [ 8.165945] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 8.172940] pc : vsc9959_wm_enc+0x2c/0x40
> [ 8.176967] lr : ocelot_setup_sharing_watermarks+0x94/0x1fc
> [ 8.182568] sp : ffff800008863810
> [ 8.185896] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
> [ 8.193079] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
> [ 8.200261] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
> [ 8.207441] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> [ 8.214623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
> [ 8.221803] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
> [ 8.228984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce0f8
> [ 8.236165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
> [ 8.243345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000000
> [ 8.250525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
> [ 8.257706] Call trace:
> [ 8.260162] vsc9959_wm_enc+0x2c/0x40
> [ 8.263841] ocelot_devlink_sb_register+0x33c/0x380
> [ 8.268742] felix_setup+0x438/0x750
> [ 8.272334] dsa_register_switch+0x988/0x114c
> [ 8.276713] felix_pci_probe+0x138/0x1fc
> [ 8.280654] local_pci_probe+0x4c/0xc0
> [ 8.284423] pci_device_probe+0x1b0/0x1f0
> [ 8.288451] really_probe.part.0+0xa4/0x310
> [ 8.292654] __driver_probe_device+0xa0/0x150
> [ 8.297030] driver_probe_device+0xcc/0x164
> [ 8.301231] __device_attach_driver+0xc4/0x130
> [ 8.305695] bus_for_each_drv+0x84/0xe0
> [ 8.309547] __device_attach+0xe4/0x190
> [ 8.313400] device_initial_probe+0x20/0x30
> [ 8.317601] bus_probe_device+0xac/0xb4
> [ 8.321454] deferred_probe_work_func+0x98/0xd4
> [ 8.326004] process_one_work+0x294/0x700
> [ 8.330037] worker_thread+0x80/0x480
> [ 8.333717] kthread+0x10c/0x120
> [ 8.336961] ret_from_fork+0x10/0x20
> [ 8.340554] irq event stamp: 50432
> [ 8.343968] hardirqs last enabled at (50431): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0
> [ 8.353581] hardirqs last disabled at (50432): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
> [ 8.361448] softirqs last enabled at (50148): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
> [ 8.369923] softirqs last disabled at (50143): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
> [ 8.378577] ---[ end trace 0000000000000000 ]---
> [ 8.383304] ------------[ cut here ]------------
> [ 8.387942] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
> [ 8.397729] Modules linked in:
> [ 8.400800] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G W 5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> [ 8.411369] Hardware name: LS1028A RDB Board (DT)
> [ 8.416092] Workqueue: events_unbound deferred_probe_work_func
> [ 8.421955] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 8.428948] pc : vsc9959_wm_enc+0x2c/0x40
> [ 8.432975] lr : ocelot_setup_sharing_watermarks+0xc0/0x1fc
> [ 8.438573] sp : ffff800008863810
> [ 8.441900] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
> [ 8.449081] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
> [ 8.456262] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
> [ 8.463443] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> [ 8.470623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
> [ 8.477804] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
> [ 8.484984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce124
> [ 8.492165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
> [ 8.499345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000001
> [ 8.506525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
> [ 8.513705] Call trace:
> [ 8.516161] vsc9959_wm_enc+0x2c/0x40
> [ 8.519840] ocelot_devlink_sb_register+0x33c/0x380
> [ 8.524740] felix_setup+0x438/0x750
> [ 8.528331] dsa_register_switch+0x988/0x114c
> [ 8.532708] felix_pci_probe+0x138/0x1fc
> [ 8.536648] local_pci_probe+0x4c/0xc0
> [ 8.540415] pci_device_probe+0x1b0/0x1f0
> [ 8.544443] really_probe.part.0+0xa4/0x310
> [ 8.548646] __driver_probe_device+0xa0/0x150
> [ 8.553022] driver_probe_device+0xcc/0x164
> [ 8.557225] __device_attach_driver+0xc4/0x130
> [ 8.561688] bus_for_each_drv+0x84/0xe0
> [ 8.565540] __device_attach+0xe4/0x190
> [ 8.569393] device_initial_probe+0x20/0x30
> [ 8.573594] bus_probe_device+0xac/0xb4
> [ 8.577447] deferred_probe_work_func+0x98/0xd4
> [ 8.581997] process_one_work+0x294/0x700
> [ 8.586026] worker_thread+0x80/0x480
> [ 8.589706] kthread+0x10c/0x120
> [ 8.592949] ret_from_fork+0x10/0x20
> [ 8.596541] irq event stamp: 50450
> [ 8.599955] hardirqs last enabled at (50449): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0
> [ 8.609566] hardirqs last disabled at (50450): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
> [ 8.617431] softirqs last enabled at (50446): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
> [ 8.625907] softirqs last disabled at (50435): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
> [ 8.634559] ---[ end trace 0000000000000000 ]---
> [ 8.640586] device eth1 entered promiscuous mode
> [ 8.645499] Unable to handle kernel paging request at virtual address 00000010400020bc
> [ 8.653496] Mem abort info:
> [ 8.656340] ESR = 0x96000044
> [ 8.659413] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 8.664784] SET = 0, FnV = 0
> [ 8.667855] EA = 0, S1PTW = 0
> [ 8.671044] FSC = 0x04: level 0 translation fault
> [ 8.675979] Data abort info:
> [ 8.678875] ISV = 0, ISS = 0x00000044
> [ 8.682762] CM = 0, WnR = 1
> [ 8.685795] [00000010400020bc] user address but active_mm is swapper
> [ 8.692272] Internal error: Oops: 96000044 [#1] PREEMPT SMP
> [ 8.697865] Modules linked in:
> [ 8.700928] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G W 5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> [ 8.711490] Hardware name: LS1028A RDB Board (DT)
> [ 8.716206] Workqueue: events_unbound deferred_probe_work_func
> [ 8.722065] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 8.729051] pc : ocelot_phylink_mac_link_down+0x70/0x314
> [ 8.734381] lr : felix_phylink_mac_link_down+0x24/0x30
> [ 8.739536] sp : ffff800008863840
> [ 8.742856] x29: ffff800008863840 x28: 0000000000000000 x27: ffffd8e2af8ef180
> [ 8.750022] x26: 0000000000000001 x25: 0000001040002000 x24: 0000000000000000
> [ 8.757187] x23: 0000000000000180 x22: ffff2d1703a92000 x21: 0000000000000004
> [ 8.764352] x20: 0000000000000004 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> [ 8.771516] x17: 0000000000040006 x16: 0000000500030008 x15: ffffffffffffffff
> [ 8.778680] x14: ffffffffffff0000 x13: ffffffffffffffff x12: 0000000000000010
> [ 8.785845] x11: 0101010101010101 x10: 0000000000000004 x9 : ffffd8e2ad3d0814
> [ 8.793009] x8 : fefefefefeff6a6d x7 : 0000ffffffffffff x6 : 0000000000000000
> [ 8.800173] x5 : 00000000ffffffff x4 : 0000000000000001 x3 : 000000000d000007
> [ 8.807338] x2 : 0000000000000010 x1 : 0000000000000000 x0 : 0000001040002000
> [ 8.814502] Call trace:
> [ 8.816949] ocelot_phylink_mac_link_down+0x70/0x314
> [ 8.821929] felix_phylink_mac_link_down+0x24/0x30
> [ 8.826734] dsa_port_link_register_of+0xa8/0x240
> [ 8.831454] dsa_port_setup+0xb4/0x180
> [ 8.835212] dsa_register_switch+0xdb4/0x114c
> [ 8.839581] felix_pci_probe+0x138/0x1fc
> [ 8.843515] local_pci_probe+0x4c/0xc0
> [ 8.847275] pci_device_probe+0x1b0/0x1f0
> [ 8.851296] really_probe.part.0+0xa4/0x310
> [ 8.855490] __driver_probe_device+0xa0/0x150
> [ 8.859860] driver_probe_device+0xcc/0x164
> [ 8.864054] __device_attach_driver+0xc4/0x130
> [ 8.868510] bus_for_each_drv+0x84/0xe0
> [ 8.872355] __device_attach+0xe4/0x190
> [ 8.876200] device_initial_probe+0x20/0x30
> [ 8.880395] bus_probe_device+0xac/0xb4
> [ 8.884240] deferred_probe_work_func+0x98/0xd4
> [ 8.888783] process_one_work+0x294/0x700
> [ 8.892808] worker_thread+0x80/0x480
> [ 8.896480] kthread+0x10c/0x120
> [ 8.899715] ret_from_fork+0x10/0x20
> [ 8.903303] Code: 52800202 f874d839 52800001 aa1903e0 (b900bf25)
> [ 8.909417] ---[ end trace 0000000000000000 ]---
>
> Investigating...

I just did a sanity check on my latest tree and it doesn't crash. I'll
keep an eye out here as well.

Gives me an opportunity to fix up your other suggestion.

Thanks for reviewing and testing Vladimir!