Re: [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)

From: Oliver Hartkopp
Date: Mon Mar 10 2025 - 10:40:53 EST


Hi Vincent, Marc,

I sent a patch to be reviewed:
https://lore.kernel.org/linux-can/20250310143353.3242-1-socketcan@xxxxxxxxxxxx/T/#u

I've also tested this patch without any new issues.

Best regards,
Oliver

On 10.03.25 10:55, Vincent Mailhol wrote:
On Mon. 10 Mar 2025 at 18:46, Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote:
On 10.03.25 10:29, Vincent Mailhol wrote:
On Mon. 10 Mar 2025 at 03:59, Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote:

(...)

Isn't there some lock-less per-cpu safe statistic handling within netdev
we might pick for our use-case?

I see two solutions. Either we use lock_sock(skb->sk) and
release_sock(skb->sk) or we can change the types of
can_pkg_stats->tx_frames and can_pkg_stats->tx_frames_delta from long
to atomic_long_t.

The atomic_long_t is the closest thing to a lock-less solution. But my
preference goes to lock_sock(), which looks more natural in this
context. And lock_sock() is just a spinlock which under the hood is
also an atomic, so no big penalty either.

When we get skbs from the netdevice (and not from user space), we do not
have a valid sk value. It is set to zero.

See:
https://elixir.bootlin.com/linux/v6.13.6/source/net/can/raw.c#L203

And those skbs can also be forwarded by can-gw using can_send().

Therefore there is no lock_sock() without a valid sk ;-)

If 'atomic_long_t' also fixes this simple statistics handling, we
should use that.

I see, thanks for the explanation. Then atomic_long_t seems the best
(and easiest).
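
For illustration only, here is a minimal userspace sketch of the idea the
thread converges on: counters that several senders bump concurrently become
atomic, so no socket lock (and therefore no valid sk) is needed. It is not
the patch linked above. It uses C11 <stdatomic.h> and pthreads as a stand-in;
in the kernel the counterparts would be atomic_long_t with atomic_long_inc()
and atomic_long_read(). The names pkg_stats_sketch, sender() and
run_senders() are hypothetical and exist only for this example.

```c
/* Userspace analogue, NOT kernel code: C11 atomics stand in for the
 * kernel's atomic_long_t / atomic_long_inc() / atomic_long_read(). */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_SENDERS 16

/* Hypothetical stand-in for the two counters named in the thread. */
struct pkg_stats_sketch {
	atomic_long tx_frames;
	atomic_long tx_frames_delta;
};

static struct pkg_stats_sketch stats;

/* Each thread plays the role of one concurrent sender. */
static void *sender(void *p)
{
	long n = *(long *)p;

	for (long i = 0; i < n; i++) {
		/* Relaxed ordering is enough for plain statistics:
		 * only the counter itself must not lose updates. */
		atomic_fetch_add_explicit(&stats.tx_frames, 1,
					  memory_order_relaxed);
		atomic_fetch_add_explicit(&stats.tx_frames_delta, 1,
					  memory_order_relaxed);
	}
	return NULL;
}

/* Start nthreads senders, wait for them, return the final tx_frames. */
long run_senders(int nthreads, long frames_each)
{
	pthread_t tid[MAX_SENDERS];
	int started = 0;

	assert(nthreads > 0 && nthreads <= MAX_SENDERS);

	atomic_store(&stats.tx_frames, 0);
	atomic_store(&stats.tx_frames_delta, 0);

	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tid[i], NULL, sender, &frames_each) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	return atomic_load(&stats.tx_frames);
}
```

With plain long counters the same loop could lose increments when two
senders race, which is the kind of access KCSAN flags; with atomic
read-modify-write operations every increment is counted.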