Re: [PATCH net 1/2] net/mlx5e: psp: Fix invalid access on PSP dev registration fail
From: Cosmin Ratiu
Date: Tue Apr 21 2026 - 10:41:22 EST
On Tue, 2026-04-21 at 07:26 -0700, Jakub Kicinski wrote:
> On Tue, 21 Apr 2026 12:29:13 +0000 Cosmin Ratiu wrote:
> > > Sure but why are you leaving the priv->psp struct in place and
> > > whatever
> > > FS init has been done? IOW if you really want PSP init to not
> > > block
> > > probe why is mlx5e_psp_register() a void function rather than
> > > mlx5e_psp_init() ? Ignoring errors from psp_dev_create()
> > > makes no sense to me - what are you protecting from?
> > > kmalloc(GFP_KERNEL)
> > > failing?
> >
> > priv->psp and steering at the time of mlx5e_psp_register() is inert
> > without the PSP device. Cleaning it on psp_dev_create() failure
> > would
> > be weird, it's cleaned up anyway on netdev teardown. The fact that
> > only
> > memory allocations can fail inside psp_dev_create() is irrelevant
> > here.
> > psp_dev_create() failing shouldn't bring down the whole netdevice,
> > so
> > logging a message and continuing is ok (which is what is also done
> > for
> > macsec and ktls).
>
> This is a misguided cargo cult. Or something motivated by OOT
> compatibility. Alex D sometimes tries to do the same thing with Meta
> drivers. I don't get it. Of course we want the device to be
> operational
> if some *device* init fails. The compatibility matrix with all device
> generations and fw versions could justify that. But continuing init
> when a single-page kmalloc failed is pure silliness.
I am not sure about the wider context, but from the POV of the driver,
it's calling $thing from the kernel which can fail and it needs to do
something about it, either fail the entire netdev bringup or accept
that $thing won't be functional and continue without it. The driver
shouldn't need to know what $thing does inside and how it can fail,
which can change over time. Today it's a kmalloc(), tomorrow it may be
something else. It doesn't and shouldn't matter for the local decision
to continue or not without $thing working.
Isn't this reasonable?
>
> > mlx5e_psp_register() is void because it's called from
> > mlx5e_nic_enable() which can't fail, so it really can't do much
> > other
> > than complain to dmesg.
> >
> > But while thinking about this, I suppose we could change the entire
> > PSP
> > initialization to happen at the time of the current
> > mlx5e_psp_register(), and that would simplify the number of states.
> > I will do that in the next planned PSP series for net-next.
> >
> > Meanwhile, could you please take the 2nd patch and leave this one
> > out?
> > It should apply with no conflicts by itself.
> >
> > Or you would like to see a separate submission with the 2nd patch
> > alone?
>
> Please resubmit.