Re: [PATCH net 0/4] net: qede: avoid overruling error codes

From: Simon Horman
Date: Sat Apr 27 2024 - 13:36:35 EST


On Sat, Apr 27, 2024 at 12:58:38PM +0000, Asbjørn Sloth Tønnesen wrote:
> Hi Simon,
>
> Thank you for your review effort.
>
> On 4/27/24 11:48 AM, Simon Horman wrote:
> > On Fri, Apr 26, 2024 at 09:12:22AM +0000, Asbjørn Sloth Tønnesen wrote:
> > > This series fixes the qede driver, so that
> > > qede_parse_flow_attr() and it's subfunctions
> > > doesn't get their error codes overruled
> > > (ie. turning -EOPNOTSUPP into -EINVAL).
> > >
> > > ---
> > > I have two more patches along the same lines,
> > > but they are not yet causing any issues,
> > > so I have them destined for net-next.
> > > (those are for qede_flow_spec_validate_unused()
> > > and qede_flow_parse_ports().)
> > >
> > > After that I have a series for converting to
> > > extack + the final one for validating control
> > > flags.
> >
> > Hi,
> >
> > I'm fine with these patches so far as the code changes go.
> > But it is not clear to me that they are fixing a bug.
> >
> > If so, I think some explanation should go in the commit messages.
> > If not, I think these should be targeted at net-next
> > (and not have Fixes tags.
>
> Since I don't have the hardware I didn't try to construct commands, showing
> the wrong error code being returned. I could make up some hypothetical commands,
> and simulate how they would error. I assumed that the bug, was clear based on
> the list of possible return values for each function.
>
> As an example, in qede_parse_flow_attr() it validates dissector->used_keys,
> and if an unsupported FLOW_DISSECTOR_KEY_* is set, then ede_parse_flow_attr()
> returns -EOPNOTSUPP, which is returned to qede_add_tc_flower_fltr(),
> and only check for non-zero, and since -EOPNOTSUPP is non zero,
> then it returns -EINVAL. So if you try to match on a vlan tag,
> then FLOW_DISSECTOR_KEY_VLAN would be set, and cause a -EOPNOTSUPP
> to be returned, which then gets converted into a -EINVAL.
>
> All drivers generally returns -EOPNOTSUPP in their used_keys checks, and
> this driver clearly intended to do that as well.
>
> The -EINVAL override was introduced in the same commit as the above check,
> so it was broken from the start.
>
> Another example is 319a1d19471e (blamed in 4th patch), Jiri added
> a call to flow_action_basic_hw_stats_types_check() across multiple drivers,
> and since -EINVAL was returned only a few lines above, then he assumed
> that he could just return -EOPNOTSUPP, but that return value gets overruled
> into a -EINVAL. It is clear from the commit that Jiri intended to return
> -EOPNOTSUPP, but this part of the driver didn't follow the principle of
> least astonishment, so that function could only fail with -EINVAL.
>
> I think it's a bug, when another error code is returned than the one that
> was clearly intended, but it's properly a low impact one.

Thanks, now that you point this out I agree this should have been obvious
to me.

I agree that the patches resolve issues around -EOPNOTSUPP (and other error
values; that these errors are, in general, propagated to user-space; and
that especially in the case of -EOPNOTSUPP, this may effect the behaviour
of the system as it is intended to indicate that offload of an action is not
supported (at this time, for any reason).

> > Also, if you do end posting a v2, blamed, is misspelt several
> > times in commit messages.
>
> Sorry about that, will fix that if a v2 turns out to be needed.
>
> --
> Best regards
> Asbjørn Sloth Tønnesen
> Network Engineer
> Fiberby - AS42541
>