Re: [PATCH 1/7] dsa: marvell: Provide per device information about max frame size

From: Lukasz Majewski
Date: Fri Mar 10 2023 - 09:12:14 EST


Hi Vladimir,

> On Fri, Mar 10, 2023 at 02:17:19PM +0100, Lukasz Majewski wrote:
> > > > For example mv88e6185 supports max 1632 bytes, which is now
> > > > in-driver standard value.
> > >
> > > What is the criterion based on which 1632 is the "in-driver
> > > standard value"?
> >
> > It comes from the documentation I suppose. Moreover, this might be
> > the the "first" used value when set_max_mtu callback was
> > introduced.
>
> I'm not playing dumb, I just don't understand what is meant by
> "in-driver standard value". Is it the return value of
> mv88e6xxx_get_max_mtu() for the MV88E6185 chip?

The 1632 is a value from added early switch IC to this driver.

Then the get_max_mtu function was extended to support jumbo frames.

And the extension was based on having the .set_max_frame_size defined
in *_ops structure.

> Because I understood
> it to be somehow the value returned by default, for chips which don't
> have a way to change the MTU (because of the "standard" word).
>

Probably the "standard" shall be replaced above - it might be
misleading. "Default" would be better.

> > > > On the other hand - mv88e6250 supports 2048 bytes.
> > >
> > > What you mean to suggest here is that, using the current
> > > classification from mv88e6xxx_get_max_mtu(), mv88e6250 falls into
> > > the "none of the above" bucket, for which the driver returns 1522
> > > - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN // 1492. But it truly
> > > supports a maximum frame length of 2048, per your research.
> > >
> >
> > And this cannot be easily fixed.
> >
> > I could just provide callback to .set_max_frame_size in
> > mv88e6250_ops and the mv88e6250 would use 1632 bytes which would
> > prevent errors.
> >
> > However, when I dig deeper - it turned out that this value is
> > larger. And hence I wanted to do it in "right way".
>
> Correct, I'm not debating this. I'm just saying, as a reader of this
> patch set in linear order, that the justification is not obvious.
>
> > > I have also noticed that you have not acted upon my previous
> > > review comment:
> > > https://patchwork.kernel.org/project/netdevbpf/patch/20230106101651.1137755-1-lukma@xxxxxxx/
> > >
> > > | 1522 - 30 = 1492.
> > > |
> > > | I don't believe that there are switches which don't support the
> > > standard | MTU of 1500 ?!
> > > |
> > > | > .port_base_addr = 0x10,
> > > | > .phy_base_addr = 0x0,
> > > | > .global1_addr = 0x1b,
> > > |
> > > | Note that I see this behavior isn't new. But I've simulated it,
> > > and it | will produce the following messages on probe:
> > > |
> > > | [ 7.425752] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY
> > > [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> > > | [ 7.437516] mscc_felix 0000:00:00.5: nonfatal error -34 setting
> > > MTU to 1500 on port 0 | [ 7.588585] mscc_felix 0000:00:00.5
> > > swp1 (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE
> > > VSC8514 SyncE] (irq=POLL) | [ 7.600433] mscc_felix
> > > 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 1 |
> > > [ 7.752613] mscc_felix 0000:00:00.5 swp2 (uninitialized): PHY
> > > [0000:00:00.3:12] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> > > | [ 7.764457] mscc_felix 0000:00:00.5: nonfatal error -34
> > > setting MTU to 1500 on port 2 | [ 7.900771] mscc_felix
> > > 0000:00:00.5 swp3 (uninitialized): PHY [0000:00:00.3:13] driver
> > > [Microsemi GE VSC8514 SyncE] (irq=POLL) | [ 7.912501] mscc_felix
> > > 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 3 |
> > > | I wonder, shouldn't we first fix that, and apply this patch set
> > > afterwards?
> > >
> > > I guess I will have to fix this now, since you haven't done it.
> >
> > I do agree with Russel's reply here.
>
> It's possible that Russell might have slightly misunderstood my quoted
> reply here, because he said something about a PHY.
>
> > Moreover, as 6250 and 6220 also have max frame size equal to 2048
> > bytes, this would be fixed in v6 after getting the "validation"
> > function run.
>
> The problem with this kind of fix is that it should go to the "net"
> tree; it removes a user-visible warning that could have been avoided.
>
> OTOH, the kind of "fix" for 6250 and 6220 is different. It is
> sub-optimal use of hardware capabilities. That classifies as net-next
> material.
>
> The 2 go to different kernel branches.
>

As I said - v6 fixes it in the way which Russel proposed. I also do
like this approach, so to avoid "wasting effort" I would opt for having
this fix in this patchset.

(And I hope that we will finish work on it before MW closes).

> > This is the "patch 4" in the comment sent by Russel (to fix stuff
> > which is already broken, but it has been visible after running the
> > validation code):
> >
> > https://lists.openwall.net/netdev/2023/03/09/233
>
> I will get there.. eventually.




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@xxxxxxx

Attachment: pgprbmoLnMBYv.pgp
Description: OpenPGP digital signature