Re: PATCH: VLAN support for 3c59x/3c90x

From: Jeff Garzik
Date: Sat Jul 31 2004 - 11:20:39 EST


Matti Aarnio wrote:
On Sat, Jul 31, 2004 at 12:11:52PM +0200, Willy Tarreau wrote:
Ok, sorry, I've just checked, there are 6. But I incidentally used the feature
on 2 of them (dl2k and starfire). But more drivers still have the
'static int mtu=1500' preceded by a comment stating "allow the user to change
the mtu". Why is it not a #define then, if nobody can change it anymore?


In older kernels that was sufficient for setting it as a module parameter
at load time, I recall. Now a couple of additional macros are needed
to publish such parameters. APIs do change in the Linux kernel.

I have been pondering the usefulness of ->change_mtu
for this use. One of the bigger issues is, like Willy notes, that
the MTU change information is given to the driver after it is already
up and running, which then requires re-running setup magic that usually
needs a reset...

First, MTU change need not occur while the interface is up.

Second, modern hardware deals a lot better with MTU changes. Some only need a write into a register. Some need no reset at all, as long as you don't exceed the hardware limit.
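
As a rough illustration of the "no reset needed below the hardware limit" case, here is a minimal userspace sketch. The struct, the function name and the limit value are hypothetical stand-ins, not the kernel's struct net_device or any real driver's code; it only shows the bounds check that a ->change_mtu handler of this kind would perform before storing the new value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device; not the kernel structure. */
struct fake_netdev {
    int mtu;         /* current MTU in bytes */
    int hw_max_mtu;  /* assumed per-chip limit, not taken from any datasheet */
};

#define FAKE_MIN_MTU 68  /* smallest MTU IPv4 permits (RFC 791) */

/*
 * Accept the new MTU only if it lies within the assumed hardware range.
 * Nothing is reset: the value is simply recorded, which models hardware
 * that needs no reinitialisation below its limit.
 */
static int fake_change_mtu(struct fake_netdev *dev, int new_mtu)
{
    assert(dev != NULL);

    if (new_mtu < FAKE_MIN_MTU || new_mtu > dev->hw_max_mtu)
        return -EINVAL;

    dev->mtu = new_mtu;
    return 0;
}
```

A chip that does need a register write or a reset would do that work between the range check and the assignment.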


Also the Linux kernel isn't very happy with multi-path stacking
of layer-2 driver modules. A side-effect from all of these things might

Elaboration? The 2.6.x net drivers do proper refcounting, unlike a lot of other drivers.


To prevent that from happening, it is sufficient for the eth driver
not to shrink its MTU except via card shutdown -- but then, IFCONFIG
data for e.g. IP layer needs separation from the hardware driver layer.

In general ifconfig data should definitely -not- be separate from the driver. In particular changing MTU definitely needs to be tightly integrated with the driver.


For this IFCONFIG MTU issue, I would rather have the VLAN code ask
the underlying driver what MTU can be supported, than just blindly
presume that 1500 will be functional for e.g. eth0.2 (like it does now)

The VLAN code could certainly be updated to poke at the lower level driver MTU.
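
One way such a query could look, sketched in userspace C. Everything here is hypothetical: the struct, its field and the function are illustration only and do not reflect the actual 8021q code. The assumption is that the lower driver can report the largest payload it can carry after the 14-byte Ethernet header, and that the 4-byte 802.1Q tag must fit inside that payload.

```c
#include <assert.h>
#include <stddef.h>

#define VLAN_TAG_LEN 4  /* size of an 802.1Q tag in bytes */

/* Hypothetical capability record reported by the lower (real) device. */
struct lower_dev_caps {
    int max_payload;  /* largest payload after the Ethernet header, in bytes */
};

/*
 * Return the MTU a VLAN device stacked on 'lower' can actually offer:
 * the wanted value, capped by what is left once the tag is accounted for.
 */
static int vlan_usable_mtu(const struct lower_dev_caps *lower, int wanted)
{
    int limit;

    assert(lower != NULL);

    limit = lower->max_payload - VLAN_TAG_LEN;
    return wanted < limit ? wanted : limit;
}
```

With a lower device limited to 1500 bytes this yields 1496 for eth0.2 rather than presuming 1500; a device that accepts 1504 bytes yields the full 1500.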


For VLAN support you definitely want to let the user increase the size above 1500, and for that you need ->change_mtu.

I agree, but my point was that adding MODULE_PARM was only a one liner and
would have done the job too. But since everyone prefers a change_mtu(), I'll
do it.

Jeff, do you know the absolute hardware limit on the tulip? I've seen the
limitation to PKT_BUF_SZ (1536), but I don't know for example if the
hardware stores the FCS in the buffer or not, nor if the IP headers risk
being aligned or not (which would consume 2 more bytes).
Or does 1536 - 14 (ethernet) - 2 (iphdr alignment) - 4 (FCS) = 1516 seem a
reasonable conservative upper bound?
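
The arithmetic above can be written out as a small C sketch. Only PKT_BUF_SZ (1536) comes from the message; the other constant names and the helper function are made up for illustration. It assumes the worst case where the buffer must also hold the FCS and 2 bytes of IP header alignment padding.

```c
#include <assert.h>

enum {
    PKT_BUF_SZ   = 1536, /* receive buffer size quoted in the message */
    ETH_HDR_LEN  = 14,   /* Ethernet header */
    IP_ALIGN_PAD = 2,    /* padding so the IP header is 4-byte aligned */
    FCS_LEN      = 4     /* frame check sequence, if stored in the buffer */
};

/* Worst-case payload that still fits in a single buffer of 'buf_size'. */
static int conservative_max_mtu(int buf_size)
{
    assert(buf_size > ETH_HDR_LEN + IP_ALIGN_PAD + FCS_LEN);
    return buf_size - ETH_HDR_LEN - IP_ALIGN_PAD - FCS_LEN;
}
```

For PKT_BUF_SZ this gives 1516, which leaves room for a 4-byte VLAN tag on top of a 1500-byte payload.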


The Tulip (21143 at least) can do chained block receive; if the first memory
block is too short, it can continue into the next one. This way maximum frame

Yes, but receiving packets not wholly contained in a single frame is SO sub-optimal that it is to be avoided at all costs.

Maybe when receive scatter-gather is fully supported this can change, but for now the driver should not be returning multi-frag frames to the kernel.


size is at least 2560 bytes. For transmit the Jabber timer seems to
trip at 2560, including preambles and CRCs. Also, there is a receive
watchdog that is guaranteed to pass 2048 byte frames and to time out at
2560 byte frames. (When the watchdog is not disabled, that is.) See
CSR15<4>. For transmit the Jabber clock bites at 2048-2560 bytes,
OR at a timer of 2.6-3.3 ms (at 100 Mbps), which means at least 32 000 bytes.
( CSR15<2> )

In the receive descriptors there might appear a TL bit (Frame Too Long),
which is just telling that frame size exceeds 1518 bytes.
If RW (Receive Watchdog; RDES0<4>) has tripped, then the frame is at
least 2048 bytes long, most likely longer than 2560 bytes.

Based on my reading of ds21143hrm.pdf (copy of which I have), I do
think it is safe to just receive larger frames with Tulip, and IGNORE
the "TL" bit.
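
A minimal sketch of that policy in userspace C, for illustration only. The RW position (RDES0<4>) is taken from the text above; the TL bit position used here (bit 7) is an assumption and should be checked against the 21143 manual. The descriptor struct and function name are hypothetical, and every other status bit a real driver would test is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RDES0_RW (1u << 4)  /* Receive Watchdog, RDES0<4> as stated above */
#define RDES0_TL (1u << 7)  /* Frame Too Long; bit position is an ASSUMPTION */

/* Hypothetical, cut-down receive descriptor: status word only. */
struct fake_rx_desc {
    uint32_t status;  /* RDES0 */
};

/*
 * Decide whether an oversized frame may be passed up.
 * TL is deliberately ignored (it only says "longer than 1518 bytes");
 * a tripped receive watchdog still rejects the frame.
 */
static int rx_frame_acceptable(const struct fake_rx_desc *desc)
{
    assert(desc != NULL);

    if (desc->status & RDES0_RW)
        return 0;  /* watchdog tripped: frame was cut off */

    return 1;      /* TL set or not, accept */
}
```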

That covers one of seven or eight tulip chips driven by the driver.

Once you exceed the ethernet norm there are tons of chip-specific quirks and details to deal with. In addition to the details you mention, the on-chip FIFO sizes and behaviors become important. As does the multi-fragment frame issue. Some chips with checksumming features only work when the MTU is less than an unknown magic number (less than you would think, but higher than 1500).

All these reasons are why I want to dive into the 3c59x documentation, and also do some testing on older models, before we merge Alan's patch from $subject.

Jeff

