Re: kernel > 2.1.36 & nfs

Linus Torvalds (torvalds@transmeta.com)
Tue, 3 Jun 1997 16:04:11 -0700 (PDT)


On Tue, 3 Jun 1997, Alan Cox wrote:
> > The point is that the copy is _never_ needed - we'd be better off just
> > leaving the packets fragmented, and let the higher level protocols (tcp
> > and udp) take care of "reassembly" (ie copy them to user mode or to the
> > page cache and only _then_ do we make the DATA contiguous - never the
> > actual packet itself).
>
> Second point. We can't make a 65280 byte HIPPI NFS UDP packet contiguous
> without fixing the underlying page allocator (preferred) or doing vm games
> as I suggested below.

Large non-fragmented packets are a totally different problem, and there
are various ways to handle them. Almost _all_ the ways are 100% preferable
to stupid VM games.

Just off the top of my head I'll give you two totally different approaches
which both work fine and avoid the VM games:

- simple approach:
Have the driver keep an internal buffer of say 10-20 full packets
worth. If you are using HIPPI and expect to have good performance,
you can easily spare one megabyte of RAM just for temporary
network packets.

If the system cannot keep up and empty the (shortish) packet queue
quickly enough, we start dropping packets, but we have to do that
at some point anyway.

- slightly more complex approach:
Have the driver create <4kB fragments and make a linked list of
them. This is essentially the same as the VM approach, except we
do it in software, which is going to be a lot simpler. We may
require that the headers be completely in the first packet, but
that isn't going to be much of a problem.
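
As a rough illustration of the simple approach, here is a user-space
sketch of a fixed pool of full-size packet buffers that drops on
overflow. It is not code from any real driver: the names pkt_pool,
pool_rx, pool_peek and pool_release are invented for this example, and
the slot count of 16 is just an assumed value in the 10-20 range. With
65280-byte packets, 16 slots come to 1,044,480 bytes of packet data,
i.e. roughly the one megabyte mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PKT_MAX    65280  /* largest packet size mentioned in the thread */
#define POOL_SLOTS 16     /* assumed value, within the 10-20 range */

struct pkt_slot {
    size_t len;
    unsigned char data[PKT_MAX];
};

/* Hypothetical driver-private pool: a ring of full-size packet slots. */
struct pkt_pool {
    struct pkt_slot slot[POOL_SLOTS];
    unsigned head;          /* next slot the driver fills */
    unsigned tail;          /* oldest slot not yet consumed */
    unsigned count;         /* slots currently in use */
    unsigned long dropped;  /* packets dropped because the pool was full */
};

static void pool_init(struct pkt_pool *p)
{
    p->head = p->tail = p->count = 0;
    p->dropped = 0;
}

/* Receive path: store one packet, or drop it if no slot is free.
 * Returns 0 on success, -1 if the packet was dropped. */
static int pool_rx(struct pkt_pool *p, const void *buf, size_t len)
{
    if (len > PKT_MAX || p->count == POOL_SLOTS) {
        p->dropped++;
        return -1;
    }
    struct pkt_slot *s = &p->slot[p->head];
    memcpy(s->data, buf, len);
    s->len = len;
    p->head = (p->head + 1) % POOL_SLOTS;
    p->count++;
    return 0;
}

/* Consumer side: look at the oldest packet (NULL if the pool is empty). */
static const struct pkt_slot *pool_peek(const struct pkt_pool *p)
{
    return p->count ? &p->slot[p->tail] : NULL;
}

/* Consumer side: give the oldest slot back to the driver. */
static void pool_release(struct pkt_pool *p)
{
    assert(p->count > 0);
    p->tail = (p->tail + 1) % POOL_SLOTS;
    p->count--;
}
```

The pool is about a megabyte, so it should live in static or heap
storage rather than on the stack. The upper layers would copy the data
out of the slot returned by pool_peek() (to user mode or the page cache)
and then call pool_release().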

The advantage of the slightly more complex approach is that if it is done
right we can essentially use the same linked list code for _real_
fragmented packets as for "pseudo-fragmented" packets that are just the
side effect of memory management. That way we could handle real fragments
without coalescing them at all (except for maybe coalescing the first few
really small fragments to make sure we have the headers intact).
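
A minimal user-space sketch of the linked-list idea follows, under
stated assumptions: a 4096-byte page, malloc() standing in for whatever
page-sized allocation a driver would really use, and invented names
(struct frag, frag_chain_build, frag_chain_len, frag_chain_copy,
frag_chain_free). It only shows that the packet can stay in page-sized
pieces and that the data is made contiguous once, when it is copied to
its final destination. The same chain walk would apply whether the
pieces came from real fragmentation or from the memory-management split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 4096  /* assumed page size */
#define FRAG_DATA  (PAGE_BYTES - sizeof(void *) - sizeof(size_t))

/* One fragment; header plus payload are sized to fit in a single page. */
struct frag {
    struct frag *next;
    size_t len;
    unsigned char data[FRAG_DATA];
};

static void frag_chain_free(struct frag *f)
{
    while (f) {
        struct frag *next = f->next;
        free(f);
        f = next;
    }
}

/* Stand-in for the driver: split an incoming packet into a chain of
 * page-sized fragments. Returns NULL on allocation failure (and also
 * for a zero-length packet). */
static struct frag *frag_chain_build(const void *pkt, size_t len)
{
    const unsigned char *src = pkt;
    struct frag *head = NULL, **link = &head;

    assert(pkt != NULL || len == 0);
    while (len > 0) {
        size_t n = len < FRAG_DATA ? len : FRAG_DATA;
        struct frag *f = malloc(sizeof *f);
        if (!f) {
            frag_chain_free(head);
            return NULL;
        }
        f->next = NULL;
        f->len = n;
        memcpy(f->data, src, n);
        *link = f;          /* append to the tail of the chain */
        link = &f->next;
        src += n;
        len -= n;
    }
    return head;
}

/* Total payload length of a chain. */
static size_t frag_chain_len(const struct frag *f)
{
    size_t total = 0;
    for (; f; f = f->next)
        total += f->len;
    return total;
}

/* The only point where the data becomes contiguous: gather the chain
 * into the destination buffer. Returns the number of bytes copied, or 0
 * if the destination is too small. */
static size_t frag_chain_copy(const struct frag *f, void *dst, size_t dstlen)
{
    unsigned char *out = dst;
    size_t done = 0;

    for (; f; f = f->next) {
        if (f->len > dstlen - done)
            return 0;
        memcpy(out + done, f->data, f->len);
        done += f->len;
    }
    return done;
}
```

A caller would build the chain on receive, hand it up the stack as-is,
and call frag_chain_copy() only when the payload reaches its final
buffer, followed by frag_chain_free().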

People who think that VM games are faster than a simple linked list need
to go out and do the math - it is infinitely more desirable to do the
memory management by hand both from a flexibility AND a speed standpoint.

Linus