Re: kernel > 2.1.36 & nfs

Linus Torvalds (torvalds@transmeta.com)
Tue, 3 Jun 1997 09:07:28 -0700 (PDT)


On Tue, 3 Jun 1997, Alan Cox wrote:
>
> > This re-assembly simplifies some code, but it not only has a bad impact
> > on memory management, it also involves a "useless" copy operation.
>
> That "useless" operation is somewhat temporary - it's only useless because
> we don't reassemble and checksum in one pass. Also ATM already and HIPPI
> very soon will be giving us 65280 byte frames that are not fragmented.

The point is that the copy is _never_ needed - we'd be better off just
leaving the packets fragmented, and let the higher level protocols (tcp
and udp) take care of "reassembly" (i.e. copy them to user mode or to the
page cache and only _then_ do we make the DATA contiguous - never the
actual packet itself).
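
As a rough illustration of "leave the packets fragmented and only make the
DATA contiguous at copy time", here is a minimal user-space sketch in C. The
names struct frag and frag_copy_to_dest are hypothetical and are not taken
from the kernel source; the sketch only assumes that each fragment records
its byte offset within the datagram.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one received fragment's payload. */
struct frag {
    size_t offset;              /* byte offset of this payload in the datagram */
    size_t len;                 /* payload length in bytes */
    const unsigned char *data;  /* payload, left wherever it was received */
    struct frag *next;          /* next fragment; list order does not matter */
};

/*
 * Copy each fragment straight to its final position in dst (standing in
 * for a user buffer or a page-cache page). No intermediate contiguous
 * packet is ever built. Returns the number of bytes copied, or -1 if a
 * fragment would not fit in dst.
 */
long frag_copy_to_dest(const struct frag *f, unsigned char *dst, size_t dst_len)
{
    long total = 0;

    for (; f != NULL; f = f->next) {
        if (f->offset > dst_len || f->len > dst_len - f->offset)
            return -1;
        memcpy(dst + f->offset, f->data, f->len);
        total += (long)f->len;
    }
    return total;
}
```

In this sketch the only copy is the one into the destination, which has to
happen anyway; overlapping or missing fragments are not detected.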

> What is really needed is something like
>
>
> 	buffer = vreserve(65536);		/* Allocate 64K of address space */
>
> 	err = vfill(buffer, len, GFP_..);	/* Put pages in where needed */
> 	if (err == -ENOMEM)			/* No pages */
> 		vfree(buffer);

Nope. I do not believe that it is a good idea to use virtual memory for
stuff like this. SVR4 uses virtual memory for their page cache, and they
only gain complexity from it.

Using virtual memory is also extremely fragile with SMP. And packet
reception is definitely a multi-CPU issue (one CPU takes the actual
interrupt and creates the packet, but it's very possible that the packet
will be used on another CPU).

We don't handle SMP TLBs 100% correctly even for the _normal_ cases right
now - don't tell me it would be simple to fix the problems when you have
interrupt driven events that touch the TLB too. That's where I just say
"NO" (you can try to convince me, but you'll have a hard and rocky path
doing that).

For large packets (not fragments) we'll have to look into something else,
and it's possible that we could have some kind of "fragment list" for it.
That wouldn't be too hard, if only the upper layers knew of the
possibility that the packet (not the header) would be in multiple parts.
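
One possible shape for such a "fragment list" is sketched below in C. The
names struct pkt, struct pkt_part, pkt_payload_len and pkt_read are
hypothetical, as is the 64-byte header field; this is only an illustration
of a packet whose header is contiguous while its payload sits in several
parts, not an existing kernel interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: one contiguous piece of the payload. */
struct pkt_part {
    const unsigned char *data;
    size_t len;
    struct pkt_part *next;      /* next piece, in payload order */
};

/* Hypothetical: the header stays contiguous, the payload may not be. */
struct pkt {
    unsigned char header[64];
    size_t header_len;
    struct pkt_part *parts;
};

/* Total payload length across all parts. */
size_t pkt_payload_len(const struct pkt *p)
{
    size_t total = 0;
    const struct pkt_part *part;

    for (part = p->parts; part != NULL; part = part->next)
        total += part->len;
    return total;
}

/*
 * Copy up to n payload bytes, starting at payload offset off, into dst.
 * This is what an upper layer would call instead of assuming a flat
 * buffer. Returns the number of bytes actually copied.
 */
size_t pkt_read(const struct pkt *p, size_t off, unsigned char *dst, size_t n)
{
    size_t copied = 0;
    const struct pkt_part *part;

    for (part = p->parts; part != NULL && copied < n; part = part->next) {
        size_t chunk;

        if (off >= part->len) {     /* requested range starts later */
            off -= part->len;
            continue;
        }
        chunk = part->len - off;
        if (chunk > n - copied)
            chunk = n - copied;
        memcpy(dst + copied, part->data + off, chunk);
        copied += chunk;
        off = 0;                    /* later parts are read from their start */
    }
    return copied;
}
```

The upper layers only ever go through pkt_read here, so they never need to
know how many parts the payload is actually in.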

> The big problem we hit is that we need to rewrite a pile of DMA driven network
> drivers to do scatter gather or at least to spot problems in buffers they
> intend to DMA. Generally speaking it will be fine - something like

Hah. That's only _one_ of many problems with using virtual memory. Forget
the idea of virtual memory - you're just setting yourself up for more
problems than you really want.

Linus