Re: [ofa-general] Re: Demand paging for memory regions

From: Patrick Geoffray
Date: Tue Feb 12 2008 - 22:59:27 EST


Jason,

Jason Gunthorpe wrote:
I don't know much about Quadrics, but I would be hesitant to lump it
in too much with these RDMA semantics. Christian's comments sound like
they operate closer to what you described and that is why the have an
existing patch set. I don't know :)

The Quadrics folks have been doing RDMA for 10 years, there is a reason why they maintained a patch.

What it boils down to is that to implement true removal of pages in a
general way the kernel and HCA must either drop packets or stall
incoming packets, both are big performance problems - and I can't see
many users wanting this. Enterprise style people using SCSI, NFS, etc
already have short pin periods and HPC MPI users probably won't care
about the VM issues enough to warrent the performance overhead.

This is not true, HPC people do care about the VM issues a lot. Memory registration (pinning and translating) is usually too expensive to be performed in the critical path before and after each send or receive. So they factor it out by registering a buffer the first time it is used, and keeping it registered in a registration cache. However, the application may free() a buffer that is in the registration cache, so HPC people provide their own malloc to catch free(). They also try to catch sbrk() and munmap() to deregister memory before it is released to the OS. This is a Major pain that a VM notifier would easily solve. Being able to swap registered pages to disk or migrate them in a NUMA system is a welcome bonus.

Patrick
--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/