Re: ANNOUCE: OpenIB InfiniBand software

From: Roland Dreier
Date: Fri Mar 12 2004 - 10:23:32 EST


Troy> Since the openib.org mailing lists don't seem to be alive
Troy> yet, I'll bring this up here..

Troy> Can we get this split out into the following components?
Troy> * kernel level infiniband access layer
Troy> * lowlevel hardware driver (aka mellanox driver)
Troy> * all other 'upper layer protocols'

Right now, in the infiniband-kernel package, the Mellanox driver is
under drivers/infiniband/hw/mellanox-hca, the 'upper layer protocols'
are under drivers/infiniband/ulp, and the 'access layer' is everything
else in drivers/infiniband. I'm not sure it's worth rolling three
packages for that split right now.

Troy> All the existing code has big problems with how it
Troy> interfaces to the kernel memory management.. it doesn't use
Troy> the regular interfaces like get_user_pages, pci_map_page,
Troy> and pci_map_sq, and goes and looks at the guts of the
Troy> pagetables directly.

Troy> This is either a problem with the kernel interfaces made
Troy> available, or a design issue with the existing
Troy> codebases. Can someone please tell me which it is?

I would say both. Certainly the HCA driver code could have been
written much more cleanly. However, it's not clear how the right way
to handle the memory management requirements of kernel bypass/RDMA
protocols, especially since the driver tries to support kernels all
the way back to the ancient 2.4.9 used by Red Hat AS 2.1 (which
doesn't even have get_user_pages()).

In any case, it's why I put the following items in the TODO list
of my announcement:

* Use DMA mapping API so we work properly with non-coherent
caches, IOMMUs etc. Add a function to get struct device * for an
IB device so ULPs can call sync functions (embed struct device
in struct ib_device?). (How do we handle systems with limited
DMA mapping capacity?)

* In Mellanox HCA driver, clean up mosal. Some obvious
targets (there's lots more though):
o Grunging through kernel code to find the mlock/munlock
pointers especially has to go (mosal_mlock.c). We need to
get VM experts from LKML to tell us how to handle memory
translation and locking.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/