Re: [PATCH v2] staging/lustre: lloop depends on BLOCK

From: Dilger, Andreas
Date: Wed Aug 07 2013 - 03:45:46 EST


On 2013/08/02 4:49 AM, "Christoph Hellwig" <hch@xxxxxxxxxxxxx> wrote:
>On Thu, Aug 01, 2013 at 07:57:22PM +0000, Dilger, Andreas wrote:
>> It provides significant performance improvement for network IO on
>> Lustre. It bypasses DLM locking in Lustre and the VFS layer on the
>> client, copying in the loop driver, and page-by-page IO submission in
>> the normal IO path.
>
>Part of being upstream is improving existing drivers instead of copy and
>pasting them. Please take a look at Shaggy's in-kernel direct I/O and
>loop improvements and submit any incremental improvements on top of that
>one.

The problem still remains that the kernel loop driver eventually depends on
a local block device for the pages/bios to be written. The Lustre lloop
driver bypasses the VFS and block layer to generate RPCs from the submitted
pages to RDMA over the network without a data copy.

I wouldn't think that anyone would want a Lustre-specific RPC engine in the
standard loop.c file, nor would it be practical due to symbol dependencies.

I could imagine being able to register do_bio_lustrebacked() as a BIO
submission routine instead of do_bio_filebacked(), along with some private
data to link the loop file to the underlying storage (in Lustre's case an
object layout and a preallocated I for generating the RPC).
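
To make the idea concrete, here is a rough userspace sketch of a loop
device that carries a pluggable submission routine plus private data. Every
name in it (sketch_bio, sketch_loop_dev, sketch_loop_register, and so on) is
hypothetical and only stands in for the kernel structures; it is not the
loop.c or lloop code, and the "RPC" backend merely counts requests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct bio: one buffer at a device offset. */
struct sketch_bio {
    size_t offset;
    size_t len;
    const char *data;
};

struct sketch_loop_dev;

/* Backend hook, playing the role of do_bio_filebacked() and friends. */
typedef int (*sketch_submit_fn)(struct sketch_loop_dev *lo,
                                const struct sketch_bio *bio);

struct sketch_loop_dev {
    sketch_submit_fn submit;  /* chosen when the device is set up */
    void *private_data;       /* backend-specific state */
};

/* File-like backend: copies the data into a local backing store. */
struct sketch_file_backing {
    char store[64];
};

static int sketch_submit_filebacked(struct sketch_loop_dev *lo,
                                    const struct sketch_bio *bio)
{
    struct sketch_file_backing *fb = lo->private_data;

    if (bio->offset > sizeof(fb->store) ||
        bio->len > sizeof(fb->store) - bio->offset)
        return -1;
    memcpy(fb->store + bio->offset, bio->data, bio->len);
    return 0;
}

/* Network-like backend: queues the pages for an RPC without copying. */
struct sketch_rpc_backing {
    unsigned int rpcs_sent;
    size_t bytes_queued;
};

static int sketch_submit_rpcbacked(struct sketch_loop_dev *lo,
                                   const struct sketch_bio *bio)
{
    struct sketch_rpc_backing *rb = lo->private_data;

    rb->rpcs_sent++;
    rb->bytes_queued += bio->len;
    return 0;
}

/* Attach a submission routine and its private data to the device. */
static void sketch_loop_register(struct sketch_loop_dev *lo,
                                 sketch_submit_fn fn, void *priv)
{
    assert(lo != NULL && fn != NULL);
    lo->submit = fn;
    lo->private_data = priv;
}

/* The generic loop code only ever calls through the hook. */
static int sketch_loop_handle_bio(struct sketch_loop_dev *lo,
                                  const struct sketch_bio *bio)
{
    return lo->submit(lo, bio);
}
```

The point of the shape is that the generic path never needs to know about
Lustre symbols; only the registered routine and its private data do.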

How to register this from userspace compared to a normal file-backed loop
might be a bit tricky. Lustre uses its own ioctls to register/deregister
the device, though I could imagine some kind of per-file(system) method
table for specifying the underlying bio submission routine.
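
A per-filesystem method table along those lines might look roughly like the
following userspace sketch. The table, the registration call, and the lookup
by filesystem name are all assumptions made for illustration; nothing like
sketch_register_backing_ops() exists in the kernel loop driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical submission hook: returns the number of bytes accepted. */
typedef size_t (*sketch_bio_submit_fn)(void *backing, const void *buf,
                                       size_t len);

/* One entry per filesystem type that wants its own submission path. */
struct sketch_loop_backing_ops {
    const char *fs_name;
    sketch_bio_submit_fn submit_bio;
};

#define SKETCH_MAX_OPS 8

static const struct sketch_loop_backing_ops *sketch_ops_table[SKETCH_MAX_OPS];
static size_t sketch_ops_count;

/* Default path, used when the filesystem registered nothing special. */
static size_t sketch_submit_generic(void *backing, const void *buf,
                                    size_t len)
{
    (void)backing;
    (void)buf;
    return len;
}

/* Placeholder for a filesystem-specific path such as an RPC engine. */
static size_t sketch_submit_lustre(void *backing, const void *buf,
                                   size_t len)
{
    (void)backing;
    (void)buf;
    return len;
}

/* Called by a filesystem module to publish its submission routine. */
static int sketch_register_backing_ops(const struct sketch_loop_backing_ops *ops)
{
    assert(ops != NULL && ops->fs_name != NULL && ops->submit_bio != NULL);
    if (sketch_ops_count == SKETCH_MAX_OPS)
        return -1;
    sketch_ops_table[sketch_ops_count++] = ops;
    return 0;
}

/* Called at loop setup time with the backing file's filesystem name. */
static sketch_bio_submit_fn sketch_lookup_submit(const char *fs_name,
                                                 sketch_bio_submit_fn fallback)
{
    size_t i;

    for (i = 0; i < sketch_ops_count; i++) {
        if (strcmp(sketch_ops_table[i]->fs_name, fs_name) == 0)
            return sketch_ops_table[i]->submit_bio;
    }
    return fallback;
}
```

With a table like this the existing loop setup interface could stay
unchanged from userspace, since the routine would be picked from the backing
file's filesystem rather than from a separate ioctl.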

In any case, rewriting the lloop/loop driver to merge this functionality
is not high on the priority list compared to other Lustre code that needs
to be cleaned up before it can be merged into the main kernel tree. Can we
leave this code as-is for now, and decide whether to rewrite or remove it
when we are closer to having the rest of the code cleaned up?

Cheers, Andreas
--
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division

