Re: Integration of SCST in the mainstream Linux kernel

From: Jeff Garzik
Date: Mon Feb 04 2008 - 17:58:24 EST


Linus Torvalds wrote:
So no, performance is not the only reason to move to kernel space. It can easily be things like needing direct access to internal data queues (for a iSCSI target, this could be things like barriers or just tagged commands - yes, you can probably emulate things like that without access to the actual IO queues, but are you sure the semantics will be entirely right?

The kernel/userland boundary is not just a performance boundary, it's an abstraction boundary too, and these kinds of protocols tend to break abstractions. NFS broke it by having "file handles" (which is not something that really exists in user space, and is almost impossible to emulate correctly), and I bet the same thing happens when emulating a SCSI target in user space.

Well, speaking as a complete nutter who just finished the bare bones of an NFSv4 userland server[1]... it depends on your approach.

If the userland server is the _only_ one accessing the data[2] -- i.e. the database server model where ls(1) shows a couple multi-gigabyte files or a raw partition -- then it's easy to get all the semantics right, including file handles. You're not racing with local kernel fileserving.

Couple that with sendfile(2), sync_file_range(2) and a few other Linux-specific syscalls, and you've got an efficient NFS file server.

It becomes a solution similar to Apache or MySQL or Oracle.


I quite grant there are many good reasons to do NFS or iSCSI data path in the kernel... my point is more that "impossible" is just from one point of view ;-)


Maybe not. I _rally_ haven't looked into iSCSI, I'm just guessing there would be things like ordering issues.

iSCSI and NBD were passe ideas at birth. :)

Networked block devices are attractive because the concepts and implementation are more simple than networked filesystems... but usually you want to run some sort of filesystem on top. At that point you might as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your networked block device (and associated complexity).

iSCSI is barely useful, because at least someone finally standardized SCSI over LAN/WAN.

But you just don't need its complexity if your filesystem must have its own authentication, distributed coordination, multiple-connection management code of its own.

Jeff


P.S. Clearly my NFSv4 server is NOT intended to replace the kernel one. It's more for experiments, and doing FUSE-like filesystem work.


[1] http://linux.yyz.us/projects/nfsv4.html

[2] well, outside of dd(1) and similar tricks... the same "going around its back" tricks that can screw up a mounted filesystem.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/