Re: [RFC, PATCH] Extensible AIO interface
From: Jeff Moyer
Date: Tue Oct 02 2012 - 13:41:53 EST

Kent Overstreet <koverstreet@xxxxxxxxxx> writes:
> So, I and other people keep running into things where we really need to
> add an interface to pass some auxiliary... stuff along with a pread() or
> pwrite().
>
> A few examples:
>
> * IO scheduler hints. Some userspace program wants to, per IO, specify
> either priorities or a cgroup - by specifying a cgroup you can have a
> fileserver in userspace that makes use of cfq's per cgroup bandwidth
> quotas.
You can do this today by splitting I/O between processes and placing
those processes in different cgroups. For I/O priority, there is
ioprio_set(), which costs an extra system call per priority change but
does work. Not elegant, but possible.
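
A rough sketch of the ioprio_set route, for illustration only. The
helper name pread_with_ioprio() is made up here, and the IOPRIO_*
macros are copied by hand from the kernel's definitions because glibc
provides no wrapper for this system call:

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Mirrors the kernel's ioprio encoding: class in the top bits, level below. */
#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_CLASS_BE    2   /* best-effort scheduling class */
#define IOPRIO_WHO_PROCESS 1   /* "who" names a single process or thread */
#define IOPRIO_PRIO_VALUE(cls, data) (((cls) << IOPRIO_CLASS_SHIFT) | (data))

/*
 * Hypothetical helper: set the calling thread's I/O priority to the given
 * best-effort level (0 = highest, 7 = lowest), then issue the read.
 * Returns -1 if either the priority change or the read fails.
 */
static ssize_t pread_with_ioprio(int fd, void *buf, size_t len,
                                 off_t off, int level)
{
    int prio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, level);

    /* who == 0 means "the calling thread"; this is the extra syscall. */
    if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0, prio) < 0)
        return -1;

    return pread(fd, buf, len, off);
}
```

The priority sticks to the calling thread until it is changed again, so
a true per-I/O priority means one ioprio_set call in front of every
read or write that needs a different level.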
> * Cache hints. For bcache and other things, userspace may want to specify
> "this data should be cached", "this data should bypass the cache", etc.
Please explain how you will differentiate this from posix_fadvise.
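
For reference, the existing hint interface looks roughly like this.
The two wrapper names are invented for the example; the only real
calls are posix_fadvise() and its POSIX_FADV_* advice values. Note
that the advice applies to a byte range of an open file, not to an
individual read or write:

```c
#define _POSIX_C_SOURCE 200112L
#include <sys/types.h>
#include <fcntl.h>

/*
 * Hypothetical wrapper: tell the kernel the range will be needed soon.
 * Returns 0 on success or an error number (posix_fadvise does not use errno).
 */
static int hint_will_need(int fd, off_t off, off_t len)
{
    return posix_fadvise(fd, off, len, POSIX_FADV_WILLNEED);
}

/*
 * Hypothetical wrapper: tell the kernel the range will not be needed
 * again in the near future.
 */
static int hint_dont_need(int fd, off_t off, off_t len)
{
    return posix_fadvise(fd, off, len, POSIX_FADV_DONTNEED);
}
```

A len of 0 means "from off to the end of the file".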
> * Passing checksums out to userspace. We've got bio integrity, which is
> a (somewhat) generic interface for passing data checksums between the
> filesystem and the hardware. There are various circumstances under which
> you may want to pass these checksums out to userspace, and if so we
> ought to have a generic way of doing it.
Yes, that needs a new interface.
> Hence, AIO attributes.
*No.* Start with the non-AIO case first.
> * FUTURE STUFF:
>
> Return values:
>
> Some attributes are probably going to want to return something to
> userspace.
>
> If nothing else, we want this so that userspace can tell if anything
> handled the attributes it specified - as dynamic as the io stack can be,
> with something extensible like this there really isn't any generic way
> of knowing ahead of time if something is going to interpret any
> attribute - we want to return at least an error code.
Seems odd to me. Why not expose supported attributes via some other
call? fcntl?
> One could imagine sticking the return in the attribute itself, but I
> don't want to do this. For some things (checksums), the attribute will
> contain a pointer to a buffer - that's fine. But I don't want the
> attributes themselves to be writeable.
One could imagine that attributes don't return anything, because, well,
they're properties of something else, and properties don't return
anything.
Cheers,
Jeff