Re: [PATCH 2 of 4] Introduce i386 fibril scheduling

From: Zach Brown
Date: Tue Feb 06 2007 - 17:29:42 EST


That's not how the patches work right now, but yes, I at least personally
think that it's something we should aim for (ie the interface shouldn't
_require_ us to always wait for things even if perhaps an early
implementation might make everything be delayed at first)

I agree that we shouldn't require a seperate syscall just to get the return code from ops that didn't block.

It doesn't seem like much of a stretch to imagine a setup where we can specify completion context as part of the submission itself.

declare_empty_ring(ring);
struct submission sub;

sub.ring = ˚
sub.nr = SYS_fstat64;
sub.args == ...

ret = submit(&sub, 1);
if (ret == 0) {
wait_for_elements(&ring, 1);
printf("stat gave %d\n", ring[ring->head].rc);
}

You get the idea, it's just an outline.

wait_for_elements() could obviously check the ring before falling back to kernel sync. I'm pretty keen on the notion of producer/ consumer rings where userspace writes the head as it plucks completions and the kernel writes the tail as it adds them.

We might want per-call ring pointers, instead of per submission, to help submitters wait for a group of ops to complete without having to do their own tracking on event completion. That only makes sense if we have the waiting mechanics let you only be woken as the number of events in the ring crosses some threshold. Which I think we want anyway.

We'd be trading building up a specific completion state with syscalls for some complexity during submission that pins (and kmaps on completion) the user pages. Submission could return failure if pinning these new pages would push us over some rlimit. We'd have to be *awfully* careful not to let userspace corrupt (munmap?) the ring and confuse the hell out of the kernel.

Maybe not worth it, but if we *really* cared about making the non- blocking case almost identical to the sync case and wanted to use the same interface for batch submission and async completion then this seems like a possibility.

- z
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/