Re: [PATCH 1/8] aio: make sure file is pinned
From: Al Viro
Date: Thu Mar 07 2019 - 22:37:09 EST
On Wed, Mar 06, 2019 at 05:30:21PM -0800, Linus Torvalds wrote:
> On Wed, Mar 6, 2019 at 5:20 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > I'll try to massage that series on top of your patch; I still hate the
> > post-vfs_poll() logics in aio_poll() ;-/ Give me about half an hour
> > and I'll have something to post.
>
> No inherent hurry, I sent the ping just to make sure it hadn't gotten lost.
>
> And yeah, I think the post-vfs_poll() logic cannot possibly be
> necessary. My gut feel is that *if* we have the refcounting right,
> then we should be able to just let the wakeup come in at any later
> point, and ordering shouldn't matter all that much, and we shouldn't
> even need any locking.
>
> I'd like to think that it can be done with something like "just 'or'
> in the mask atomically" (so that we don't care about ordering between
> the synchronous vfs_poll() and the async poll wakeup), together with
> "when refcount goes to zero, finish the thing off and complete it" (so
> that we don't care who finishes first).
>
> No "woken" logic, no "who fired first" logic, no BS. Just make the
> operations work regardless of ordering.
>
> And maybe it can't be done. But the current model seems just so hacky
> that it can't be the right model.
Umm... It is kinda-sorta doable; we do need something vaguely similar
to ->woken ("should we add it to the list of cancellables, or is the
async reference already gone?"), but other than that it seems to be
feasible.
See vfs.git#work.aio; the crucial bits are in these commits:
	keep io_event in aio_kiocb
	get rid of aio_complete() res/res2 arguments
	move aio_complete() to final iocb_put(), try to fix aio_poll() logics
The first two are preparations, the last is where the fixes (hopefully)
happen.
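
For illustration only, here is a userspace sketch of the "or in the mask atomically, complete on the final put" idea from the quoted text. It is not the code in the branch: struct req, req_init(), req_add_events() and req_put() are hypothetical names, and C11 atomics stand in for the kernel's refcounting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the request; not the kernel's struct aio_kiocb. */
struct req {
	atomic_int refs;	/* one for the submitter, one for the async wakeup */
	atomic_uint mask;	/* events accumulated from either side */
	unsigned int result;	/* filled in by whoever drops the last reference */
	bool completed;
};

static void req_init(struct req *r)
{
	atomic_init(&r->refs, 2);
	atomic_init(&r->mask, 0);
	r->result = 0;
	r->completed = false;
}

/* Record events; the order of calls from the two sides does not matter. */
static void req_add_events(struct req *r, unsigned int events)
{
	atomic_fetch_or(&r->mask, events);
}

/*
 * Drop one reference.  Only the final put completes the request, so
 * neither side has to know which of them finished first.
 * Returns true if this call did the completion.
 */
static bool req_put(struct req *r)
{
	int old = atomic_fetch_sub(&r->refs, 1);

	assert(old > 0);
	if (old != 1)
		return false;
	r->result = atomic_load(&r->mask);
	r->completed = true;
	return true;
}
```

Whichever side calls req_put() last sees the union of the events recorded by both, which is the ordering independence being asked for. The sketch deliberately leaves out the cancellation list, which is where something like ->woken is still needed.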
The logics in aio_poll() after vfs_poll():
* we might want to steal the async reference (e.g. due to event
returned from the very beginning, or due to attempt to put on more than
one waitqueue, which makes results unreliable). That's _NOT_ possible
if the thing had been put on a waitqueue, but currently isn't there.
It might be either due to early wakeup having done everything or the
same having scheduled aio_poll_complete_work(). In either case, the
best we can do is to ignore the return value of vfs_poll() and, in
case of error, mark the sucker cancelled. We *can't* return an error
in that case.
* if we want and can steal the async reference, rip it from
waitqueue; otherwise, put it on the "cancellable" list, unless it's
already gone or unless we are simulating the cancel ourselves.
* if vfs_poll() has reported something we want and we have
successfully stolen the iocb, put it there, have the reference
we'd taken over dropped and return 0.
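
One possible reading of the three points above, written as a pure function over a snapshot of the request state. This is a sketch under assumptions, not the code in the branch: struct poll_state, struct poll_outcome and after_poll() are hypothetical, locking is omitted entirely, and returning the error when the request was never queued is an assumption rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the request right after the synchronous poll. */
struct poll_state {
	bool queued_once;	/* it was put on a waitqueue at some point */
	bool still_queued;	/* ... and is still there now */
	bool done;		/* async side has already finished with it */
	unsigned int mask;	/* events reported by the synchronous poll */
	int error;		/* e.g. attempt to use more than one waitqueue */
};

struct poll_outcome {
	bool dequeued;		/* ripped from the waitqueue (async ref stolen) */
	bool cancelled;		/* marked cancelled by us */
	bool cancellable;	/* put on the "cancellable" list */
	bool drop_ref;		/* caller drops the reference it took over */
	unsigned int res;	/* event mask to report */
	int ret;		/* value returned to the submitter */
};

static struct poll_outcome after_poll(struct poll_state s)
{
	struct poll_outcome o = {0};
	bool cancel = false;

	if (s.queued_once) {
		if (!s.still_queued) {
			/* Early wakeup owns it: ignore what poll returned. */
			if (s.error)
				cancel = true;
			s.error = 0;
			s.mask = 0;
		}
		if (s.mask || s.error)
			o.dequeued = true;	/* we want and can steal it */
		else if (cancel)
			o.cancelled = true;	/* simulate the cancel ourselves */
		else if (!s.done)
			o.cancellable = true;	/* still waiting for an event */
	}
	if (s.mask) {
		/* No async completion will come; report the event ourselves. */
		o.res = s.mask;
		o.drop_ref = true;
		s.error = 0;
	}
	o.ret = s.error;
	return o;
}
```

For example, a request that was queued, got an early wakeup, and then saw an error from the synchronous poll ends up cancelled with a return value of 0, matching the "can't return an error in that case" rule.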
Comments?