Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent

From: Miklos Szeredi
Date: Wed Nov 07 2018 - 08:55:38 EST


On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> wrote:
> When queue_interrupt() is called from fuse_dev_do_write(),
> it came from userspace directly. Userspace may pass any
> request id, even the request's we have not interrupted
> (or even background's request). This patch adds sanity
> check to make kernel safe against that.

Okay, I understand this far.

> The problem is real interrupt may be queued and requeued
> by two tasks on two cpus. This case, the requeuer has not
> guarantees it sees FR_INTERRUPTED bit on its cpu (since
> we know nothing about the way userspace manages requests
> between its threads and whether it uses smp barriers).

This sounds like BS. There's an explicit smp_mb__after_atomic()
after the set_bit(FR_INTERRUPTED,...). Which means FR_INTERRUPTED is
going to be visible on all CPU's after this, no need to fool around
with setting FR_INTERRUPTED again, etc...



>
> To eliminate this problem, queuer writes FR_INTERRUPTED
> bit again under fiq->waitq.lock, and this guarantees
> requeuer will see the bit, when checks it.
>
> I try to introduce solution, which does not affect on
> performance, and which does not force to take more
> locks. This is the reason, the below solution is worse:
>
> request_wait_answer()
> {
> ...
> + spin_lock(&fiq->waitq.lock);
> set_bit(FR_INTERRUPTED, &req->flags);
> + spin_unlock(&fiq->waitq.lock);
> ...
> }
>
> Also, it does not look a better idea to extend fuse_dev_do_read()
> with the fix, since it's already a big function:
>
> fuse_dev_do_read()
> {
> ...
> if (test_bit(FR_INTERRUPTED, &req->flags)) {
> + /* Comment */
> + barrier();
> + set_bit(FR_INTERRUPTED, &req->flags);
> queue_interrupt(fiq, req);
> }
> ...
> }
>
> Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>
> ---
> fs/fuse/dev.c | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 315d395d5c02..3bfc5ed61c9a 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
> fuse_put_request(fc, req);
> }
>
> -static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
> +static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
> {
> bool kill = false;
>
> if (test_bit(FR_FINISHED, &req->flags))
> - return;
> + return 0;
> spin_lock(&fiq->waitq.lock);
> + /* Check for we've sent request to interrupt this req */
> + if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
> + spin_unlock(&fiq->waitq.lock);
> + return -EINVAL;
> + }
> + /*
> + * Interrupt may be queued from fuse_dev_do_read(), and
> + * later requeued on other cpu by fuse_dev_do_write().
> + * To make FR_INTERRUPTED bit visible for the requeuer
> + * (under fiq->waitq.lock) we write it once again.
> + */
> + barrier();
> + __set_bit(FR_INTERRUPTED, &req->flags);
> +
> if (list_empty(&req->intr_entry)) {
> list_add_tail(&req->intr_entry, &fiq->interrupts);
> /*
> @@ -492,7 +506,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
> if (test_bit(FR_FINISHED, &req->flags)) {
> list_del_init(&req->intr_entry);
> spin_unlock(&fiq->waitq.lock);
> - return;
> + return 0;
> }
> wake_up_locked(&fiq->waitq);
> kill = true;
> @@ -500,6 +514,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
> spin_unlock(&fiq->waitq.lock);
> if (kill)
> kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
> + return (int)kill;
> }
>
> static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
> @@ -1959,8 +1974,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
> nbytes = -EINVAL;
> else if (oh.error == -ENOSYS)
> fc->no_interrupt = 1;
> - else if (oh.error == -EAGAIN)
> - queue_interrupt(&fc->iq, req);
> + else if (oh.error == -EAGAIN &&
> + queue_interrupt(&fc->iq, req) < 0)
> + nbytes = -EINVAL;
>
> fuse_put_request(fc, req);
> fuse_copy_finish(cs);
>