Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent

From: Kirill Tkhai
Date: Wed Nov 07 2018 - 09:26:00 EST


On 07.11.2018 16:55, Miklos Szeredi wrote:
> On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> wrote:
>> When queue_interrupt() is called from fuse_dev_do_write(),
>> it came from userspace directly. Userspace may pass any
>> request id, even the request's we have not interrupted
>> (or even background's request). This patch adds sanity
>> check to make kernel safe against that.
>
> Okay, I understand this far.
>
>> The problem is real interrupt may be queued and requeued
>> by two tasks on two cpus. This case, the requeuer has not
>> guarantees it sees FR_INTERRUPTED bit on its cpu (since
>> we know nothing about the way userspace manages requests
>> between its threads and whether it uses smp barriers).
>
> This sounds like BS. There's an explicit smp_mb__after_atomic()
> after the set_bit(FR_INTERRUPTED,...). Which means FR_INTERRUPTED is
> going to be visible on all CPU's after this, no need to fool around
> with setting FR_INTERRUPTED again, etc...

Hm, but how does it make the bit visible on all CPUS?

The problem is that smp_mb_xxx() barrier on a cpu has a sense
only in pair with the appropriate barrier on the second cpu.
There is no guarantee for visibility, if second cpu does not
have a barrier:

CPU 1 CPU2 CPU3 CPU4 CPU5

set FR_INTERRUPTED set FR_SENT
<smp mb> <smp mb>
test FR_SENT (== 0) test FR_INTERRUPTED (==1)
list_add[&req->intr_entry] read[req by intr_entry]
<place to insert a barrier>
goto userspace
write in userspace buffer
read from userspace buffer
write to userspace buffer
read from userspace buffer
enter kernel
<place to insert a barrier>
test FR_INTERRUPTED <- Not visible

The sequence:

set_bit(FR_INTERRUPTED, ...)
smp_mb__after_atomic();
test_bit(FR_SENT, &req->flags)

just guarantees the expected order on CPU2, which uses <smp mb>,
but CPU5 does not have any guarantees.


This 5 CPUs picture is a scheme, which illustrates the possible way userspace
may manage interrupts. Tags <place to insert a barrier> show places, where
we not have barriers yet, but where we may introduce them. But even in case
of we introduce them, there is no a way, that such the barriers help against CPU4.
So, this is the reason we have to set FR_INTERRUPTED bit again to make it visible
under the spinlock on CPU5.

Thanks,
Kirill

>>
>> To eliminate this problem, queuer writes FR_INTERRUPTED
>> bit again under fiq->waitq.lock, and this guarantees
>> requeuer will see the bit, when checks it.
>>
>> I try to introduce solution, which does not affect on
>> performance, and which does not force to take more
>> locks. This is the reason, the below solution is worse:
>>
>> request_wait_answer()
>> {
>> ...
>> + spin_lock(&fiq->waitq.lock);
>> set_bit(FR_INTERRUPTED, &req->flags);
>> + spin_unlock(&fiq->waitq.lock);
>> ...
>> }
>>
>> Also, it does not look a better idea to extend fuse_dev_do_read()
>> with the fix, since it's already a big function:
>>
>> fuse_dev_do_read()
>> {
>> ...
>> if (test_bit(FR_INTERRUPTED, &req->flags)) {
>> + /* Comment */
>> + barrier();
>> + set_bit(FR_INTERRUPTED, &req->flags);
>> queue_interrupt(fiq, req);
>> }
>> ...
>> }
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>
>> ---
>> fs/fuse/dev.c | 26 +++++++++++++++++++++-----
>> 1 file changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>> index 315d395d5c02..3bfc5ed61c9a 100644
>> --- a/fs/fuse/dev.c
>> +++ b/fs/fuse/dev.c
>> @@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
>> fuse_put_request(fc, req);
>> }
>>
>> -static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>> +static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>> {
>> bool kill = false;
>>
>> if (test_bit(FR_FINISHED, &req->flags))
>> - return;
>> + return 0;
>> spin_lock(&fiq->waitq.lock);
>> + /* Check for we've sent request to interrupt this req */
>> + if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
>> + spin_unlock(&fiq->waitq.lock);
>> + return -EINVAL;
>> + }
>> + /*
>> + * Interrupt may be queued from fuse_dev_do_read(), and
>> + * later requeued on other cpu by fuse_dev_do_write().
>> + * To make FR_INTERRUPTED bit visible for the requeuer
>> + * (under fiq->waitq.lock) we write it once again.
>> + */
>> + barrier();
>> + __set_bit(FR_INTERRUPTED, &req->flags);
>> +
>> if (list_empty(&req->intr_entry)) {
>> list_add_tail(&req->intr_entry, &fiq->interrupts);
>> /*
>> @@ -492,7 +506,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>> if (test_bit(FR_FINISHED, &req->flags)) {
>> list_del_init(&req->intr_entry);
>> spin_unlock(&fiq->waitq.lock);
>> - return;
>> + return 0;
>> }
>> wake_up_locked(&fiq->waitq);
>> kill = true;
>> @@ -500,6 +514,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>> spin_unlock(&fiq->waitq.lock);
>> if (kill)
>> kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
>> + return (int)kill;
>> }
>>
>> static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
>> @@ -1959,8 +1974,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
>> nbytes = -EINVAL;
>> else if (oh.error == -ENOSYS)
>> fc->no_interrupt = 1;
>> - else if (oh.error == -EAGAIN)
>> - queue_interrupt(&fc->iq, req);
>> + else if (oh.error == -EAGAIN &&
>> + queue_interrupt(&fc->iq, req) < 0)
>> + nbytes = -EINVAL;
>>
>> fuse_put_request(fc, req);
>> fuse_copy_finish(cs);
>>