Re: usercopy whitelist woe in scsi_sense_cache

From: Paolo Valente
Date: Thu Apr 19 2018 - 05:33:13 EST

Next message: Chen-Yu Tsai: "[PATCH 0/6] drm/sun4i: Support color dithering for LCD panels"
Previous message: Chen-Yu Tsai: "[PATCH 6/6] [DO NOT MERGE] ARM: dts: sun7i: bananapi-m1-plus: Enable Bananapi 7" 800x480 RGB LCD panel"
In reply to: Jens Axboe: "Re: usercopy whitelist woe in scsi_sense_cache"
Next in thread: Kees Cook: "Re: usercopy whitelist woe in scsi_sense_cache"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> Il giorno 18 apr 2018, alle ore 16:30, Jens Axboe <axboe@xxxxxxxxx> ha scritto:
>
> On 4/18/18 3:08 AM, Paolo Valente wrote:
>>
>>
>>> Il giorno 18 apr 2018, alle ore 00:57, Jens Axboe <axboe@xxxxxxxxx> ha scritto:
>>>
>>> On 4/17/18 3:48 PM, Jens Axboe wrote:
>>>> On 4/17/18 3:47 PM, Kees Cook wrote:
>>>>> On Tue, Apr 17, 2018 at 2:39 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>>>> On 4/17/18 3:25 PM, Kees Cook wrote:
>>>>>>> On Tue, Apr 17, 2018 at 1:46 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>>>>>>> I see elv.priv[1] assignments made in a few places -- is it possible
>>>>>>>> there is some kind of uninitialized-but-not-NULL state that can leak
>>>>>>>> in there?
>>>>>>>
>>>>>>> Got it. This fixes it for me:
>>>>>>>
>>>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>>>> index 0dc9e341c2a7..859df3160303 100644
>>>>>>> --- a/block/blk-mq.c
>>>>>>> +++ b/block/blk-mq.c
>>>>>>> @@ -363,7 +363,7 @@ static struct request *blk_mq_get_request(struct
>>>>>>> request_queue *q,
>>>>>>>
>>>>>>> rq = blk_mq_rq_ctx_init(data, tag, op);
>>>>>>> if (!op_is_flush(op)) {
>>>>>>> - rq->elv.icq = NULL;
>>>>>>> + memset(&rq->elv, 0, sizeof(rq->elv));
>>>>>>> if (e && e->type->ops.mq.prepare_request) {
>>>>>>> if (e->type->icq_cache && rq_ioc(bio))
>>>>>>> blk_mq_sched_assign_ioc(rq, bio);
>>>>>>> @@ -461,7 +461,7 @@ void blk_mq_free_request(struct request *rq)
>>>>>>> e->type->ops.mq.finish_request(rq);
>>>>>>> if (rq->elv.icq) {
>>>>>>> put_io_context(rq->elv.icq->ioc);
>>>>>>> - rq->elv.icq = NULL;
>>>>>>> + memset(&rq->elv, 0, sizeof(rq->elv));
>>>>>>> }
>>>>>>> }
>>>>>>
>>>>>> This looks like a BFQ problem, this should not be necessary. Paolo,
>>>>>> you're calling your own prepare request handler from the insert
>>>>>> as well, and your prepare request does nothing if rq->elv.icq == NULL.
>>>>>
>>>>> I sent the patch anyway, since it's kind of a robustness improvement,
>>>>> I'd hope. If you fix BFQ also, please add:
>>>>
>>>> It's also a memset() in the hot path, would prefer to avoid that...
>>>> The issue here is really the convoluted bfq usage of insert/prepare,
>>>> I'm sure Paolo can take it from here.
>>>
>>
>> Hi,
>> I'm very sorry for tuning in very late, but, at the same time, very
>> glad to find the problem probably already solved ;) (in this respect, I swear,
>> my delay was not intentional)
>>
>>> Does this fix it?
>>>
>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>> index f0ecd98509d8..d883469a1582 100644
>>> --- a/block/bfq-iosched.c
>>> +++ b/block/bfq-iosched.c
>>> @@ -4934,8 +4934,11 @@ static void bfq_prepare_request(struct request *rq, struct bio *bio)
>>> bool new_queue = false;
>>> bool bfqq_already_existing = false, split = false;
>>>
>>> - if (!rq->elv.icq)
>>> + if (!rq->elv.icq) {
>>> + rq->elv.priv[0] = rq->elv.priv[1] = NULL;
>>> return;
>>> + }
>>> +
>>
>> This does solve the problem at hand. But it also arouses a question,
>> related to a possible subtle bug.
>>
>> For BFQ, !rq->elv.icq basically means "this request is not for me, as
>> I am an icq-based scheduler". But, IIUC the main points in this
>> thread, then this assumption is false. If it is actually false, then
>> I hope that all requests with !rq->elv.icq that are sent to BFQ do
>> verify the condition (at_head || blk_rq_is_passthrough(rq)). In fact,
>> requests that do not verify that condition are those that BFQ must put
>> in a bfq_queue. So, even if this patch makes the crash disappear, we
>> drive BFQ completely crazy (and we may expect other strange failures)
>> if we send BFQ a request with !((at_head || blk_rq_is_passthrough(rq))
>> and !rq->elv.icq. BFQ has to put that rq into a bfq_queue, but simply
>> cannot.
>>
>> Jens, or any other, could you please shed a light on this, and explain
>> how things are exactly?
>

First, thanks for summing up the problem.

> Your assumption is correct, however you set ->priv[0] and ->priv[1] for
> requests, but only for ->elv.icq != NULL. So let's assume you get a
> request and assign those two, request completes. Later on, you get
> the same request, bypass insert it. BFQ doesn't clear the bic/bfqq
> pointers in the request, since ->elv.icq == NULL.

I'm missing something here. When the request gets completed in the
first place, the hook bfq_finish_requeue_request gets called, and that
hook clears both ->elv.priv elements (as the request has a non-null
elv.icq). So, when bfq gets the same request again, those elements
must be NULL. What am I getting wrong?

I have some more concern on this point, but I'll stick to this for the
moment, to not create more confusion.

Thanks,
Paolo

> It gets inserted
> into the dispatch list.
>
> Then when __bfq_dispatch_request() is called, you do:
>
> bfqq = RQ_BFQQ(rq);
> if (bfqq)
> bfqq->dispatched++;
> [...]
>
> which is wrong, since you don't know if you assigned a bfqq for this
> request. The memory that bfqq points to could be long gone, if that
> queue is freed. So you could either guard any bfqq/bic retrieval
> with ->elv.icq != NULL, or you could just clear the pointers for
> the case where the values aren't valid.
>
> --
> Jens Axboe

Next message: Chen-Yu Tsai: "[PATCH 0/6] drm/sun4i: Support color dithering for LCD panels"
Previous message: Chen-Yu Tsai: "[PATCH 6/6] [DO NOT MERGE] ARM: dts: sun7i: bananapi-m1-plus: Enable Bananapi 7" 800x480 RGB LCD panel"
In reply to: Jens Axboe: "Re: usercopy whitelist woe in scsi_sense_cache"
Next in thread: Kees Cook: "Re: usercopy whitelist woe in scsi_sense_cache"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]