Re: [PATCH v3 0/2] block,scsi: fixup blk_get_request dead queue scenarios

From: Jens Axboe
Date: Tue Aug 26 2014 - 17:37:14 EST


On 08/26/2014 03:33 PM, Jens Axboe wrote:
> On 08/26/2014 03:27 PM, Jeff Moyer wrote:
>> Jens Axboe <axboe@xxxxxxxxx> writes:
>>
>>> On 08/26/2014 11:24 AM, Jeff Moyer wrote:
>>>> Joe Lawrence <joe.lawrence@xxxxxxxxxxx> writes:
>>>>
>>>>> v2->v3: rebase to 3.16-rc2, consider return values from the
>>>>> blk_mq_alloc_request leg of the blk_get_request callchain
>>>>> (noted by Jeff), noted in the second patch changelog.
>>>>>
>>>>> blk_mq_queue_enter may return 0 or errno, which
>>>>> blk_mq_alloc_request can propagate out via ERR_PTR.
>>>>> __blk_mq_alloc_request doesn't include any blk_queue_dying
>>>>> checks, so I'm assuming that its failures can be attributed
>>>>> to -EWOULDBLOCK under !GFP_WAIT conditions.
>>>>>
>>>>> v1->v2: incorporate Jeff's feedback in bsg_map_hdr() and Reviewed-by
>>>>> tags.
>>>>>
>>>>> Joe Lawrence (2):
>>>>>   block,scsi: verify return pointer from blk_get_request
>>>>>   block,scsi: fixup blk_get_request dead queue scenarios
>>>>
>>>> Jens,
>>>>
>>>> Did this patch set fall through the cracks again?
>>>
>>> Falling through the cracks implies that I meant to apply it and did not,
>>> which was not the case.
>>
>> Sorry, I was misled by our earlier conversation on this (mail inline
>> below).
>
> I changed my mind, it didn't feel fully baked to me.
>
>>> But I think we're at the point now where I'm finally comfortable with
>>> applying it. So, Joe, could you ensure that it applies to 3.17-rc2,
>>> then I will roll it in to the updates for 3.18.
>>
>> Joe, you will have one hunk to modify for sure, in scsi_ioctl.c. A
>> previous patch added a check for null, but ended up returning the wrong
>> value (ENOMEM instead of ENODEV).
>
> I have applied the first one, will look over the second one and hand
> apply it. Seems the NULL return was completely removed, so we _should_
> be ok on the IS_ERR() conversion, though that sort of thing always
> worries me a little bit. A NULL return can quickly show up again, and
> then they would all fail.

Additionally, there are still quite a few places that call
blk_get_request() and don't check the error return when __GFP_WAIT is set.
Since most of the point of this is to fix segfaulting in dead queue
scenarios, why aren't they all converted?
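
To make the concern concrete, here is a rough userspace sketch of the
caller pattern in question. It is not kernel code: ERR_PTR(), PTR_ERR()
and IS_ERR() are simplified stand-ins for the <linux/err.h> helpers, and
mock_blk_get_request(), submit(), struct request and struct request_queue
are hypothetical mocks invented for illustration. It assumes a dying
queue yields ERR_PTR(-ENODEV), as the ENODEV remark quoted above suggests.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mocks, not the real block layer structures. */
struct request { int tag; };
struct request_queue { int dying; };

static struct request mock_rq = { .tag = 1 };

/*
 * Mock of the post-conversion contract: returns ERR_PTR() on failure
 * and never NULL. Assumption: a dying queue maps to -ENODEV.
 */
struct request *mock_blk_get_request(struct request_queue *q)
{
	if (q->dying)
		return ERR_PTR(-ENODEV);
	return &mock_rq;
}

/*
 * Caller pattern: check IS_ERR() even when the allocation is allowed
 * to block, because a dying queue can still fail the call.
 */
int submit(struct request_queue *q)
{
	struct request *rq = mock_blk_get_request(q);

	if (IS_ERR(rq))
		return (int)PTR_ERR(rq);
	return rq->tag;		/* safe: rq is a valid pointer here */
}

/* Small self-check of the behaviour described above. */
void self_check(void)
{
	struct request_queue live = { .dying = 0 };
	struct request_queue dead = { .dying = 1 };

	assert(submit(&live) == 1);
	assert(submit(&dead) == -ENODEV);
	/* A stray NULL return would slip past an IS_ERR()-only check. */
	assert(!IS_ERR(NULL));
	/* An old-style NULL check would not catch an error pointer. */
	assert(mock_blk_get_request(&dead) != NULL);
}
```

The last two assertions show both halves of the worry: IS_ERR(NULL) is
false, so a reintroduced NULL return would be dereferenced by converted
callers, and an error pointer is non-NULL, so unconverted callers that
only test for NULL (or test nothing) would dereference it.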

--
Jens Axboe
