Re: [PATCH] UML: UBD: Fix for processes stuck in D state forever in UserModeLinux

From: Richard Weinberger
Date: Tue Sep 16 2014 - 12:50:26 EST


On Mon, Aug 25, 2014 at 4:04 PM, Richard Weinberger <richard@xxxxxx> wrote:
> On 25.08.2014 16:00, Thorsten Knabe wrote:
>> On 08/25/2014 03:25 PM, Richard Weinberger wrote:
>>> On 25.08.2014 01:02, Thorsten Knabe wrote:
>>>> On 08/24/2014 02:11 PM, Richard Weinberger wrote:
>>>>> On 23.08.2014 19:43, Thorsten Knabe wrote:
>>>>>> Hi Richard.
>>>>>>
>>>>>> On 08/23/2014 05:34 PM, Richard Weinberger wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> On 23.08.2014 15:47, Thorsten Knabe wrote:
>>>>>>>> From: Thorsten Knabe <linux@xxxxxxxxxxxxxxxxx>
>>>>>>>>
>>>>>>>> UML: UBD: Fix for processes stuck in D state forever in UserModeLinux.
>>>>>>>>
>>>>>>>> Starting with Linux 3.12 processes get stuck in D state forever in
>>>>>>>> UserModeLinux under sync heavy workloads. This bug was introduced by
>>>>>>>> commit 805f11a0d5 (um: ubd: Add REQ_FLUSH suppport).
>>>>>>>> Fix bug by adding a check if FLUSH request was successfully submitted to
>>>>>>>> the I/O thread and keeping the FLUSH request on the request queue on
>>>>>>>> submission failures.
>>>>>>>>
>>>>>>>> Fixes: 805f11a0d5 (um: ubd: Add REQ_FLUSH suppport)
>>>>>>>> Signed-off-by: Thorsten Knabe <linux@xxxxxxxxxxxxxxxxx>
>>>>>>>
>>>>>>> Thanks a lot for hunting this issue down.
>>>>>>>
>>>>>>>> ---
>>>>>>>> Patch applies to 3.16.1.
>>>>>>>>
>>>>>>>> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
>>>>>>>> index 3716e69..b7d2840 100644
>>>>>>>> --- a/arch/um/drivers/ubd_kern.c
>>>>>>>> +++ b/arch/um/drivers/ubd_kern.c
>>>>>>>> @@ -1277,7 +1277,7 @@ static void do_ubd_request(struct request_queue *q)
>>>>>>>>
>>>>>>>> while(1){
>>>>>>>> struct ubd *dev = q->queuedata;
>>>>>>>> - if(dev->end_sg == 0){
>>>>>>>> + if(dev->request == NULL){
>>>>>>>
>>>>>>> Why do we need this specific change?
>>>>>>
>>>>>> This change is required, because for FLUSH requests dev->end_sg is
>>>>>> initialized to 0 by blk_rq_map_sg() a few lines above, as FLUSH requests
>>>>>> have no data blocks attached to them.
>>>>>
>>>>> You meant "below"? Looks like I really miss something here.
>>>>> At the bottom of the while(1) loop we have
>>>>> dev->end_sg = 0;
>>>>> dev->request = NULL;
>>>>
>>>> No. The problematic line is:
>>>> dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
>>>> blk_rq_map_sg() returns 0 for REQ_FLUSH requests, because they
>>>> have no associated data blocks.
>>>>
>>>> Hence on the next iteration of the while(1) loop:
>>>> if(dev->end_sg == 0){
>>>
>>> At the bottom of the while loop dev->end_sg will be set to 0 anyway; this is what puzzles
>>> me so much.
>>> (And dev->request too)
>>
>> Yes, but the statements:
>> dev->end_sg = 0;
>> dev->request = NULL;
>> at the end of the while loop will only be reached (for FLUSH requests
>> only with my patch applied) when submit_request() succeeds. Otherwise
>> the function do_ubd_request() returns early leaving dev->end_sg and
>> dev->request untouched.
>
> Can you please add this info to the commit message? I'm sure others will also be
> confused by that.^^
> The logic in do_ubd_request() is a royal mess which needs fixing too. :-\

Hmm, you didn't resend the patch.
I'll amend the commit message myself and push it to Linus this week.

--
Thanks,
//richard