Re: [PATCH v1 14/15] scsi: ufs: commit descriptors before setting the doorbell

From: ygardi
Date: Sun Aug 30 2015 - 05:51:22 EST


> On Thu, Aug 27, 2015 at 7:11 AM, <ygardi@xxxxxxxxxxxxxx> wrote:
>>> On Tue, Aug 25, 2015 at 7:36 AM, <ygardi@xxxxxxxxxxxxxx> wrote:
>>>>> On Aug 21, 2015 3:10 PM, "Yaniv Gardi" <ygardi@xxxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Add a write memory barrier to make sure descriptors prepared are actually
>>>>>> written to memory before ringing the doorbell. We have also added the
>>>>>> write memory barrier after ringing the doorbell register so that the
>>>>>> controller sees the new request immediately.
>>>>>>
>>>>>> Signed-off-by: Yaniv Gardi <ygardi@xxxxxxxxxxxxxx>
>>>>>>
>>>>>> ---
>>>>>> drivers/scsi/ufs/ufshcd.c | 6 ++++++
>>>>>> 1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>>>>> index fef0660..876148b 100644
>>>>>> --- a/drivers/scsi/ufs/ufshcd.c
>>>>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>>>>> @@ -833,6 +833,8 @@ void ufshcd_send_command(struct ufs_hba *hba, unsigned int task_tag)
>>>>>> ufshcd_clk_scaling_start_busy(hba);
>>>>>> __set_bit(task_tag, &hba->outstanding_reqs);
>>>>>> ufshcd_writel(hba, 1 << task_tag, REG_UTP_TRANSFER_REQ_DOOR_BELL);
>>>>>> + /* Make sure that doorbell is committed immediately */
>>>>>> + wmb();
>>>>>
>>>>> Is this really necessary? Is there a measurable difference?
>>>>
>>>> I'm not sure if there is a measurable difference, but since the doorbell
>>>> register is the one that actually triggers the HW execution of the
>>>> requests, it's recommended that its value be written out immediately.
>>>
>>> A barrier doesn't guarantee speed, only ordering. Unless you can
>>> measure the difference, you should not have it.
>>
>> Rob,
>> let me give an example:
>> context#1 updates the outstanding_reqs variable and does write(DOOR_BELL).
>> context#2, upon the interrupt for a request completion, reports
>> completion on each one of the bits set in:
>> outstanding_reqs ^ read(DOOR_BELL);
>>
>> 0. Let's assume DOOR_BELL = 0x1 (1 active request, in slot 0).
>> 1. context#1 updates DOOR_BELL to 0x3 (2 active requests, in slots 0
>> and 1).
>> 2. The new value 0x3 has not yet reached the doorbell register, so
>> DOOR_BELL is still 0x1, but outstanding_reqs is already updated to 0x3.
>> 3. The request in slot 0 completes and the interrupt fires, so
>> DOOR_BELL is now 0x0 (request in slot 0 completed).
>> 4. context#2 computes outstanding_reqs ^ read(DOOR_BELL) = 0x3 ^ 0x0 =
>> 0x3, which is the wrong conclusion, since the request in slot 1 never
>> completed, and actually never started.
>
> Barriers alone will never solve this problem. They may narrow the
> window possibly, but the problem is still there. What you have to have
> is a spinlock around all accesses to both outstanding_reqs and
> doorbell register. And guess what, spinlocks have appropriate barriers
> to ensure visibility of what they protect. Or perhaps the h/w provides
> another way to signal what slots have completed. Using the same
> register for doorbell and completion status is not ideal.
>
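
To make the interleaving concrete, here is a minimal userspace sketch of the
sequence from my example, next to the serialized variant you describe. It is
only a model, not driver code: struct toy_hba, completed_slots() and the two
scenario functions are made-up names, and the "lock" is represented simply by
statement order, assuming the completion handler cannot run inside the
critical section.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the two values involved; not the real struct ufs_hba. */
struct toy_hba {
	uint32_t outstanding_reqs;	/* software bookkeeping */
	uint32_t doorbell;		/* stands in for the doorbell register */
};

/* Completion logic from the example: XOR of bookkeeping and doorbell. */
uint32_t completed_slots(const struct toy_hba *hba)
{
	return hba->outstanding_reqs ^ hba->doorbell;
}

/* Handler runs between the bookkeeping update and the doorbell write. */
uint32_t racy_sequence(void)
{
	/* step 0: one active request in slot 0 */
	struct toy_hba hba = { .outstanding_reqs = 0x1, .doorbell = 0x1 };

	hba.outstanding_reqs |= 1u << 1;	/* steps 1-2: bookkeeping updated, */
						/* doorbell write not yet visible  */
	hba.doorbell &= ~(1u << 0);		/* step 3: slot 0 completes */
	return completed_slots(&hba);		/* step 4: 0x3 ^ 0x0 */
}

/* Both updates happen as one unit before the handler can look. */
uint32_t serialized_sequence(void)
{
	struct toy_hba hba = { .outstanding_reqs = 0x1, .doorbell = 0x1 };

	/* hypothetical critical section: handler excluded from here ... */
	hba.outstanding_reqs |= 1u << 1;
	hba.doorbell |= 1u << 1;
	/* ... to here */

	hba.doorbell &= ~(1u << 0);		/* slot 0 completes */
	return completed_slots(&hba);		/* 0x3 ^ 0x2 */
}

/* Small self-check of both scenarios. */
void check_sequences(void)
{
	assert(racy_sequence() == 0x3);		/* slot 1 wrongly reported */
	assert(serialized_sequence() == 0x1);	/* only slot 0 reported */
}
```

In this model the wrong result comes from the handler observing the two
values in an inconsistent state, which matches your point that ordering alone
does not close the window.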

Can I assume that spin_lock_irqsave() and spin_unlock_irqrestore()
both provide barriers? I couldn't find a barrier instruction
when following the call chain...
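
For reference, my reading of Documentation/memory-barriers.txt is that taking
a lock is an ACQUIRE operation and releasing it is a RELEASE operation, so the
ordering would come from the lock primitives themselves rather than from an
explicit wmb()/mb() call. Below is a rough userspace analogue using C11
atomics. It is a sketch under that assumption only: toy_lock and its helpers
are made-up names, and this is not how the kernel implements spinlocks.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock; initialize 'held' with ATOMIC_FLAG_INIT. */
struct toy_lock {
	atomic_flag held;
};

/*
 * Try to take the lock once. Returns true on success.
 * Acquire ordering: accesses after a successful trylock cannot be
 * reordered before it.
 */
bool toy_trylock(struct toy_lock *lock)
{
	return !atomic_flag_test_and_set_explicit(&lock->held,
						  memory_order_acquire);
}

/* Spin until the lock is taken. */
void toy_lock_acquire(struct toy_lock *lock)
{
	while (!toy_trylock(lock))
		;	/* busy-wait */
}

/*
 * Drop the lock. Release ordering: accesses made while holding the
 * lock cannot be reordered after the clear.
 */
void toy_lock_release(struct toy_lock *lock)
{
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

Here the ordering is attached to the test-and-set and clear operations, which
would explain why no separate barrier call shows up in the call chain. Whether
that also orders the MMIO doorbell write on every architecture is a separate
question that this sketch does not answer.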


> Rob
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

