RE: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure

From: Jason Wang
Date: Thu Nov 27 2014 - 04:01:30 EST




On Thu, Nov 27, 2014 at 4:50 PM, Dexuan Cui <decui@xxxxxxxxxxxxx> wrote:
-----Original Message-----
From: Jason Wang [mailto:jasowang@xxxxxxxxxx]
Sent: Thursday, November 27, 2014 15:15 PM
To: Dexuan Cui
Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; driverdev-
devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx; KY
Srinivasan; vkuznets@xxxxxxxxxx; Haiyang Zhang
Subject: Re: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer
failure
----- Original Message -----
> In the case the user-space daemon crashes, hangs or is killed, we
> need to down the semaphore, otherwise, after the daemon starts next
> time, the obsolete data in fcopy_transaction.message or
> fcopy_transaction.fcopy_msg will be used immediately.
>
> Reviewed-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Cc: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> ---
>
> v2: I removed the "FCP" prefix as Greg asked.
>
> I also updated the output message a little:
> "FCP: failed to acquire the semaphore" -->
> "can not acquire the semaphore: it is benign"
>
> drivers/hv/hv_fcopy.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
> index 23b2ce2..c518ad9 100644
> --- a/drivers/hv/hv_fcopy.c
> +++ b/drivers/hv/hv_fcopy.c
> @@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct
*dummy)
> * process the pending transaction.
> */
> fcopy_respond_to_host(HV_E_FAIL);
> +
> + /* In the case the user-space daemon crashes, hangs or is killed, we
> + * need to down the semaphore, otherwise, after the daemon starts
next
> + * time, the obsolete data in fcopy_transaction.message or
> + * fcopy_transaction.fcopy_msg will be used immediately.
> + */
Looks still racy, what happens if the daemon start before down_trylock()
but after fcopy_respont_to_host() here?
Jason,
Thanks for pointing this out!
IMO we can resolve this by adding down_trylock() in fcopy_release().
What's your opinion?


Looks better and need to cancel the timeout also here?


> + if (down_trylock(&fcopy_transaction.read_sema))
> + pr_debug("can not acquire the semaphore: it is benign\n");
typo
> +
> }
Sorry -- what typo do you mean?

s/benign/begin/ ?

Thanks,
-- Dexuan
ïNrybXÇv^)Þ{.n+{zXÜ}Æz&j:+vzZ++zfh~izw?&)ßf^jÇym@Aa 0hi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/