RE: [PATCH 02/10] Drivers: hv: utils: run polling callback always in interrupt context

From: KY Srinivasan
Date: Mon Oct 12 2015 - 02:05:39 EST




> -----Original Message-----
> From: Olaf Hering [mailto:olaf@xxxxxxxxx]
> Sent: Friday, October 9, 2015 4:29 AM
> To: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; gregkh@xxxxxxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; apw@xxxxxxxxxxxxx;
> jasowang@xxxxxxxxxx
> Subject: Re: [PATCH 02/10] Drivers: hv: utils: run polling callback always in
> interrupt context
>
> On Fri, Oct 09, Vitaly Kuznetsov wrote:
>
> > Olaf Hering <olaf@xxxxxxxxx> writes:
> >
> > > On Thu, Oct 08, KY Srinivasan wrote:
> > >
> > >> > yes, but after doing fcopy_respond_to_host(). I'd suggest we leave
> the
> > >> > check in place, better safe than sorry.
> > >>
> > >> Agreed; Olaf, if it is ok with you, I can fix it up and send.
> > >
> > > I will retest with this part reverted. I think without two code paths
> > > entering hv_fcopy_callback it should be ok to leave this check in.
> >
> > I think hv_fcopy_callback() is not involved here: we call fcopy_on_msg()
> > every time userspace daemon writes to the device and it is not anyhow
> > synchronized with host-guest communication.
>
> An earlier variant of this patch used locks around the vmbus_recvpacket
> and the result was used to decide which thread of execution notifies the
> daemon. I think if the interrupt ran earlier than the daemon did the
> write then the state expected in fcopy_on_msg would obviously be wrong.
> As a result the daemon will just terminate with EFAULT. With the check
> removed it would proceed, and either not chancel the timeout or
> vmbus_recvpacket reads nothing.
>
> But now that it is single threaded the state in fcopy_on_msg should be
> as expected. As said, will retest. Either later today or on Monday.

The only case I can see is if the daemon does not respond in a timely fashion.
In this case, we would timeout and terminate the transaction prematurely.
In this case when the daemon ultimately responds, we would view that response as
an error and return EINVAL and I think in this case it is fine to terminate
the daemon; especially now that we are going to increase the timeout value
to 30 seconds.

Let me know the results of your testing. I will repost the patches after I hear from
you.

Regards,

K. Y