Re: [PATCH v3 06/18] xen/pvcalls: handle commands from the frontend

From: Boris Ostrovsky
Date: Thu Jun 15 2017 - 10:33:59 EST


On 06/14/2017 05:03 PM, Stefano Stabellini wrote:
> On Mon, 12 Jun 2017, Boris Ostrovsky wrote:
>>> +
>>> static void pvcalls_back_work(struct work_struct *work)
>>> {
>>> + struct pvcalls_fedata *priv = container_of(work,
>>> + struct pvcalls_fedata, register_work);
>>> + int notify, notify_all = 0, more = 1;
>>> + struct xen_pvcalls_request req;
>>> + struct xenbus_device *dev = priv->dev;
>>> +
>>> + while (more) {
>>> + while (RING_HAS_UNCONSUMED_REQUESTS(&priv->ring)) {
>>> + RING_COPY_REQUEST(&priv->ring,
>>> + priv->ring.req_cons++,
>>> + &req);
>>> +
>>> + if (!pvcalls_back_handle_cmd(dev, &req)) {
>>> + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(
>>> + &priv->ring, notify);
>>> + notify_all += notify;
>>> + }
>>> + }
>>> +
>>> + if (notify_all)
>>> + notify_remote_via_irq(priv->irq);
>>> +
>>> + RING_FINAL_CHECK_FOR_REQUESTS(&priv->ring, more);
>>> + }
>>> }
>>>
>>> static irqreturn_t pvcalls_back_event(int irq, void *dev_id)
>>> {
>>> + struct xenbus_device *dev = dev_id;
>>> + struct pvcalls_fedata *priv = NULL;
>>> +
>>> + if (dev == NULL)
>>> + return IRQ_HANDLED;
>>> +
>>> + priv = dev_get_drvdata(&dev->dev);
>>> + if (priv == NULL)
>>> + return IRQ_HANDLED;
>>> +
>>> + /*
>>> + * TODO: a small theoretical race exists if we try to queue work
>>> + * after pvcalls_back_work checked for final requests and before
>>> + * it returns. The queuing will fail, and pvcalls_back_work
>>> + * won't do the work because it is about to return. In that
>>> + * case, we lose the notification.
>>> + */
>>> + queue_work(priv->wq, &priv->register_work);
>> Would queuing delayed work (if queue_work() failed) help? And canceling
>> it on next invocation of pvcalls_back_event()?
> Looking at the implementation of queue_delayed_work_on and
> queue_work_on, it looks like that if queue_work fails then also
> queue_delayed_work would fail: they both test on
> WORK_STRUCT_PENDING_BIT.

Right, I should have looked at this myself. And flush_work() I suppose
cannot be used here since it may sleep?

Then I also can't think of anything else.

-boris