Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly

From: Juergen Gross
Date: Mon May 17 2021 - 10:28:59 EST


On 17.05.21 16:11, Jan Beulich wrote:
> On 13.05.2021 12:02, Juergen Gross wrote:
>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>      spin_lock_irqsave(&rinfo->ring_lock, flags);
>>   again:
>>      rp = rinfo->ring.sring->rsp_prod;
>> +    if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>> +        pr_alert("%s: illegal number of responses %u\n",
>> +                 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>> +        goto err;
>> +    }
>>      rmb(); /* Ensure we see queued responses up to 'rp'. */
>
> I think you want to insert after the barrier.

Why? The only variable relevant to the check is "rp". The result of the
check does not depend on the responses themselves in any way. And any
change of rsp_cons is protected by ring_lock, so there is no possibility
of reading a stale value here.
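
To illustrate that the check looks only at the two indices, here is a
minimal userspace sketch. It is not the kernel code: toy_ring,
toy_prod_overflow() and the fixed RING_SIZE are made-up names, and the
comparison is only my assumption of what such an overflow test boils
down to.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 32u  /* hypothetical number of ring slots */

/* Simplified stand-in for the frontend's private ring state. */
struct toy_ring {
    uint32_t rsp_cons;  /* consumer index, only modified under the ring lock */
};

/*
 * Returns true if the producer index 'rp' claims more outstanding
 * responses than the ring can hold. Only 'rp' and 'rsp_cons' are read;
 * no response slot is touched, so no ordering against the response
 * contents is needed for this test. The unsigned subtraction stays
 * correct when the free-running indices wrap around.
 */
bool toy_prod_overflow(const struct toy_ring *r, uint32_t rp)
{
    return (uint32_t)(rp - r->rsp_cons) > RING_SIZE;
}
```

With rsp_cons == 10, a producer index of 42 (exactly one full ring) is
still accepted, while 43 is rejected, as is any producer index that
lies behind the consumer.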


>> @@ -1680,6 +1707,11 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>      spin_unlock_irqrestore(&rinfo->ring_lock, flags);
>>      return IRQ_HANDLED;
>> +
>> + err:
>> +    info->connected = BLKIF_STATE_ERROR;
>> +    pr_alert("%s disabled for further use\n", info->gd->disk_name);
>> +    return IRQ_HANDLED;
>>  }
>
> Am I understanding that a suspend (and then resume) can be used to
> recover from error state? If so - is this intentional? If so in turn,
> would it make sense to spell this out in the description?

I'd call it a nice side effect rather than an intention. I can add a
remark to the commit message if you want.
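
Roughly, the side effect can be pictured with the following toy state
machine. This is a sketch only: the toy_* names are hypothetical, and it
assumes that the resume path re-establishes the connection without
looking at the previous state, which is a simplification of the real
driver.

```c
/* Hypothetical connection states, loosely modelled on the discussion. */
enum toy_state { TOY_DISCONNECTED, TOY_CONNECTED, TOY_ERROR };

struct toy_dev {
    enum toy_state connected;
};

/* Interrupt path: an invalid backend response latches the error state. */
void toy_irq(struct toy_dev *d, int response_valid)
{
    if (d->connected != TOY_CONNECTED)
        return;                      /* device is ignored once disabled */
    if (!response_valid)
        d->connected = TOY_ERROR;
}

/* New I/O is accepted only while the device is connected. */
int toy_can_submit(const struct toy_dev *d)
{
    return d->connected == TOY_CONNECTED;
}

/*
 * Resume path (assumption): the connection is set up from scratch, so
 * a previously latched error state is overwritten.
 */
void toy_resume(struct toy_dev *d)
{
    d->connected = TOY_CONNECTED;
}
```

Once toy_irq() has seen a bad response, toy_can_submit() returns 0
until toy_resume() has run, which is the recovery path being asked
about.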


Juergen
