Re: [PATCH V1] block: Fix use-after-free while iterating over requests

From: Hannes Reinecke
Date: Mon Nov 30 2020 - 02:09:27 EST


On 11/26/20 5:49 PM, John Garry wrote:
On 26/11/2020 16:27, Bart Van Assche wrote:
On 11/26/20 7:02 AM, Pradeep P V K wrote:
Observes below crash while accessing (use-after-free) request queue
member of struct request.

191.784789:   <2> Unable to handle kernel paging request at virtual
address ffffff81429a4440
...
191.786174:   <2> CPU: 3 PID: 213 Comm: kworker/3:1H Tainted: G S
O      5.4.61-qgki-debug-ge45de39 #1
...
191.786226:   <2> Workqueue: kblockd blk_mq_timeout_work
191.786242:   <2> pstate: 20c00005 (nzCv daif +PAN +UAO)
191.786261:   <2> pc : bt_for_each+0x114/0x1a4
191.786274:   <2> lr : bt_for_each+0xe0/0x1a4
...
191.786494:   <2> Call trace:
191.786507:   <2>  bt_for_each+0x114/0x1a4
191.786519:   <2>  blk_mq_queue_tag_busy_iter+0x60/0xd4
191.786532:   <2>  blk_mq_timeout_work+0x54/0xe8
191.786549:   <2>  process_one_work+0x2cc/0x568
191.786562:   <2>  worker_thread+0x28c/0x518
191.786577:   <2>  kthread+0x160/0x170
191.786594:   <2>  ret_from_fork+0x10/0x18
191.786615:   <2> Code: 0b080148 f9404929 f8685921 b4fffe01 (f9400028)
191.786630:   <2> ---[ end trace 0f1f51d79ab3f955 ]---
191.786643:   <2> Kernel panic - not syncing: Fatal exception

Fix this by updating the freed request with NULL.
This could avoid accessing the already free request from other
contexts while iterating over the requests.

Signed-off-by: Pradeep P V K <ppvk@xxxxxxxxxxxxxx>
---
  block/blk-mq.c | 1 +
  block/blk-mq.h | 1 +
  2 files changed, 2 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 55bcee5..9996cb1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -492,6 +492,7 @@ static void __blk_mq_free_request(struct request *rq)
      blk_crypto_free_request(rq);
      blk_pm_mark_last_busy(rq);
+    hctx->tags->rqs[rq->tag] = NULL;
      rq->mq_hctx = NULL;
      if (rq->tag != BLK_MQ_NO_TAG)
          blk_mq_put_tag(hctx->tags, ctx, rq->tag);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a52703c..8747bf1 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -224,6 +224,7 @@ static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx)
  static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
                         struct request *rq)
  {
+    hctx->tags->rqs[rq->tag] = NULL;
      blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
      rq->tag = BLK_MQ_NO_TAG;

Is this perhaps a block driver bug instead of a block layer core bug? If
this would be a block layer core bug, it would have been reported before.

Isn't this the same issue which as been reported many times:

https://lore.kernel.org/linux-block/20200820180335.3109216-1-ming.lei@xxxxxxxxxx/

https://lore.kernel.org/linux-block/8376443a-ec1b-0cef-8244-ed584b96fa96@xxxxxxxxxx/

But I never saw a crash, just kasan report.

And if that above were a concern, I would have thought one would need to use a WRITE_ONCE() here; otherwise we might have a race condition where other CPUs still see the old value, no?

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer