Re: [patch v8 0/7] handle unexpected message from server

From: yukuai (C)
Date: Fri Oct 08 2021 - 03:17:33 EST


On 2021/09/29 20:54, yukuai (C) wrote:
On 2021/09/23 21:33, yukuai (C) wrote:
On 2021/09/16 17:33, Yu Kuai wrote:

Hi, jens

Any interest to apply this series?

friendly ping ...

Hi, Jens

friendly ping again ...

Thanks,
Kuai
This patch set tries to fix that client might oops if nbd server send
unexpected message to client, for example, our syzkaller report a uaf
in nbd_read_stat():

Call trace:
  dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
  show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x144/0x1b4 lib/dump_stack.c:118
  print_address_description+0x68/0x2d0 mm/kasan/report.c:253
  kasan_report_error mm/kasan/report.c:351 [inline]
  kasan_report+0x134/0x2f0 mm/kasan/report.c:409
  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
  __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
  __read_once_size include/linux/compiler.h:193 [inline]
  blk_mq_rq_state block/blk-mq.h:106 [inline]
  blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
  nbd_read_stat drivers/block/nbd.c:670 [inline]
  recv_work+0x1bc/0x890 drivers/block/nbd.c:749
  process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
  worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
  kthread+0x1d8/0x1e0 kernel/kthread.c:255
  ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174

1) At first, a normal io is submitted and completed with scheduler:

internel_tag = blk_mq_get_tag -> get tag from sched_tags
  blk_mq_rq_ctx_init
   sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
...
blk_mq_get_driver_tag
  __blk_mq_get_driver_tag -> get tag from tags
  tags->rq[tag] = sched_tag->static_rq[internel_tag]

So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
to the request: sched_tags->static_rq[internal_tag]. Even if the
io is finished.

2) nbd server send a reply with random tag directly:

recv_work
  nbd_read_stat
   blk_mq_tag_to_rq(tags, tag)
    rq = tags->rq[tag]

3) if the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
  blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
   -> step 2) access rq before clearing rq mapping
   blk_mq_clear_rq_mapping(set, tags, hctx_idx);
   __free_pages() -> rq is freed here

4) Then, nbd continue to use the freed request in nbd_read_stat()

Changes in v8:
  - add patch 5 to this series.
  - modify some words.
Changes in v7:
  - instead of exposing blk_queue_exit(), using percpu_ref_put()
  directly.
  - drop the ref right after nbd_handle_reply().
Changes in v6:
  - don't set cmd->status to error if request is completed before
  nbd_clear_req().
  - get 'q_usage_counter' to prevent accessing freed request through
  blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
Changes in v5:
  - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
  - add some comment in patch 5
Changes in v4:
  - change the name of the patchset, since uaf is not the only problem
  if server send unexpected reply message.
  - instead of adding new interface, use blk_mq_find_and_get_req().
  - add patch 5 to this series
Changes in v3:
  - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
  - modify descriptions.
  - patch 5 is just a cleanup
Changes in v2:
  - as Bart suggested, add a new helper function for drivers to get
  request by tag.

Yu Kuai (7):
   nbd: don't handle response without a corresponding request message
   nbd: make sure request completion won't concurrent
   nbd: check sock index in nbd_read_stat()
   nbd: don't start request if nbd_queue_rq() failed
   nbd: clean up return value checking of sock_xmit()
   nbd: partition nbd_read_stat() into nbd_read_reply() and
     nbd_handle_reply()
   nbd: fix uaf in nbd_handle_reply()

  drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
  1 file changed, 96 insertions(+), 39 deletions(-)