Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()

From: Ming Lei
Date: Tue Sep 14 2021 - 10:33:41 EST


On Tue, Sep 14, 2021 at 05:08:00PM +0800, yukuai (C) wrote:
> On 2021/09/14 15:46, Ming Lei wrote:
> > On Tue, Sep 14, 2021 at 03:13:38PM +0800, yukuai (C) wrote:
> > > On 2021/09/14 14:44, Ming Lei wrote:
> > > > On Tue, Sep 14, 2021 at 11:11:06AM +0800, yukuai (C) wrote:
> > > > > On 2021/09/14 9:11, Ming Lei wrote:
> > > > > > On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
> > > > > > > > blk_mq_tag_to_rq() is only guaranteed to return a valid request in
> > > > > > > > the following situation:
> > > > > > > >
> > > > > > > > 1) client sends request message to server first
> > > > > > > submit_bio
> > > > > > > ...
> > > > > > > blk_mq_get_tag
> > > > > > > ...
> > > > > > > blk_mq_get_driver_tag
> > > > > > > ...
> > > > > > > nbd_queue_rq
> > > > > > > nbd_handle_cmd
> > > > > > > nbd_send_cmd
> > > > > > >
> > > > > > > > 2) client receives response message from server
> > > > > > > recv_work
> > > > > > > nbd_read_stat
> > > > > > > blk_mq_tag_to_rq
> > > > > > >
> > > > > > > If step 1) is missing, blk_mq_tag_to_rq() will return a stale
> > > > > > > request, which might be freed. Thus convert to use
> > > > > > > blk_mq_find_and_get_req() to make sure the returned request is not
> > > > > > > freed.
> > > > > >
> > > > > > > But NBD_CMD_INFLIGHT has been added for checking if the reply is
> > > > > > > expected, do we still need blk_mq_find_and_get_req() for covering
> > > > > > > this issue? BTW, request and its payload are pre-allocated, so there
> > > > > > > isn't a real use-after-free.
> > > > >
> > > > > Hi, Ming
> > > > >
> > > > > > Checking NBD_CMD_INFLIGHT relies on the request found by tag being
> > > > > > valid, not the other way round.
> > > > >
> > > > > nbd_read_stat
> > > > > req = blk_mq_tag_to_rq()
> > > > > cmd = blk_mq_rq_to_pdu(req)
> > > > > mutex_lock(cmd->lock)
> > > > > checking NBD_CMD_INFLIGHT
> > > >
> > > > > Request and its payload are pre-allocated, and either req->ref or cmd->lock can
> > > > > serve the same purpose here. Once cmd->lock is held, you can check if the cmd is
> > > > > inflight or not. If it isn't inflight, just return -ENOENT. Is there any
> > > > > problem with handling it this way?
> > >
> > > Hi, Ming
> > >
> > > in nbd_read_stat:
> > >
> > > 1) get a request by tag first
> > > 2) get nbd_cmd by the request
> > > 3) hold cmd->lock and check if cmd is inflight
> > >
> > > > If we want to check if the cmd is inflight in step 3), we have to do
> > > > step 1) and 2) first. As I explained in patch 0, blk_mq_tag_to_rq()
> > > > can't make sure the returned request is not freed:
> > >
> > > > nbd_read_stat
> > > >                                  blk_mq_sched_free_requests
> > > >                                   blk_mq_free_rqs
> > > >  blk_mq_tag_to_rq
> > > >  -> get rq before clear mapping
> > > >                                    blk_mq_clear_rq_mapping
> > > >                                    __free_pages -> rq is freed
> > > >  blk_mq_request_started -> UAF
> >
> > If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
>
> Hi, Ming
>
> Why can't blk_mq_find_and_get_req() fix it? I can't think of any
> scenario that might have problem currently.

The principle behind blk_mq_find_and_get_req() is that if one request's
ref is grabbed, the queue's usage counter is guaranteed to be grabbed
too, and this approach isn't straightforward.

Yeah, it can fix the issue, but I don't think it is good to call it in
the fast path, because tags->lock is required.
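
To make the cost concrete, here is a minimal userspace sketch of a
lookup that pins a request under a lock. It is only a model: every
model_* name is hypothetical, a C11 atomic_flag spinlock stands in for
tags->lock, and none of this is the kernel code. The point is that each
lookup, including every reply handled in the receive path, has to take
the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_TAGS 4

/* Hypothetical stand-in for struct request. */
struct model_rq {
	int tag;
	atomic_int ref;		/* 0 means the request is not in use */
};

/* Hypothetical stand-in for struct blk_mq_tags. */
struct model_tags {
	atomic_flag lock;	/* models tags->lock as a spinlock */
	struct model_rq *rqs[MODEL_NR_TAGS];
};

static void model_tags_init(struct model_tags *tags)
{
	atomic_flag_clear(&tags->lock);
	for (int i = 0; i < MODEL_NR_TAGS; i++)
		tags->rqs[i] = NULL;
}

static void model_lock(struct model_tags *tags)
{
	while (atomic_flag_test_and_set_explicit(&tags->lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void model_unlock(struct model_tags *tags)
{
	atomic_flag_clear_explicit(&tags->lock, memory_order_release);
}

/* Take a reference only if the count is currently non-zero. */
static bool model_ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Look up a request by tag and pin it; every call takes the lock. */
static struct model_rq *model_find_and_get_req(struct model_tags *tags, int tag)
{
	struct model_rq *rq;

	if (tag < 0 || tag >= MODEL_NR_TAGS)
		return NULL;

	model_lock(tags);
	rq = tags->rqs[tag];
	if (rq && (rq->tag != tag || !model_ref_inc_not_zero(&rq->ref)))
		rq = NULL;
	model_unlock(tags);
	return rq;
}

/* Clearing the mapping takes the same lock, so it cannot run in the
 * middle of a lookup. */
static void model_clear_rq_mapping(struct model_tags *tags, int tag)
{
	assert(tag >= 0 && tag < MODEL_NR_TAGS);
	model_lock(tags);
	tags->rqs[tag] = NULL;
	model_unlock(tags);
}
```

In this model a lookup returns NULL once the slot has been cleared or
the reference count has dropped to zero, which is the protection being
discussed, but it pays for the lock on every reply.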

>
> > wondering why not take the following simpler way for avoiding the UAF?
> >
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index 5170a630778d..dfa5cce71f66 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
> > work);
> > struct nbd_device *nbd = args->nbd;
> > struct nbd_config *config = nbd->config;
> > + struct request_queue *q = nbd->disk->queue;
> > struct nbd_cmd *cmd;
> > struct request *rq;
> > + if (!percpu_ref_tryget(&q->q_usage_counter))
> > + return;
> > +
>
> We can't make sure freeze_queue is called before this, thus this approach
> can't fix the problem, right?
> nbd_read_stat
>  blk_mq_tag_to_rq
>                                  elevator_switch
>                                   blk_mq_freeze_queue(q);
>                                   elevator_switch_mq
>                                    elevator_exit
>                                     blk_mq_sched_free_requests
>  blk_mq_request_started -> UAF

No, blk_mq_freeze_queue() waits until ->q_usage_counter becomes zero, so
there won't be any concurrent nbd_read_stat() during an elevator switch
if ->q_usage_counter is grabbed in recv_work().
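
The ordering can be sketched with a small userspace model. This is a
simplification under stated assumptions, not the kernel implementation:
all model_* names are hypothetical, a plain C11 atomic stands in for the
percpu ref, and the freeze side is reduced to a "has the count reached
zero" check instead of a sleeping wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of q->q_usage_counter. */
struct model_queue {
	atomic_long usage;
};

/* The counter starts with one initial reference owned by the queue. */
static void model_queue_init(struct model_queue *q)
{
	atomic_init(&q->usage, 1);
}

/* Models a tryget: succeeds only while the count is non-zero. */
static bool model_usage_tryget(struct model_queue *q)
{
	long old = atomic_load(&q->usage);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&q->usage, &old, old + 1))
			return true;
	}
	return false;
}

static void model_usage_put(struct model_queue *q)
{
	long old = atomic_fetch_sub(&q->usage, 1);

	assert(old > 0);	/* a put must pair with an earlier get */
}

/* Start of a freeze: drop the queue's initial reference. */
static void model_freeze_start(struct model_queue *q)
{
	model_usage_put(q);
}

/* The freezer may free requests only once this returns true. */
static bool model_queue_frozen(struct model_queue *q)
{
	return atomic_load(&q->usage) == 0;
}

/* End of a freeze: restore the initial reference. */
static void model_unfreeze(struct model_queue *q)
{
	assert(model_queue_frozen(q));
	atomic_store(&q->usage, 1);
}
```

In this model, while the receive side holds a reference, a freeze that
has started cannot complete, so the requests are not freed underneath
it. Once the queue is frozen, a new tryget fails and the receive side
never gets as far as the tag lookup.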

Thanks,
Ming