Re: [PATCH] net: 9p: fix possible refcount leak in p9_read_work() and recv_done()

From: Hangyu Hua
Date: Tue Jul 12 2022 - 04:32:03 EST


On 2022/7/12 14:06, asmadeus@xxxxxxxxxxxxx wrote:
> Hangyu Hua wrote on Tue, Jul 12, 2022 at 11:24:36AM +0800:
>> That's a little weird. If you are right, the three return paths of this
>> function are inconsistent with the handling of refcount.
>>
>> static void p9_read_work(struct work_struct *work)
>> {
>> ...
>> if ((m->rreq) && (m->rc.offset == m->rc.capacity)) {
>> p9_debug(P9_DEBUG_TRANS, "got new packet\n");
>> m->rreq->rc.size = m->rc.offset;
>> spin_lock(&m->client->lock);
>> if (m->rreq->status == REQ_STATUS_SENT) {
>> list_del(&m->rreq->req_list);
>> p9_client_cb(m->client, m->rreq, REQ_STATUS_RCVD); <---- [1]
>> } else if (m->rreq->status == REQ_STATUS_FLSHD) {
>> /* Ignore replies associated with a cancelled request. */
>> p9_debug(P9_DEBUG_TRANS,
>> "Ignore replies associated with a cancelled request\n"); <---- [2]
>> } else {
>> spin_unlock(&m->client->lock);
>> p9_debug(P9_DEBUG_ERROR,
>> "Request tag %d errored out while we were reading the reply\n",
>> m->rc.tag);
>> err = -EIO;
>> goto error; <---- [3]
>> }
>> spin_unlock(&m->client->lock);
>> m->rc.sdata = NULL;
>> m->rc.offset = 0;
>> m->rc.capacity = 0;
>> p9_req_put(m->rreq); <---- [4]
>> m->rreq = NULL;
>> }
>> ...
>> error:
>> p9_conn_cancel(m, err); <---- [5]
>> clear_bit(Rworksched, &m->wsched);
>> }
>>
>> There are three return paths here, [1] and [2] and [3].
>> [1]: m->rreq will be put twice in [1] and [4]. And m->rreq will be deleted
>> from the m->req_list in [1].
>>
>> [2]: m->rreq will be put in [4]. And m->rreq will not be deleted from
>> m->req_list.
>
> when req status got put to FLUSHD the req was dropped from the list
> already and put in p9_fd_cancel, so we shouldn't put it here.
>
>> [3]: m->rreq will be put in [5]. And m->rreq will be deleted from the
>> m->req_list in [5].
>
> On this error case I really can't say anything: it depends on how the
> req got in this state in the first place -- more precisely is it still
> in req_list or not?
>
> But even if it is and we leak it here, we return an error here, so the
> connection will be marked as disconnected and won't be usable anymore.
> The memory will be freed when the user umounts after that.
>
> If we took the time to re-init the rreq->req_list everytime we could
> check if it's empty (don't think we can rely on it being poisoned), but
> I just don't think it's worth it: it's better to consume a bit more
> memory until umount than to risk a UAF.
>
> (note: while writing this I noticed p9_tag_cleanup() in
> p9_client_destroy() only tracks requests still in the idr, so doesn't
> work for requests that went through p9_tag_remove().
> We don't need p9_tag_remove() anymore so I've just gotten rid of it and
> we will catch these now)
>
>
>> If p9_tag_lookup keep the refcount of req which is in m->req_list. There
>> will be a double put in return path [1] and a potential UAF in return path
>> [2]. And this also means a req in m->req_list without getting refcount
>> before p9_tag_lookup.
>
> That is the nominal path, we'd notice immediately if there are too many
> puts there.
> A request is initialized with two refs so that we can have one for the
> transport ((a), for fd, "is the request tracked in a list?") and one for
> the main thread ((b), p9_client_rpc which will put it at the end)
> Then you get a third ref from p9_tag_lookup that I was forgetting about,
> (c).
>
> Going through [1] removes it from the list, and removes the associated
> ref (a), then through p9_client_cb which removes ref (c) and wakes up
> p9_client_rpc which takes the last ref (b), freeing the request.
>

I think the normal path is right beacuase p9_tag_lookup and [4] keep the
balance of refcount. This just proves that error path may have refcount
leak. Beacause error path only put refcount once. In general, either [4]
of the normal path is redundant(like you said, this is easy to catch),
or the error path may have a refcount leak.

But you are right, it'll be caught on umount.

>
> Now you were correct on one of these error paths not described in your
> last mail: we -are- missing a p9_req_ut in the "No recv fcall for tag
> %d" error path shortly after p9_tag_lookup, for the ref obtained from
> p9_tag_lookup itself -- feel free to resend a patch with just that one.
> But once again the connection is now unusable and it'll be caught on
> umount so it's not the end of the world...
>
> (I'd appreciate if you base the new patch on top of
> https://github.com/martinetd/linux/commits/9p-next )
>

I see. I will make a new patch later.

>>
>> static void p9_write_work(struct work_struct *work)
>> {
>> ...
>> list_move_tail(&req->req_list, &m->req_list);
>>
>> m->wbuf = req->tc.sdata;
>> m->wsize = req->tc.size;
>> m->wpos = 0;
>> p9_req_get(req);
>> ...
>> }
>>
>> But if you check out p9_write_work, a refcount already get after
>> list_move_tail. We don't need to rely on p9_tag_lookup to keep a list's
>> refcount.
>
> This refcount is because we are keeping a ref in m->wreq, and is freed
> when m->wreq is set back to null when the packet is done writing a few
> lines below (but possibly in another call of the function).
>
> refs don't have to come from p9_tag_lookup, they're managing pointers
> lifecycle: we're making a copy of the pointer, so we should increment
> the refcount so another thread can't free the req under us. In this case
> the p9_req_get() is under the trans fd m->client->lock where we got the
> req from the list, so req can't be freed between its obtention from the
> list and then; once the lock is dropped the req is protected by the ref.
>
>

My fault. I misunderstood here.

Thanks,
Hangyu



>> Whatsmore, code comments in p9_tag_alloc also proves that the
>> refcount get by p9_tag_lookup is a temporary refcount.
>
> comments don't prove anything, but yes I forgot p9_tag_alloc takes a ref
> when I commented earlier, sorry.
>
>>> This one isn't as clear cut, I see that they put the client in a
>>> FLUSHING state but nothing seems to acton on it... But if this happens
>>> we're already in the use after free realm -- it means rc.sdata was
>>> already set so the other thread could be calling p9_client_cb anytime if
>>> it already hasn't, and yet another thread will then do the final ref put
>>> and free this.
>>> We shouldn't free this here as that would also be an overflow. The best
>>> possible thing to do at this point is just to stop using that pointer.
>>>
>>
>> But p9_tag_lookup have a lock inside. Doesn't this mean p9_tag_lookup won't
>> return a freed req? Otherwise we should fix the lock to avoid falling into
>> the use after free realm.
>
> Right, that falls into the p9_tag_lookup ref, I had implemented this
> better than I thought I did...
>
> I agree that one is also more correct to add, although I'd really want
> to make some rdma setup and trigger a few errors to test.
> > --
> Dominique