Re: [PATCH] 9p: fix crash when transaction killed
From: Schspa Shi
Date: Wed Nov 30 2022 - 08:22:43 EST
asmadeus@xxxxxxxxxxxxx writes:
> Schspa Shi wrote on Wed, Nov 30, 2022 at 04:14:32PM +0800:
>> > - reqs are allocated in a kmem_cache created with SLAB_TYPESAFE_BY_RCU.
>> > This means that if we get a req from idr_find, even if it has just been
>> > freed, it either is still in the state it was freed at (hence refcount
>> > 0, we ignore it) or is another req coming from the same cache (if
>>
>> If the req was newly allocated (i.e. it comes from a new page), its
>> refcount may not be 0, and there will be a problem in that case. It
>> seems we can't rely on this.
>>
>> We need to set the refcount to zero before adding it to the idr in
>> p9_tag_alloc.
>
> Hmm, if it's reused then it's zero by definition, but if it's a new
> allocation (uninitialized) then anything goes; that lookup could find
> and increase it before the refcount_set, and we'd have an off by one
> leading to use after free. Good catch!
>
> Initializing it to zero will lead to the client busy-looping until after
> the refcount is properly set, which should work.
Why? It looks no different from the previous flow here. Initializing
it to zero should make no difference.
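To make that concrete: the lookup already retries whenever it sees a
refcount of 0, so a req published with refcount 0 is treated just like
a freed one. A simplified sketch of the two sides (loosely based on
net/9p/client.c, not verbatim):

	/* Lookup side: runs under RCU and may observe a req that is
	 * concurrently freed or recycled (SLAB_TYPESAFE_BY_RCU).
	 */
	rcu_read_lock();
again:
	req = idr_find(&c->reqs, tag);
	if (req) {
		/* refcount 0 means freed or not yet published: retry */
		if (!refcount_inc_not_zero(&req->refcount))
			goto again;
		/* a recycled req may carry a different tag: drop it */
		if (req->tc.tag != tag) {
			p9_req_put(req); /* drop the ref we just took */
			goto again;
		}
	}
	rcu_read_unlock();

	/* Alloc side: a fresh kmem_cache_alloc() leaves the refcount
	 * uninitialized, so it must be forced to 0 before the req
	 * becomes findable through the idr, and only bumped to its
	 * real value once the req is fully initialized.
	 */
	refcount_set(&req->refcount, 0);
	/* ... idr_alloc() under c->lock, set req->tc.tag ... */
	refcount_set(&req->refcount, 2);

With that ordering, the lookup can never see nonzero garbage in the
refcount: it sees either 0 (ignore/retry) or 2 (fully initialized).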
> Setting the refcount early might have us use a re-used req before the
> tag has been changed, so that approach won't work.
>
> Could you test whether syzbot still reproduces this bug with just that
> change? (perhaps add a comment if you send this)
>
I have uploaded a v2 change for this, but I can't easily reproduce
this problem.
> ------
> diff --git a/net/9p/client.c b/net/9p/client.c
> index aaa37b07e30a..aa64724f6a69 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -297,6 +297,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, uint t_size, uint r_size,
> p9pdu_reset(&req->rc);
> req->t_err = 0;
> req->status = REQ_STATUS_ALLOC;
> + refcount_set(&req->refcount, 0);
> init_waitqueue_head(&req->wq);
> INIT_LIST_HEAD(&req->req_list);
>
> -----
>
>> > refcount isn't zero, we can check its tag)
>>
>> As for the release case, the next request will get the same tag with
>> high probability. It would be better to make the tag an increasing
>> sequence, which would avoid most of this kind of req reuse.
>
> I'd love to be able to do this, but it would break some servers that
> assume tags are small (e.g. using the tag as an index into a tag
> array)... I thought nfs-ganesha was doing this, but they properly put
> it in buckets, so that's one less server to worry about. Still, I
> wouldn't put it past some simple servers to do that; having a way to
> look up a given tag for flush is an implementation requirement.
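
Fair point about tags needing to stay small. For what it's worth, a
cyclic allocation would keep tags in the same 16-bit range while still
avoiding immediate reuse; a minimal, untested sketch against the
existing c->reqs idr:

	spin_lock_irq(&c->lock);
	/* idr_alloc_cyclic() hands out ids in increasing order and
	 * wraps around at P9_NOTAG, so a just-freed tag is not handed
	 * out again right away, yet tags still fit in 16 bits for
	 * simple servers.
	 */
	tag = idr_alloc_cyclic(&c->reqs, req, 0, P9_NOTAG, GFP_NOWAIT);
	req->tc.tag = tag;
	spin_unlock_irq(&c->lock);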
>
> That shouldn't be a problem though, as it will just lead to either
> failing the guard check after lookup (m->rreq->status !=
> REQ_STATUS_SENT) or being processed as a normal reply if it has
> already been sent by the other thread at this point.
> OTOH, that m->rreq->status isn't protected by m->req_lock in trans_fd,
> and that is probably another bug...
Yes, I agree with you; that is another bug.
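A rough, untested sketch of what taking m->req_lock around the status
check in p9_read_work() might look like (assuming the lock already
guards the req_list manipulation; error paths omitted):

	spin_lock(&m->req_lock);
	/* Re-check the status under the lock so a concurrent flush
	 * cannot change it between the check and the callback below.
	 */
	if (m->rreq->status == REQ_STATUS_SENT) {
		list_del(&m->rreq->req_list);
		spin_unlock(&m->req_lock);
		p9_client_cb(m->client, m->rreq, REQ_STATUS_RCVD);
	} else {
		/* e.g. already flushed: drop the reply instead */
		spin_unlock(&m->req_lock);
	}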
--
BRs
Schspa Shi