Re: [PATCH] 9p/trans_fd: mark concurrent read and writes to p9_conn->err

From: Dominique Martinet
Date: Sat Mar 08 2025 - 17:18:11 EST


Ignacio Encinas wrote on Sat, Mar 08, 2025 at 06:47:38PM +0100:
> Writes for the error value of a connection are spinlock-protected inside
> p9_conn_cancel, but lockless reads are present elsewhere to avoid
> performing unnecessary work after an error has been met.
>
> Mark the write and lockless reads to make KCSAN happy. Mark the write as
> exclusive following the recommendation in "Lock-Protected Writes with
> Lockless Reads" in tools/memory-model/Documentation/access-marking.txt
> while we are at it.

Thanks for looking into it, I wasn't aware this could be enough to please
the KCSAN gods.

Unfortunately neither report has a repro, so this will be hard to test,
but I guess it can't hurt, so I will pick this up after a bit.

> Reported-by: syzbot+d69a7cc8c683c2cb7506@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+483d6c9b9231ea7e1851@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Ignacio Encinas <ignacio@xxxxxxxxxxxx>
> ---
> Hello! I noticed these syzbot reports that seem to repeat periodically
> and figured I should send a patch.
>
> The read-paths look very similar to the one changed here [1]. Perhaps it
> would make sense to make them the same?

I've just gone over the read/write work and I think overall the logic
doesn't look too bad, as the checks for m->err are just optimizations
that could be skipped entirely.

For example, even if read work misses the check and receives some data,
p9_tag_lookup is what actually protects the "req": either cancel hasn't
cancelled the request yet, in which case it'll get two status updates
but the memory is valid and the refcounting is also correct, or the
cancel has finished and read won't find the request.
(I guess one could argue that two status updates could be a problem in
the p9_client_rpc path, but the data actually has been received and the
mount is busted anyway, so I don't think any bad bug would happen...
Famous last words, yes.)

Write likewise will just find itself with nothing to do, as the list has
been emptied (and p9_fd_request does check m->err under lock, so it
can't add new items).

So, sure, they could recheck but I don't see the point; if syzbot is
happy with this patch I think that's good enough.
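
To make the argument above concrete, here is a rough userspace sketch of
the "lock-protected write, lockless read" pattern. It is not the
trans_fd code: struct conn, conn_cancel, conn_request and
conn_work_should_skip are made-up names, a pthread mutex stands in for
m->req_lock, a plain counter stands in for the request lists, and C11
relaxed atomics are only an approximation of READ_ONCE/WRITE_ONCE
(ASSERT_EXCLUSIVE_WRITER has no userspace equivalent and is left out).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct p9_conn; only what the sketch needs. */
struct conn {
	pthread_mutex_t req_lock;	/* plays the role of m->req_lock */
	atomic_int err;			/* plays the role of m->err */
	int pending;			/* stands in for the request lists */
};

/* Assumption: relaxed atomics approximate READ_ONCE()/WRITE_ONCE(). */
#define READ_ERR(c)	atomic_load_explicit(&(c)->err, memory_order_relaxed)
#define WRITE_ERR(c, v)	atomic_store_explicit(&(c)->err, (v), memory_order_relaxed)

static void conn_init(struct conn *c)
{
	pthread_mutex_init(&c->req_lock, NULL);
	atomic_init(&c->err, 0);
	c->pending = 0;
}

/* Cancel path: the only writer of err, always under the lock. */
static void conn_cancel(struct conn *c, int err)
{
	assert(err != 0);	/* 0 would defeat the "already failed" check */

	pthread_mutex_lock(&c->req_lock);
	if (READ_ERR(c)) {
		/* First error wins; later cancels are no-ops. */
		pthread_mutex_unlock(&c->req_lock);
		return;
	}
	WRITE_ERR(c, err);
	c->pending = 0;		/* "the list has been emptied" */
	pthread_mutex_unlock(&c->req_lock);
}

/* Request path: checks err under the lock, so nothing is queued after cancel. */
static int conn_request(struct conn *c)
{
	int err;

	pthread_mutex_lock(&c->req_lock);
	err = READ_ERR(c);
	if (!err)
		c->pending++;
	pthread_mutex_unlock(&c->req_lock);
	return err;
}

/*
 * Work path: lockless check, purely an optimization. Missing a concurrent
 * cancel here only means doing work that will find nothing to act on.
 */
static int conn_work_should_skip(struct conn *c)
{
	return READ_ERR(c) < 0;
}
```

The point of the sketch is that correctness comes from the locked check
in conn_request and the locked write in conn_cancel; the lockless check
only saves work, which is why marking the accesses should be enough here.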


> [1] https://lore.kernel.org/all/ZTZtHdqifXlWG8nN@xxxxxxxxxxxxx/
> ---
> net/9p/trans_fd.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index 196060dc6138af10e99ad04a76ee36a11f770c65..5458e6530084cabeb01d13e9b9a4b1b8f338e494 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -194,9 +194,10 @@ static void p9_conn_cancel(struct p9_conn *m, int err)
> if (m->err) {

This is under spin lock and I don't see the compiler reordering this
read and write, but should this also get READ_ONCE?

> spin_unlock(&m->req_lock);
> return;
> }
>
> - m->err = err;
> + WRITE_ONCE(m->err, err);
> + ASSERT_EXCLUSIVE_WRITER(m->err);

Thanks,
--
Dominique Martinet | Asmadeus