Re: [PATCH v2] 9p/trans_fd: mark concurrent read and writes to p9_conn->err

From: Ignacio Encinas Rubio
Date: Tue Mar 18 2025 - 17:22:06 EST


Trimming CC to avoid spamming people (I hope that's ok)

Hello Dominique!

On 17/3/25 18:01, Ignacio Encinas Rubio wrote:
> On 16/3/25 22:24, Dominique Martinet wrote:
>> There's this access out of the lock so perhaps this should look like
>> this instead (with or without the READ_ONCE)
>>
>> + err = READ_ONCE(m->err);
>> + if (err < 0) {
>> spin_unlock(&m->req_lock);
>> - return m->err;
>> + return err;
>
> Oops, this is embarrassing... Thanks for catching it.
>
>> Anyway, m->err is only written exactly once so it doesn't matter the
>> least in practice,
>
> I think this one deserves a fix, I disagree :)
>
>> and it looks like gcc generates exactly the same
>> thing (... even if I make that `return READ_ONCE(m->err)` which
>> surprises me a bit..), so this is just yak shaving.
>
> This is weird... I'll double check because it shouldn't generate the
> same code as far as I know.

I had a bit of time to check this. If I understood correctly, you said that (A)

	err = READ_ONCE(m->err);
	if (err < 0) {
		spin_unlock(&m->req_lock);
		return READ_ONCE(m->err);
	}

compiles to the same thing as (B)

	err = READ_ONCE(m->err);
	if (err < 0) {
		spin_unlock(&m->req_lock);
		return err;
	}

If you didn't say this, just ignore this email :). With gcc (GCC)
14.2.1 20250110 (Red Hat 14.2.1-7) I'm seeing a difference:

``` (A)
movl 40(%rbx), %eax # MEM[(const volatile int *)ts_13 + 40B], _14
# net/9p/trans_fd.c:679: if (err < 0) {
testl %eax, %eax # _14
js .L323 #,

[...]

.L323:
# ./include/linux/spinlock.h:391: raw_spin_unlock(&lock->rlock);
movq %r12, %rdi # _21,
call _raw_spin_unlock #
# net/9p/trans_fd.c:681: return READ_ONCE(m->err);
movl 40(%rbx), %eax # MEM[(const volatile int *)ts_13 + 40B], <retval>
# net/9p/trans_fd.c:697: }
popq %rbx #
popq %rbp #
popq %r12 #
jmp __x86_return_thunk
```

``` (B)
movl 40(%rbx), %r12d # MEM[(const volatile int *)ts_13 + 40B], <retval>
# net/9p/trans_fd.c:679: if (err < 0) {
testl %r12d, %r12d # <retval>
js .L323 #,

[...]

.L323:
# ./include/linux/spinlock.h:391: raw_spin_unlock(&lock->rlock);
movq %r13, %rdi # _20,
call _raw_spin_unlock #
# net/9p/trans_fd.c:697: }
movl %r12d, %eax # <retval>,
popq %rbx #
popq %rbp #
popq %r12 #
popq %r13 #
jmp __x86_return_thunk
```

(A) performs a second memory read after the spinlock has been unlocked,
while (B) reuses the value already held in a register. If you're using an
old GCC, it might have bugs. I can't recall where exactly, but I have seen
links to GCC bugs regarding this kind of issue somewhere (LWN posts or
kernel docs?)
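
For illustration, here is a minimal user-space sketch of why the two
variants are not equivalent at the source level. Everything in it is an
assumption made for the example: READ_ONCE_INT() is a simplified stand-in
for READ_ONCE() (just a volatile load of an int), struct conn is a
hypothetical reduction of the connection struct, and unlock_and_race()
replaces spin_unlock() with a hook that simulates another writer changing
err right after the lock is dropped. As noted above, m->err is only
written once in practice, so the second write is purely hypothetical.

```c
/* Simplified stand-in for the kernel's READ_ONCE(): force a real load. */
#define READ_ONCE_INT(x) (*(const volatile int *)&(x))

/* Hypothetical reduction of the connection struct to the one field we need. */
struct conn {
	int err;
};

/*
 * Stand-in for spin_unlock(): instead of releasing a lock, simulate a
 * concurrent writer storing a new value right after the unlock.
 */
static void unlock_and_race(struct conn *m, int raced_err)
{
	m->err = raced_err;
}

/* Variant (A): loads m->err again after the "unlock". */
static int variant_a(struct conn *m, int raced_err)
{
	int err = READ_ONCE_INT(m->err);

	if (err < 0) {
		unlock_and_race(m, raced_err);
		return READ_ONCE_INT(m->err);	/* second load, outside the lock */
	}
	return 0;
}

/* Variant (B): returns the value that was read under the "lock". */
static int variant_b(struct conn *m, int raced_err)
{
	int err = READ_ONCE_INT(m->err);

	if (err < 0) {
		unlock_and_race(m, raced_err);
		return err;			/* reuses the first load */
	}
	return 0;
}
```

Starting from err == -1 with a simulated racing store of -2, variant_a()
returns the raced value while variant_b() returns the value it checked,
which matches the extra movl seen after _raw_spin_unlock in (A).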

To get the assembly I just took the command from .trans_fd.o.cmd and
added "-S -fverbose-asm" (I can't really read x86 assembly).