Re: [PATCH] irqchip: riscv: Order normal writes and IPI writes
From: Thomas Gleixner
Date: Mon Jan 20 2025 - 02:37:56 EST
On Fri, Jan 17 2025 at 21:23, Anup Patel wrote:
> On Fri, Jan 17, 2025 at 4:05 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote:
>> > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote:
>> >> Replace writel_relaxed() with writel() when issuing IPI to ensure all
>> >> previous write operations made by current CPU are visible to other CPUs.
>> >
>> > Did you experience an ordering issue from this?
>>
>> That's not the right question.
>>
>> CPU 0 CPU 1
>> store A // data
>> store B // IPI
>> IPI handler
>> load A
>>
>> The real question is whether the RISC-V memory model guarantees under
>> all circumstances that A is globally visible before the IPI handler
>> load. If so, then the writel_relaxed() is fine. If not, the fence is
>> required.
>>
>> That's not a question of observation. It's a question of facts defined
>> by the architecture. People have wasted months to analyze such fails
>> which tend to happen once every blue moon with no other trace than
>> "something went wrong" ....
>
> The RISC-V FENCE instruction distinguishes between normal
> memory operations and I/O operations in its predecessor and
> successor sets where r = normal read, w = normal write,
> i = I/O read, and o = I/O write.
>
> The ipi_mux_send_mask() does smp_mb__after_atomic() (equals
> to "fence rw,rw") before calling imsic_ipi_send(). This prevents
> ordering of normal memory writes in imsic_ipi_send() before
> smp_mb__after_atomic() in ipi_mux_send_mask() but it does
> not prevent I/O memory writes in imsic_ipi_send() to be ordered
> before smp_mb__after_atomic().
>
> This means currently nothing prevents the I/O memory write in
> imsic_ipi_send() to be ordered before normal memory writes in
> ipi_mux_send_mask() hence there is a very unlikely possibility
> of an IPI handler on the target CPU seeing incorrect data.
Very unlikely is a valid assumption for a single system, but at scale it
becomes very likely :)
> The conversion of writel_relaxed() to writel() in imsic_ipi_send()
> adds a "fence w,o" before the actual I/O memory write which
> ensures that I/O memory write is not ordered before normal
> memory writes.
>
> Based on the above, the conversion of writel_relaxed() to
> writel() in imsic_ipi_send() looks good to me.
Can we please have something like the above in the change log so this is
documented for posterity?
Thanks
tglx