Re: [PATCH 1/2] asm-generic/io: Pass result on inX() accessor to __io_par()

From: Palmer Dabbelt
Date: Wed Feb 13 2019 - 16:57:56 EST

On Wed, 13 Feb 2019 12:59:28 PST (-0800), Arnd Bergmann wrote:
On Wed, Feb 13, 2019 at 6:46 PM Will Deacon <will.deacon@xxxxxxx> wrote:

On Tue, Feb 12, 2019 at 12:55:17PM +0100, Arnd Bergmann wrote:

> For all I can see, this should not conflict with the usage of the
> same macros on RISC-V, though it does make add a significant
> difference, so I'd like to see an Ack from the RISC-V folks as
> well (added to Cc), or possibly a change to arch/riscv/include/asm/io.h
> to do a corresponding change.

Thanks, the original patches didn't make it through my filters.

There's already a comment in that header which says that the accesses are
ordered wrt timer reads, so I don't think anything needs to change there.
For consistency with the macro arguments, I could augment their __io_par to
take the read value as an unused argument, if that's what you mean?

FWIW, we don't really have a way to mandate this in the ISA yet as there's no formal model for either CSR orderings or the IO memory space.
Yes, that's what I meant, I should have been clearer there.

That sounds reasonable to me. It looks like we can also go ahead and delete a bunch of arch/riscv/include/asm/io.h now that this stuff is in asm-generic, which would cause us to actually start using these things. I didn't know this had all been moved to asm-generic otherwise I would have cleaned this up earlier.

I think this should do it, but this does bring up a bit of an issue: the RISC-V versions of reads and friends put barriers outside the loop, while the asm-generic version don't. What are these actually supposed to do?

Either way that resolves, feel free to consider something like

diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
index b269451e7e85..378975f180a7 100644
--- a/arch/riscv/include/asm/io.h
+++ b/arch/riscv/include/asm/io.h
@@ -198,20 +198,20 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
* writes.
#define __io_pbr() __asm__ __volatile__ ("fence io,i" : : : "memory");
-#define __io_par() __asm__ __volatile__ ("fence i,ior" : : : "memory");
+#define __io_par(v) __asm__ __volatile__ ("fence i,ior" : : : "memory");
#define __io_pbw() __asm__ __volatile__ ("fence iow,o" : : : "memory");
#define __io_paw() __asm__ __volatile__ ("fence o,io" : : : "memory");

-#define inb(c) ({ u8 __v; __io_pbr(); __v = readb_cpu((void*)(PCI_IOBASE + (c))); __io_par(); __v; })
-#define inw(c) ({ u16 __v; __io_pbr(); __v = readw_cpu((void*)(PCI_IOBASE + (c))); __io_par(); __v; })
-#define inl(c) ({ u32 __v; __io_pbr(); __v = readl_cpu((void*)(PCI_IOBASE + (c))); __io_par(); __v; })
+#define inb(c) ({ u8 __v; __io_pbr(); __v = readb_cpu((void*)(PCI_IOBASE + (c))); __io_par(__v); __v; })
+#define inw(c) ({ u16 __v; __io_pbr(); __v = readw_cpu((void*)(PCI_IOBASE + (c))); __io_par(__v); __v; })
+#define inl(c) ({ u32 __v; __io_pbr(); __v = readl_cpu((void*)(PCI_IOBASE + (c))); __io_par(__v); __v; })

#define outb(v,c) ({ __io_pbw(); writeb_cpu((v),(void*)(PCI_IOBASE + (c))); __io_paw(); })
#define outw(v,c) ({ __io_pbw(); writew_cpu((v),(void*)(PCI_IOBASE + (c))); __io_paw(); })
#define outl(v,c) ({ __io_pbw(); writel_cpu((v),(void*)(PCI_IOBASE + (c))); __io_paw(); })

#ifdef CONFIG_64BIT
-#define inq(c) ({ u64 __v; __io_pbr(); __v = readq_cpu((void*)(c)); __io_par(); __v; })
+#define inq(c) ({ u64 __v; __io_pbr(); __v = readq_cpu((void*)(c)); __io_par(__v); __v; })
#define outq(v,c) ({ __io_pbw(); writeq_cpu((v),(void*)(c)); __io_paw(); })

@@ -261,9 +261,9 @@ __io_reads_ins(reads, u32, l, __io_br(), __io_ar())
#define readsw(addr, buffer, count) __readsw(addr, buffer, count)
#define readsl(addr, buffer, count) __readsl(addr, buffer, count)

-__io_reads_ins(ins, u8, b, __io_pbr(), __io_par())
-__io_reads_ins(ins, u16, w, __io_pbr(), __io_par())
-__io_reads_ins(ins, u32, l, __io_pbr(), __io_par())
+__io_reads_ins(ins, u8, b, __io_pbr(), __io_par(addr))
+__io_reads_ins(ins, u16, w, __io_pbr(), __io_par(addr))
+__io_reads_ins(ins, u32, l, __io_pbr(), __io_par(addr))
#define insb(addr, buffer, count) __insb((void __iomem *)(long)addr, buffer, count)
#define insw(addr, buffer, count) __insw((void __iomem *)(long)addr, buffer, count)
#define insl(addr, buffer, count) __insl((void __iomem *)(long)addr, buffer, count)


Revewied-by: Palmer Dabbelt <palmer@xxxxxxxxxx>

when included along with the other diff. That way we can at least keep the macro signatures matching, the cleanup can come later...