Re: [PATCH 3/7] soc: fsl: qe: avoid ppc-specific io accessors

From: Rasmus Villemoes
Date: Wed Oct 23 2019 - 03:08:35 EST


On 22/10/2019 17.01, Christophe Leroy wrote:
>
>
> On 10/18/2019 12:52 PM, Rasmus Villemoes wrote:
>> In preparation for allowing to build QE support for architectures
>> other than PPC, replace the ppc-specific io accessors. Done via
>>
>
> This patch is not transparent in terms of performance, functions get
> changed significantly.
>
> Before the patch:
>
> 00000330 <ucc_fast_enable>:
> Â330:ÂÂÂ 81 43 00 04ÂÂÂÂ lwzÂÂÂÂ r10,4(r3)
> Â334:ÂÂÂ 7c 00 04 acÂÂÂÂ hwsync
> Â338:ÂÂÂ 81 2a 00 00ÂÂÂÂ lwzÂÂÂÂ r9,0(r10)
> Â33c:ÂÂÂ 0c 09 00 00ÂÂÂÂ twiÂÂÂÂ 0,r9,0
> Â340:ÂÂÂ 4c 00 01 2cÂÂÂÂ isync
> Â344:ÂÂÂ 70 88 00 02ÂÂÂÂ andi.ÂÂ r8,r4,2
> Â348:ÂÂÂ 41 82 00 10ÂÂÂÂ beqÂÂÂÂ 358 <ucc_fast_enable+0x28>
> Â34c:ÂÂÂ 39 00 00 01ÂÂÂÂ liÂÂÂÂÂ r8,1
> Â350:ÂÂÂ 91 03 00 10ÂÂÂÂ stwÂÂÂÂ r8,16(r3)
> Â354:ÂÂÂ 61 29 00 10ÂÂÂÂ oriÂÂÂÂ r9,r9,16
> Â358:ÂÂÂ 70 88 00 01ÂÂÂÂ andi.ÂÂ r8,r4,1
> Â35c:ÂÂÂ 41 82 00 10ÂÂÂÂ beqÂÂÂÂ 36c <ucc_fast_enable+0x3c>
> Â360:ÂÂÂ 39 00 00 01ÂÂÂÂ liÂÂÂÂÂ r8,1
> Â364:ÂÂÂ 91 03 00 14ÂÂÂÂ stwÂÂÂÂ r8,20(r3)
> Â368:ÂÂÂ 61 29 00 20ÂÂÂÂ oriÂÂÂÂ r9,r9,32
> Â36c:ÂÂÂ 7c 00 04 acÂÂÂÂ hwsync
> Â370:ÂÂÂ 91 2a 00 00ÂÂÂÂ stwÂÂÂÂ r9,0(r10)
> Â374:ÂÂÂ 4e 80 00 20ÂÂÂÂ blr
>
> After the patch:
>
> 0000030c <ucc_fast_enable>:
> Â30c:ÂÂÂ 94 21 ff e0ÂÂÂÂ stwuÂÂÂ r1,-32(r1)
> Â310:ÂÂÂ 7c 08 02 a6ÂÂÂÂ mflrÂÂÂ r0
> Â314:ÂÂÂ bf a1 00 14ÂÂÂÂ stmwÂÂÂ r29,20(r1)
> Â318:ÂÂÂ 7c 9f 23 78ÂÂÂÂ mrÂÂÂÂÂ r31,r4
> Â31c:ÂÂÂ 90 01 00 24ÂÂÂÂ stwÂÂÂÂ r0,36(r1)
> Â320:ÂÂÂ 7c 7e 1b 78ÂÂÂÂ mrÂÂÂÂÂ r30,r3
> Â324:ÂÂÂ 83 a3 00 04ÂÂÂÂ lwzÂÂÂÂ r29,4(r3)
> Â328:ÂÂÂ 7f a3 eb 78ÂÂÂÂ mrÂÂÂÂÂ r3,r29
> Â32c:ÂÂÂ 48 00 00 01ÂÂÂÂ blÂÂÂÂÂ 32c <ucc_fast_enable+0x20>
> ÂÂÂÂÂÂÂÂÂÂÂ 32c: R_PPC_REL24ÂÂÂ ioread32be
> Â330:ÂÂÂ 73 e9 00 02ÂÂÂÂ andi.ÂÂ r9,r31,2
> Â334:ÂÂÂ 41 82 00 10ÂÂÂÂ beqÂÂÂÂ 344 <ucc_fast_enable+0x38>
> Â338:ÂÂÂ 39 20 00 01ÂÂÂÂ liÂÂÂÂÂ r9,1
> Â33c:ÂÂÂ 91 3e 00 10ÂÂÂÂ stwÂÂÂÂ r9,16(r30)
> Â340:ÂÂÂ 60 63 00 10ÂÂÂÂ oriÂÂÂÂ r3,r3,16
> Â344:ÂÂÂ 73 e9 00 01ÂÂÂÂ andi.ÂÂ r9,r31,1
> Â348:ÂÂÂ 41 82 00 10ÂÂÂÂ beqÂÂÂÂ 358 <ucc_fast_enable+0x4c>
> Â34c:ÂÂÂ 39 20 00 01ÂÂÂÂ liÂÂÂÂÂ r9,1
> Â350:ÂÂÂ 91 3e 00 14ÂÂÂÂ stwÂÂÂÂ r9,20(r30)
> Â354:ÂÂÂ 60 63 00 20ÂÂÂÂ oriÂÂÂÂ r3,r3,32
> Â358:ÂÂÂ 80 01 00 24ÂÂÂÂ lwzÂÂÂÂ r0,36(r1)
> Â35c:ÂÂÂ 7f a4 eb 78ÂÂÂÂ mrÂÂÂÂÂ r4,r29
> Â360:ÂÂÂ bb a1 00 14ÂÂÂÂ lmwÂÂÂÂ r29,20(r1)
> Â364:ÂÂÂ 7c 08 03 a6ÂÂÂÂ mtlrÂÂÂ r0
> Â368:ÂÂÂ 38 21 00 20ÂÂÂÂ addiÂÂÂ r1,r1,32
> Â36c:ÂÂÂ 48 00 00 00ÂÂÂÂ bÂÂÂÂÂÂ 36c <ucc_fast_enable+0x60>
> ÂÂÂÂÂÂÂÂÂÂÂ 36c: R_PPC_REL24ÂÂÂ iowrite32be

True. Do you know why powerpc uses out-of-line versions of these
accessors when !PPC_INDIRECT_PIO, i.e. at least all of PPC32? It's quite
a bit beyond the scope of this series, but I'd expect moving most if not
all of arch/powerpc/kernel/iomap.c into asm/io.h (guarded by
!defined(CONFIG_PPC_INDIRECT_PIO) of course) as static inlines would
benefit all ppc32 users of iowrite32 and friends.

Is there some other primitive available that (a) is defined on all
architectures (or at least both ppc and arm) and (b) expands to good
code in both/all cases?

Note that a few uses of the the iowrite32be accessors has already
appeared in the qe code with the introduction of the qe_clrsetbits()
helpers in bb8b2062af.

Rasmus