On Wed, Oct 04, 2023 at 12:14:10AM +1000, Greg Ungerer wrote:
On 3/10/23 06:07, Matthew Wilcox wrote:
00000918 <folio_unlock>:
918: 206f 0004 moveal %sp@(4),%a0
91c: 7001 moveq #1,%d0
91e: b190 eorl %d0,%a0@
920: 2010 movel %a0@,%d0
922: 4a00 tstb %d0
924: 6a0a bpls 930 <folio_unlock+0x18>
926: 42a7 clrl %sp@-
928: 2f08 movel %a0,%sp@-
92a: 4eba fafa jsr %pc@(426 <folio_wake_bit>)
92e: 508f addql #8,%sp
930: 4e75 rts
fwiw, here's what folio_unlock looks like today without any of my
patches:
00000746 <folio_unlock>:
746: 206f 0004 moveal %sp@(4),%a0
74a: 43e8 0003 lea %a0@(3),%a1
74e: 0891 0000 bclr #0,%a1@
752: 2010 movel %a0@,%d0
754: 4a00 tstb %d0
756: 6a0a bpls 762 <folio_unlock+0x1c>
758: 42a7 clrl %sp@-
75a: 2f08 movel %a0,%sp@-
75c: 4eba fcc8 jsr %pc@(426 <folio_wake_bit>)
760: 508f addql #8,%sp
762: 4e75 rts
Same number of instructions (11 each), but today's code has slightly
longer insns (30 bytes versus 26 with my patches), so I'm tempted to
take the win?
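For readers who don't speak m68k, here is a rough C sketch of what both
listings appear to do: flip bit 0 of the flags word, then test bit 7 (the
sign bit of the low byte, which is what tstb/bpls checks) to decide whether
to call folio_wake_bit. The names sketch_xor_unlock_has_waiters,
SKETCH_LOCKED and SKETCH_WAITERS are made up for illustration, and the bit
positions are simply read off the disassembly. It uses C11 atomics and tests
the value returned by the fetch-xor, whereas the listings reload the word
after modifying it, so treat it as an approximation rather than the code in
the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; bit positions are read off the listings above. */
#define SKETCH_LOCKED   (1UL << 0)   /* bit flipped by moveq #1 + eorl, or by bclr #0 */
#define SKETCH_WAITERS  (1UL << 7)   /* sign bit of the low byte, tested by tstb */

/*
 * Flip the lock bit and report whether the waiters bit was set.
 * Assumes the caller holds the lock, so the XOR clears bit 0.
 */
static bool sketch_xor_unlock_has_waiters(atomic_ulong *flags)
{
	assert(flags != NULL);

	/* One read-modify-write; returns the value from before the XOR. */
	unsigned long old = atomic_fetch_xor_explicit(flags, SKETCH_LOCKED,
						      memory_order_release);

	/* XOR with bit 0 leaves bit 7 untouched, so the old value is enough. */
	return (old & SKETCH_WAITERS) != 0;
}
```

A caller would invoke the wake-up path only when this returns true, which
corresponds to falling through the bpls branch in the listings.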
We could use eori instead of eorl, at least according to table 3-9 on
page 3-8:
EOR Dy,<ea>x L Source ^ Destination → Destination ISA_A
EORI #<data>,Dx L Immediate Data ^ Destination → Destination ISA_A
Oh. I misread. It only does EORI to a data register; it can't do EORI
to a memory operand.
400413e6 <folio_unlock>:
400413e6: 206f 0004 moveal %sp@(4),%a0
400413ea: 2010 movel %a0@,%d0
400413ec: 0a80 0000 0001 eoril #1,%d0
400413f2: 2080 movel %d0,%a0@
400413f4: 2010 movel %a0@,%d0
400413f6: 4a00 tstb %d0
400413f8: 6c0a bges 40041404 <folio_unlock+0x1e>
400413fa: 42a7 clrl %sp@-
400413fc: 2f08 movel %a0,%sp@-
400413fe: 4eba ff30 jsr %pc@(40041330 <folio_wake_bit>)
40041402: 508f addql #8,%sp
40041404: 4e75 rts
But that is still worse anyway.
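To put numbers on "worse", the byte size of each listing can be worked out
from its first address and the address of its final rts, which is two bytes
long (opcode 4e75). A small sketch, with sketch_func_size being a name
invented here:

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a function whose last instruction is a 2-byte rts. */
static uint32_t sketch_func_size(uint32_t start, uint32_t rts_addr)
{
	assert(rts_addr >= start);
	return rts_addr + 2u - start;
}
```

Feeding in the addresses from the three listings in this thread gives
sketch_func_size(0x918, 0x930) = 26 bytes for the eorl version,
sketch_func_size(0x746, 0x762) = 30 bytes for the bclr version, and
sketch_func_size(0x400413e6, 0x40041404) = 32 bytes for the eoril version,
which also needs one more instruction (12 instead of 11).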
Yup. Looks like the version I posted actually does the best! I'll
munge that into the patch series and repost. Thanks for your help!