Re: [PATCH 0/5] Generic Ticket Spinlocks

From: Palmer Dabbelt
Date: Tue Mar 22 2022 - 16:03:06 EST


On Tue, 22 Mar 2022 11:18:18 PDT (-0700), mail@xxxxxxxxxxx wrote:
On 16/03/2022 23:25, Palmer Dabbelt wrote:
Peter sent an RFC out about a year ago
<https://lore.kernel.org/lkml/YHbBBuVFNnI4kjj3@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/>,
but after a spirited discussion it looks like we lost track of things.
IIRC there was broad consensus on this being the way to go, but there
was a lot of discussion so I wasn't sure. Given that it's been a year,
I figured it'd be best to just send this out again formatted a bit more
explicitly as a patch.

This has had almost no testing (just a build test on RISC-V defconfig),
but I wanted to send it out largely as-is because I didn't have a SOB
from Peter on the code. I had sent around something sort of similar in
spirit, but this looks completely re-written. Just to play it safe I
wanted to send out almost exactly as it was posted. I'd probably rename
this tspinlock and tspinlock_types, as the mis-match kind of makes my
eyes go funny, but I don't really care that much. I'll also go through
the other ports and see if there's any more candidates, I seem to
remember there having been more than just OpenRISC but it's been a
while.

I'm in no big rush for this and given the complex HW dependencies I
think it's best to target it for 5.19, that'd give us a full merge
window for folks to test/benchmark it on their systems to make sure it's
OK.

Is there a specific way you have been testing/benching things, or is it
just a case of test what we ourselves care about?

I do a bunch of functional testing in QEMU (it's all in my riscv-systems-ci repo, but that's not really fit for human consumption so I don't tell folks to use it). That's pretty much useless for something like this: sure it'd find something just straight-up broken in the lock implementation, but the stuff I'm really worried about here would be poor interactions with hardware that wasn't designed/tested against this flavor of locks.

I don't currently do any regular testing on HW, but there's a handful of folks who do. If you've got HW you care about then the best bet is to give this a shot on it. There's already been some boot test reports, so it's at least mostly there (on RISC-V, last I saw it was breaking OpenRISC so there's probably some lurking issue somewhere). I was hoping we'd get enough coverage that way to have confidence in this, but if not then I've got a bunch of RISC-V hardware lying around that I can spin up to fill the gaps.

As far as what workloads, I really don't know here. At least on RISC-V, I think any lock microbenchmarks would be essentially meaningless: this is fair, so even if lock/unlock is a bit slower that's probably a win for real workloads. That said, I'm not sure any of the existing hardware runs any workloads that I'm personally interested in so unless this is some massive hit to just general system responsiveness or make/GCC then I'm probably not going to find anything.

Happy to hear if anyone has ideas, though.


Thanks,
Conor.

RISC-V has a forward progress guarantee so we should be safe, but
these can always trip things up.