Re: [PATCH v8 15/19] locking/rwsem: Adaptive disabling of reader optimistic spinning

From: Linus Torvalds
Date: Wed Jun 05 2019 - 16:56:40 EST


On Wed, Jun 5, 2019 at 1:19 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> Urgh, that's another things that's been on the TODO list for a long long
> time, write code to verify the alignment of allocations :/ I'm
> suspecting quite a lot of that goes wrong all over the place.

On x86, we only guarantee 8-byte alignment from things like kmalloc(), iirc.

That ends up actually being a useful thing for small allocations,
which do happen.

On the whole, I would suggest against cmpxchg2 unless it's something
_really_ special. And would definitely strongly suggest against it for
something like a rwsem. Requiring 16-byte alignment just because your
data structure has a lock is nasty. Of course, we could probably
fairly easily change our kmalloc alignment rules to be "still just 8
bytes for small allocations, 16 bytes for anything that is >=64 bytes"
or whatever.

At least nobody is hopefully crazy enough to put one of those things
on the stack, where we *definitely* don't want to increase alignment
issues.

And before people say "surely small allocations aren't normal" - take
a look at slaballoc. Small allocations (<= 32 bytes) are actually not
all that uncommon, and you want them dense in the cache and dense in
memory to not waste either. arm64 has some insane alignment issues
(128 byte alignment due to DMA coherency issues, iirc), and it hurts
them badly.

Right now my machine has 400k 8-byte allocations, if I read things right.

You also find some core slab caches that are small and that don't need
16-byte alignment. A quick script finds things like
ext4_extent_status, which is 40 bytes, not horribly uncommon (I've
apparently got 250k of those things on my system), and currently fits
102 entries per page *because* it's not excessively aligned. Or
Acpi-Parse, which I apparently have 350k of, and is 56 bytes, and fits
73 per page exactly because it only needs 8-byte alignment (but
admittedly a 16-byte alignment would waste some memory, but guarantee
it doesn't cross a cacheline, so _maybe_ it would be ok).

16-byte alignment really isn't a good idea when you have data sizes
that are clearly smaller than even a cacheline.

So I *really* don't want to force excessive alignment. We'd have to
add some special static tooling to say "this kmalloc is assigned to a
pointer which requires 16-byte alignment" and make it use a separate
slab cache with that explicit alignment for that.

Linus