Sorry, found it in my inbox while clearing out backlog..
Hit this issue when testing my perf_arch_regs patchset.
Yep exactly
On Sun, Jul 03, 2016 at 11:31:58PM +0530, Madhavan Srinivasan wrote:
When decoding the perf_regs mask in perf_output_sample_regs(),
we loop through the mask using find_first_bit and find_next_bit functions.
While the existing code works fine in most cases,
the logic is broken for 32bit kernel (Big Endian).
When reading u64 mask using (u32 *)(&val)[0], find_*_bit() assumes it gets
lower 32bits of u64 but instead gets upper 32bits which is wrong.
Proposed fix is to swap the words of the u64 to handle this case.
This is _not_ endianness swap.

But it looks an awful lot like it..
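
For anyone who wants to see the word-order problem without a 32bit BE box, here is a rough userspace sketch. Everything in it is an assumption made for illustration: be32_view_of_u64(), bitmap32_from_u64(), first_set_bit() and check_word_order() are made-up names, the big-endian layout is simulated by hand, and first_set_bit() is only a simplified stand-in for find_first_bit(), not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: store a u64 the way a big-endian CPU would (most
 * significant byte first), then read that memory back as two 32-bit words,
 * which is what a cast to (unsigned long *) sees on a 32bit BE kernel.
 */
static void be32_view_of_u64(uint64_t mask, uint32_t words[2])
{
	unsigned char mem[8];
	int i;

	for (i = 0; i < 8; i++)
		mem[i] = (unsigned char)(mask >> (56 - 8 * i));

	for (i = 0; i < 2; i++)
		words[i] = ((uint32_t)mem[4 * i] << 24) |
			   ((uint32_t)mem[4 * i + 1] << 16) |
			   ((uint32_t)mem[4 * i + 2] << 8) |
			   (uint32_t)mem[4 * i + 3];
}

/* Same idea as the proposed bitmap_from_u64(), fixed to 32-bit words. */
static void bitmap32_from_u64(uint32_t dst[2], uint64_t mask)
{
	dst[0] = (uint32_t)(mask & 0xffffffffu);	/* low half first */
	dst[1] = (uint32_t)(mask >> 32);		/* then high half */
}

/* Simplified stand-in for find_first_bit(): bit n lives in words[n / 32]. */
static int first_set_bit(const uint32_t words[2])
{
	int bit;

	for (bit = 0; bit < 64; bit++)
		if (words[bit / 32] & (UINT32_C(1) << (bit % 32)))
			return bit;
	return 64;
}

/* Register 3 requested in the mask; compare the cast view with the copy. */
static void check_word_order(void)
{
	uint32_t words[2];

	be32_view_of_u64(UINT64_C(1) << 3, words);
	assert(first_set_bit(words) == 35);	/* cast view: off by 32 */

	bitmap32_from_u64(words, UINT64_C(1) << 3);
	assert(first_set_bit(words) == 3);	/* explicit copy: correct */
}
```

Calling check_word_order() passes on any host, since the layout is simulated rather than taken from the machine. The bytes inside each 32-bit word are never reversed, only the order of the two words differs, which is why this is a word swap and not an endianness swap.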
Looks small enough for an inline.
+++ b/kernel/events/core.c
@@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
struct pt_regs *regs, u64 mask)
{
int bit;
+ DECLARE_BITMAP(_mask, 64);
- for_each_set_bit(bit, (const unsigned long *) &mask,
+ bitmap_from_u64(_mask, mask);
+ for_each_set_bit(bit, _mask,
sizeof(mask) * BITS_PER_BYTE) {
u64 val;
+++ b/lib/bitmap.c
+void bitmap_from_u64(unsigned long *dst, u64 mask)
+{
+ dst[0] = mask & ULONG_MAX;
+
+ if (sizeof(mask) > sizeof(unsigned long))
+ dst[1] = mask >> 32;
+}
+EXPORT_SYMBOL(bitmap_from_u64);
Alternatively you can go all the way and add bitmap_from_u64array(), but
that seems massive overkill.
Tedious stuff.. I can't come up with anything prettier :/
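
For reference, a rough userspace sketch of what the array variant mentioned above might look like. This is only a guess at the shape: the sketch_ names, the signature and the test-bit helper are hypothetical, and it assumes unsigned long is either 32 or 64 bits wide.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical generalisation of bitmap_from_u64(): copy nr u64 values into
 * an unsigned long bitmap, low word first. dst must have room for nr * 64
 * bits. Assumes unsigned long is 32 or 64 bits wide.
 */
static void sketch_bitmap_from_u64array(unsigned long *dst,
					const uint64_t *src, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (sizeof(unsigned long) == sizeof(uint64_t)) {
			dst[i] = (unsigned long)src[i];
		} else {
			dst[2 * i] = (unsigned long)(src[i] & ULONG_MAX);
			dst[2 * i + 1] = (unsigned long)(src[i] >> 32);
		}
	}
}

/* Simplified stand-in for test_bit() on an unsigned long bitmap. */
static int sketch_test_bit(const unsigned long *map, size_t bit)
{
	return (int)((map[bit / SKETCH_BITS_PER_LONG] >>
		      (bit % SKETCH_BITS_PER_LONG)) & 1UL);
}
```

With src = { 1ULL << 3, 1ULL << 40 } and a zeroed unsigned long dst[4], bits 3 and 104 end up set regardless of the word size. For the single u64 mask in perf_output_sample_regs() this buys nothing over the two-assignment helper.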