RE: [Patch v11 4/5] x86/cpufeature: enable RING3MWAIT for Knights Landing

From: Andrejczuk, Grzegorz
Date: Thu Dec 22 2016 - 05:19:57 EST


>
> Changelogs should not describe WHAT the patch is doing. We can see that from the patch. Changelogs should describe the WHY and CONCEPTS not implementation details.

So enable for Ring 3 MWAIT for Knights Landing + explanation of model comparison and potential issues below. Should be Ok.

>From your cover letter:
>
> "Removed warning from 32-bit build"
>
>First of all, the warning
>
> arch/x86/include/asm/bitops.h:72:1: note: expected 'volatile long unsigned int *'
>but argument is of type 'unsigned int *'
> set_bit(long nr, volatile unsigned long *addr)
>
>is not at all 32bit specific.
>
>Handing an unsigned int pointer to a function which expects a unsigned long is even more wrong on 64bit.
>
>So now for your 'removal fix': It's just as sloppy as anything else what I've seen from you before.
>
>Handing a typecasted unsigned int pointer to a function which expects an unsigned long pointer is just broken and a clear sign of careless tinkering.

I thought this to be 32 issue because it popped up in 32 build. The reason for this is probably that sizeof(int) is equal to sizeof(long) on x64.
I used the cast following set_cpu_cap define which does exactly the same thing with u32* type.
Is using unsigned long would be OK.

>The only reason why this 'works' is because x86 is a little endian architecture and the bit number is a constant and set_bit gets translated it into:
>
> orb 0x02, 0x0(%rip)
>
>Now if you look really close to that disassembly then you might notice, that this sets bit 1 and not as you tell in patch 2/5:
>
> "Introduce ELF_HWCAP2 variable for x86 and reserve its bit 0 to expose
> the ring 3 MONITOR/MWAIT."
>
> So why does it not set bit 0?
>
> Simply because you hand in HWCAP2_RING3MWAIT as bit number, which is defined as:
>
>+#define HWCAP2_RING3MWAIT (1 << 0)
>
>Crap, crap, crap.
>
> What's so !$@&*(? wrong with doing the simple, obvious and correct:
>
> ELF_HWCAP2 |= HWCAP2_RING3MWAIT;
>
> C is really hard, right?

I used set_bit because I wanted to be sure that this operation to be done atomically. There might be data race when multiple values of ELF_HWCAP2 will be set by multiple threads.
Thanks for the bit 1 issue I missed that. I can define HWCAP_RING3MWAIT_BIT 0 and set it by set_bit?
I could use OR operator as there are no other HWCAP2 values.
I would choose option 1 but as you have seen I have some tendency to write sloppy code and not respond to emails.
But I am willing to change.

Best Regards,
Grzegorz