Re: [PATCH] RISC-V: hwprobe: Add MISALIGNED_PERF key

From: Palmer Dabbelt
Date: Mon Jun 03 2024 - 13:57:37 EST


On Wed, 29 May 2024 20:36:45 PDT (-0700), cyy@xxxxxxxxxxxx wrote:
On 2024/5/30 02:26, Evan Green wrote:
RISCV_HWPROBE_KEY_CPUPERF_0 was mistakenly flagged as a bitmask in
hwprobe_key_is_bitmask(), when in reality it was an enum value. This
causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS,
since SLOW, FAST, and EMULATED have values whose bits overlap with
each other. If the caller asked for the set of CPUs that was SLOW or
EMULATED, the returned set would also include CPUs that were FAST.

Introduce a new hwprobe key, RISCV_HWPROBE_KEY_MISALIGNED_PERF, which
returns the same values in response to a direct query (with no flags),
but is properly handled as an enumerated value. As a result, SLOW,
FAST, and EMULATED are all correctly treated as distinct values under
the new key when queried with the WHICH_CPUS flag.

Leave the old key in place to avoid disturbing applications which may
have already come to rely on the broken behavior.

Fixes: e178bf146e4b ("RISC-V: hwprobe: Introduce which-cpus flag")
Signed-off-by: Evan Green <evan@xxxxxxxxxxxx>

---


Note: Yangyu also has a fix out for this issue at [1]. That fix is much
tidier, but comes with the slight risk that some very broken userspace
application may break now that FAST cpus are not included for the query
of which cpus are SLOW or EMULATED.

Indeed. The value of FAST is 0b11, while SLOW and EMULATED are 0b10 and
0b01 respectively, so FAST is a bitwise superset of both.

When this key is treated as a bitmask and queried with
RISCV_HWPROBE_WHICH_CPUS, a CPU remains in the CPU mask if its value is a
superset of the requested bits for that key. Otherwise, the CPU is cleared
from the CPU mask. But when a key is treated as a value, we just do a
comparison: if the values are not equal, the CPU is cleared from the CPU
mask. That's why FAST cpus are currently included when querying for SLOW or
EMULATED with the RISCV_HWPROBE_KEY_CPUPERF_0 key.
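
To make the two matching rules described above concrete, here is a minimal
userspace sketch. It is not the kernel code: the helper names
(match_as_bitmask, match_as_value, which_cpus) and the unprefixed macro names
are made up for illustration, and a CPU set is simplified to bits in an
unsigned int. Only the numeric values are taken from the uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as RISCV_HWPROBE_MISALIGNED_* in uapi/asm/hwprobe.h. */
#define MISALIGNED_EMULATED 1ULL /* 0b01 */
#define MISALIGNED_SLOW     2ULL /* 0b10 */
#define MISALIGNED_FAST     3ULL /* 0b11 */

/* Bitmask rule: keep the CPU if its value is a superset of the request. */
static bool match_as_bitmask(uint64_t cpu_value, uint64_t requested)
{
	return (cpu_value & requested) == requested;
}

/* Value rule: keep the CPU only if its value equals the request. */
static bool match_as_value(uint64_t cpu_value, uint64_t requested)
{
	return cpu_value == requested;
}

/*
 * Hypothetical stand-in for a WHICH_CPUS query: bit i of the result is
 * set if CPU i matches the requested value under the chosen rule.
 */
static unsigned int which_cpus(const uint64_t *cpu_values, size_t ncpus,
			       uint64_t requested, bool key_is_bitmask)
{
	unsigned int mask = 0;

	for (size_t i = 0; i < ncpus; i++) {
		bool keep = key_is_bitmask ?
			match_as_bitmask(cpu_values[i], requested) :
			match_as_value(cpu_values[i], requested);

		if (keep)
			mask |= 1u << i;
	}

	return mask;
}
```

With three CPUs reporting EMULATED, SLOW and FAST in that order, asking for
SLOW under the bitmask rule selects CPUs 1 and 2 (the FAST CPU is wrongly
included), while the value rule selects only CPU 1.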

For me, deprecating the original hwprobe key and introducing a new key
would be a better solution than changing the behavior as my patch did.

OK. I don't have a strong feeling either way: if someone has code that tries to read this as a bitmask then it'd be broken, but it would technically be following the docs.

That said, we're relying on this as a pretty core userspace portability construct. So maybe the right answer here is to just be really strict about compatibility and eat the pain when we make a mistake, just to make sure we set the right example about not breaking stuff.

So unless anyone's opposed, I'll pick this up for 6.11.

I wanted to get this fix out so that
we have both as options, and can discuss. These fixes are mutually
exclusive; don't take both.

It would be better to note this strange behavior in
Documentation/arch/riscv/hwprobe.rst so users can quickly understand the
differences in the behavior of these two keys.

The C code part looks good to me.


[1] https://lore.kernel.org/linux-riscv/tencent_01F8E0050FB4B11CC170C3639E43F41A1709@xxxxxx/

---
Documentation/arch/riscv/hwprobe.rst | 8 ++++++--
arch/riscv/include/asm/hwprobe.h | 2 +-
arch/riscv/include/uapi/asm/hwprobe.h | 1 +
arch/riscv/kernel/sys_hwprobe.c | 1 +
4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst
index 204cd4433af5..616ee372adaf 100644
--- a/Documentation/arch/riscv/hwprobe.rst
+++ b/Documentation/arch/riscv/hwprobe.rst
@@ -192,8 +192,12 @@ The following keys are defined:
supported as defined in the RISC-V ISA manual starting from commit
d8ab5c78c207 ("Zihintpause is ratified").

-* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: A bitmask that contains performance
- information about the selected set of processors.
+* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to
+ :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_PERF`, but the key was mistakenly
+ classified as a bitmask rather than a value.
+
+* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_PERF`: An enum value describing the
+ performance of misaligned scalar accesses on the selected set of processors.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_UNKNOWN`: The performance of misaligned
accesses is unknown.
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 630507dff5ea..150a9877b0af 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -8,7 +8,7 @@

#include <uapi/asm/hwprobe.h>

-#define RISCV_HWPROBE_MAX_KEY 6
+#define RISCV_HWPROBE_MAX_KEY 7

static inline bool riscv_hwprobe_key_is_valid(__s64 key)
{
diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
index dda76a05420b..bc34e33fef23 100644
--- a/arch/riscv/include/uapi/asm/hwprobe.h
+++ b/arch/riscv/include/uapi/asm/hwprobe.h
@@ -68,6 +68,7 @@ struct riscv_hwprobe {
#define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED (4 << 0)
#define RISCV_HWPROBE_MISALIGNED_MASK (7 << 0)
#define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE 6
+#define RISCV_HWPROBE_KEY_MISALIGNED_PERF 7
/* Increase RISCV_HWPROBE_MAX_KEY when adding items. */

/* Flags */
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 969ef3d59dbe..c8b7d57eb55e 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -208,6 +208,7 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair,
break;

case RISCV_HWPROBE_KEY_CPUPERF_0:
+ case RISCV_HWPROBE_KEY_MISALIGNED_PERF:
pair->value = hwprobe_misaligned(cpus);
break;