Re: [PATCH v2 5/6] RISC-V: hwprobe: Support probing of misaligned access performance
From: Evan Green
Date: Fri Feb 17 2023 - 19:16:18 EST
On Wed, Feb 15, 2023 at 1:57 PM Conor Dooley <conor@xxxxxxxxxx> wrote:
>
> On Mon, Feb 06, 2023 at 12:14:54PM -0800, Evan Green wrote:
> > This allows userspace to select various routines to use based on the
> > performance of misaligned access on the target hardware.
> >
> > Co-developed-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> > Signed-off-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> > Signed-off-by: Evan Green <evan@xxxxxxxxxxxx>
> >
> > ---
> >
> > Changes in v2:
> > - Fixed logic error in if(of_property_read_string...) that caused crash
> > - Include cpufeature.h in cpufeature.h to avoid undeclared variable
> > warning.
> > - Added a _MASK define
> > - Fix random checkpatch complaints
> >
> > Documentation/riscv/hwprobe.rst | 13 +++++++++++
> > arch/riscv/include/asm/cpufeature.h | 2 ++
> > arch/riscv/include/asm/hwprobe.h | 2 +-
> > arch/riscv/include/asm/smp.h | 9 ++++++++
> > arch/riscv/include/uapi/asm/hwprobe.h | 6 ++++++
> > arch/riscv/kernel/cpufeature.c | 31 +++++++++++++++++++++++++--
> > arch/riscv/kernel/sys_riscv.c | 23 ++++++++++++++++++++
> > 7 files changed, 83 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/riscv/hwprobe.rst b/Documentation/riscv/hwprobe.rst
> > index ce186967861f..0dc75e83e127 100644
> > --- a/Documentation/riscv/hwprobe.rst
> > +++ b/Documentation/riscv/hwprobe.rst
> > @@ -51,3 +51,16 @@ The following keys are defined:
> > not minNum/maxNum") of the RISC-V ISA manual.
> > * :RISCV_HWPROBE_IMA_C:: The C extension is supported, as defined by
> > version 2.2 of the RISC-V ISA manual.
> > +* :RISCV_HWPROBE_KEY_PERF_0:: A bitmask that contains performance information
>
> This doesn't match what's defined?
>
> > + about the selected set of processors.
> > + * :RISCV_HWPROBE_MISALIGNED_UNKNOWN:: The performance of misaligned
> > + accesses is unknown.
> > + * :RISCV_HWPROBE_MISALIGNED_EMULATED:: Misaligned accesses are emulated via
> > + software, either in or below the kernel. These accesses are always
> > + extremely slow.
> > + * :RISCV_HWPROBE_MISALIGNED_SLOW:: Misaligned accesses are supported in
> > + hardware, but are slower than the cooresponding aligned accesses
> > + sequences.
> > + * :RISCV_HWPROBE_MISALIGNED_FAST:: Misaligned accesses are supported in
> > + hardware and are faster than the cooresponding aligned accesses
> > + sequences.
>
> > diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> > index 3831b638ecab..6c1759091e44 100644
> > --- a/arch/riscv/include/asm/smp.h
> > +++ b/arch/riscv/include/asm/smp.h
> > @@ -26,6 +26,15 @@ struct riscv_ipi_ops {
> > */
> > extern unsigned long __cpuid_to_hartid_map[NR_CPUS];
> > #define cpuid_to_hartid_map(cpu) __cpuid_to_hartid_map[cpu]
> > +static inline long hartid_to_cpuid_map(unsigned long hartid)
> > +{
> > + long i;
> > +
> > + for (i = 0; i < NR_CPUS; ++i)
>
> I'm never (or not yet?) sure about these things.
> Should this be for_each_possible_cpu()?
Me neither. I believe it's the same, as for_each_possible_cpu()
iterates over a CPU mask of all 1s, and the size of struct cpumask is
set by NR_CPUS. Some architectures appear to have an
init_cpu_possible() function to further restrict the set, though riscv
does not. It's probably better to use for_each_possible_cpu() though
in case a call to init_cpu_possible() ever does get added.
>
> > + if (cpuid_to_hartid_map(i) == hartid)
> > + return i;
> > + return -1;
> > +}
> >
> > /* print IPI stats */
> > void show_ipi_stats(struct seq_file *p, int prec);
> > diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
> > index ce39d6e74103..5d55e2da2b1f 100644
> > --- a/arch/riscv/include/uapi/asm/hwprobe.h
> > +++ b/arch/riscv/include/uapi/asm/hwprobe.h
> > @@ -25,5 +25,11 @@ struct riscv_hwprobe {
> > #define RISCV_HWPROBE_KEY_IMA_EXT_0 4
> > #define RISCV_HWPROBE_IMA_FD (1 << 0)
> > #define RISCV_HWPROBE_IMA_C (1 << 1)
> > +#define RISCV_HWPROBE_KEY_CPUPERF_0 5
> > +#define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0)
> > +#define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0)
> > +#define RISCV_HWPROBE_MISALIGNED_SLOW (2 << 0)
> > +#define RISCV_HWPROBE_MISALIGNED_FAST (3 << 0)
> > +#define RISCV_HWPROBE_MISALIGNED_MASK (3 << 0)
>
> Why is it UNKNOWN rather than UNSUPPORTED?
> I thought I saw Palmer saying that there is no requirement to support
> misaligned accesses any more.
> Plenty of old DTs are going to lack this property so would be UNKNOWN,
> and I *assume* that the user of the syscall is gonna conflate the two,
> but the rationale interests me.
Palmer had mentioned on the DT bindings patch that historically it was
required but emulated. So because old binaries assumed it was there,
the default values for DTs without this needs to imply "supported, but
no idea how fast it is".
But you bring up an interesting point: should the bindings and these
defines have a value that indicates no support at all for unaligned
accesses? We could always add the value to the bindings later, but
maybe we should leave space in this field now.
-Evan