Re: [PATCH v2] RISC-V: Check clint_time_val before use

From: Anup Patel
Date: Sun Sep 27 2020 - 01:38:37 EST


On Sun, Sep 27, 2020 at 5:50 AM Palmer Dabbelt <palmerdabbelt@xxxxxxxxxx> wrote:
>
> On Sat, 26 Sep 2020 03:31:29 PDT (-0700), Damien Le Moal wrote:
> > On Sat, 2020-09-26 at 15:51 +0530, Anup Patel wrote:
> >> The NoMMU kernel is broken for QEMU virt machine from Linux-5.9-rc6
> >> because the get_cycles() and friends are called very early from
> >> rand_initialize() before CLINT driver is probed. To fix this, we
> >> should check clint_time_val before use in get_cycles() and friends.
>
> I don't think this is the right way to solve that problem, as we're essentially
> just lying about the timer rather than informing the system we can't get
> timer-based entropy right now. MIPS is explicit about this, I don't see any
> reason why we shouldn't be as well.
>
> Does this fix the boot issue (see below for the NULL)? There's some other
> random-related arch functions so this might not be quite the right way to do
> it.
>
> diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
> index 7f659dda0032..7e39b0068932 100644
> --- a/arch/riscv/include/asm/timex.h
> +++ b/arch/riscv/include/asm/timex.h
> @@ -33,6 +33,18 @@ static inline u32 get_cycles_hi(void)
> #define get_cycles_hi get_cycles_hi
> #endif /* CONFIG_64BIT */
>
> +/*
> + * Much like MIPS, we may not have a viable counter to use at an early point in
> + * the boot process. Unfortunately we don't have a fallback, so instead we
> + * just return 0.
> + */
> +static inline unsigned long random_get_entropy(void)
> +{
> + if (unlikely(clint_time_val == NULL))
> + return 0;
> + return get_cycles();
> +}
> +
> #else /* CONFIG_RISCV_M_MODE */
>
> static inline cycles_t get_cycles(void)
>
> >> Fixes: d5be89a8d118 ("RISC-V: Resurrect the MMIO timer implementation
> >> for M-mode systems")
> >> Signed-off-by: Anup Patel <anup.patel@xxxxxxx>
> >> ---
> >> Changes since v1:
> >> - Explicitly initialize clint_time_val to NULL in CLINT driver to
> >> avoid hang on Kendryte K210
> >> ---
> >> arch/riscv/include/asm/timex.h | 12 +++++++++---
> >> drivers/clocksource/timer-clint.c | 2 +-
> >> 2 files changed, 10 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
> >> index 7f659dda0032..6e7b04874755 100644
> >> --- a/arch/riscv/include/asm/timex.h
> >> +++ b/arch/riscv/include/asm/timex.h
> >> @@ -17,18 +17,24 @@ typedef unsigned long cycles_t;
> >> #ifdef CONFIG_64BIT
> >> static inline cycles_t get_cycles(void)
> >> {
> >> - return readq_relaxed(clint_time_val);
> >> + if (clint_time_val)
> >> + return readq_relaxed(clint_time_val);
> >> + return 0;
> >> }
> >> #else /* !CONFIG_64BIT */
> >> static inline u32 get_cycles(void)
> >> {
> >> - return readl_relaxed(((u32 *)clint_time_val));
> >> + if (clint_time_val)
> >> + return readl_relaxed(((u32 *)clint_time_val));
> >> + return 0;
> >> }
> >> #define get_cycles get_cycles
> >>
> >> static inline u32 get_cycles_hi(void)
> >> {
> >> - return readl_relaxed(((u32 *)clint_time_val) + 1);
> >> + if (clint_time_val)
> >> + return readl_relaxed(((u32 *)clint_time_val) + 1);
> >> + return 0;
> >> }
> >> #define get_cycles_hi get_cycles_hi
> >> #endif /* CONFIG_64BIT */
> >> diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
> >> index d17367dee02c..8dbec85979fd 100644
> >> --- a/drivers/clocksource/timer-clint.c
> >> +++ b/drivers/clocksource/timer-clint.c
> >> @@ -37,7 +37,7 @@ static unsigned long clint_timer_freq;
> >> static unsigned int clint_timer_irq;
> >>
> >> #ifdef CONFIG_RISCV_M_MODE
> >> -u64 __iomem *clint_time_val;
> >> +u64 __iomem *clint_time_val = NULL;
>
> This one I definately don't get. According the internet, the C standard says
>
> If an object that has static storage duration is not initialized
> explicitly, it is initialized implicitly as if every member that has
> arithmetic type were assigned 0 and every member that has pointer type were
> assigned a null pointer constant.
>
> so unless I'm missing something there shouldn't be any difference between these
> two lines. When I just apply this I get exactly the same "objdump -dt" before
> and after. I do see some difference in assembly, but only when I don't pass
> "-fno-common" and that ends up being passed during my Linux builds.

Even checkpatch complains about explicit NULL assignment to global variables.

I will send v3 with a different approach to fix this so hopefully with
v3 we will
not require explicit NULL assignment.

Regards,
Anup

>
> >> #endif
> >>
> >> static void clint_send_ipi(const struct cpumask *target)
> >
> > For Kendryte:
> >
> > Tested-by: Damien Le Moal <damien.lemoal@xxxxxxx>
> >
> > --
> > Damien Le Moal
> > Western Digital