Re: [PATCH] lib/vdso: Use __arch_use_vsyscall() to indicate fallback

From: Huacai Chen
Date: Fri Oct 18 2019 - 22:01:32 EST


Hi, Andy,

On Fri, Oct 18, 2019 at 11:15 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
> On Thu, Oct 17, 2019 at 7:57 PM Huacai Chen <chenhc@xxxxxxxxxx> wrote:
> >
> > In do_hres(), we currently use whether the return value of __arch_get_
> > hw_counter() is negtive to indicate fallback, but this is not a good
> > idea. Because:
> >
> > 1, ARM64 returns ULL_MAX but MIPS returns 0 when clock_mode is invalid;
> > 2, For a 64bit counter, a "negtive" value of counter is actually valid.
>
> s/negtive/negative
>
> What's the actual bug? Is it that MIPS is returning 0 but the check
> is < 0? Sounds like MIPS should get fixed.
My original bug is what Vincenzo said: MIPS fails to boot if there is
no valid clock_mode, and MIPS certainly needs to be fixed. However,
when I tried to fix it, I found that clock_getres() has another
problem: __cvdso_clock_getres_common() reads
vd[CS_HRES_COARSE].hrtimer_res, but hrtimer_res is set in
update_vdso_data(), which relies on __arch_use_vsyscall().
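
To make the clock_getres() concern concrete, here is a minimal
userspace model; it is not kernel code. All names (model_vdso_data,
model_update_vsyscall, model_clock_getres, model_demo) and the 1 ns
resolution are hypothetical. The only assumption taken from the
description above is that hrtimer_res is written solely on the update
path gated by the __arch_use_vsyscall() result, while the reader
returns it unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the single vdso_data field discussed above. */
struct model_vdso_data {
	bool     clock_mode_valid; /* models the __arch_use_vsyscall() result */
	uint32_t hrtimer_res;      /* stays 0 until the update path writes it */
};

/* Models the kernel-side update: the data is written only when the gate is open. */
static void model_update_vsyscall(struct model_vdso_data *vd, uint32_t res_ns)
{
	if (vd->clock_mode_valid)
		vd->hrtimer_res = res_ns;
}

/* Models the vDSO-side reader: it returns the field without checking the gate. */
static uint32_t model_clock_getres(const struct model_vdso_data *vd)
{
	return vd->hrtimer_res;
}

static void model_demo(void)
{
	struct model_vdso_data vd = { .clock_mode_valid = false, .hrtimer_res = 0 };

	/* Gate closed: the update is skipped, so the reader sees a resolution of 0. */
	model_update_vsyscall(&vd, 1);
	assert(model_clock_getres(&vd) == 0);

	/* Gate open: the update lands and the reader sees the real resolution. */
	vd.clock_mode_valid = true;
	model_update_vsyscall(&vd, 1);
	assert(model_clock_getres(&vd) == 1);
}
```

In this model the reader has no way to tell "never initialised" from a
real value, which is the situation I ran into.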

>
> >
> > To solve this problem, we use U64_MAX as the only "invalid" return
> > value -- this is still not fully correct, but has no problem in most
> > cases.
>
> I'm sort of okay with that, but...
>
> > Moreover, all vdso time-related functions should rely on the
> > return value of __arch_use_vsyscall(), because update_vdso_data() and
> > update_vsyscall_tz() also rely on it. So, in the core functions of
> > __cvdso_gettimeofday(), __cvdso_clock_gettime() and __cvdso_clock_
> > getres(), if __arch_use_vsyscall() returns false, we use the fallback
> > functions directly.
>
> __arch_use_vsyscall() is not currently intended for use in the vDSO at all.
>
> >
> > Fixes: 00b26474c2f1613d7ab894c5 ("lib/vdso: Provide generic VDSO implementation")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Cc: Arnd Bergmann <arnd@xxxxxxxx>
> > Cc: Paul Burton <paul.burton@xxxxxxxx>
> > Cc: linux-mips@xxxxxxxxxxxxxxx
> > Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> > Signed-off-by: Huacai Chen <chenhc@xxxxxxxxxx>
> > ---
> > arch/arm64/include/asm/vdso/vsyscall.h | 2 +-
> > arch/mips/include/asm/vdso/vsyscall.h | 2 +-
> > include/asm-generic/vdso/vsyscall.h | 2 +-
> > lib/vdso/gettimeofday.c | 12 +++++++++++-
> > 4 files changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/vdso/vsyscall.h b/arch/arm64/include/asm/vdso/vsyscall.h
> > index 0c731bf..406e6de 100644
> > --- a/arch/arm64/include/asm/vdso/vsyscall.h
> > +++ b/arch/arm64/include/asm/vdso/vsyscall.h
> > @@ -31,7 +31,7 @@ int __arm64_get_clock_mode(struct timekeeper *tk)
> > #define __arch_get_clock_mode __arm64_get_clock_mode
> >
> > static __always_inline
> > -int __arm64_use_vsyscall(struct vdso_data *vdata)
> > +int __arm64_use_vsyscall(const struct vdso_data *vdata)
> > {
> > return !vdata[CS_HRES_COARSE].clock_mode;
> > }
> > diff --git a/arch/mips/include/asm/vdso/vsyscall.h b/arch/mips/include/asm/vdso/vsyscall.h
> > index 1953147..8b10dd7 100644
> > --- a/arch/mips/include/asm/vdso/vsyscall.h
> > +++ b/arch/mips/include/asm/vdso/vsyscall.h
> > @@ -29,7 +29,7 @@ int __mips_get_clock_mode(struct timekeeper *tk)
> > #define __arch_get_clock_mode __mips_get_clock_mode
> >
> > static __always_inline
> > -int __mips_use_vsyscall(struct vdso_data *vdata)
> > +int __mips_use_vsyscall(const struct vdso_data *vdata)
> > {
> > return (vdata[CS_HRES_COARSE].clock_mode != VDSO_CLOCK_NONE);
> > }
> > diff --git a/include/asm-generic/vdso/vsyscall.h b/include/asm-generic/vdso/vsyscall.h
> > index e94b1978..ac05a625 100644
> > --- a/include/asm-generic/vdso/vsyscall.h
> > +++ b/include/asm-generic/vdso/vsyscall.h
> > @@ -26,7 +26,7 @@ static __always_inline int __arch_get_clock_mode(struct timekeeper *tk)
> > #endif /* __arch_get_clock_mode */
> >
> > #ifndef __arch_use_vsyscall
> > -static __always_inline int __arch_use_vsyscall(struct vdso_data *vdata)
> > +static __always_inline int __arch_use_vsyscall(const struct vdso_data *vdata)
> > {
> > return 1;
> > }
> > diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
> > index e630e7f..4ad062e 100644
> > --- a/lib/vdso/gettimeofday.c
> > +++ b/lib/vdso/gettimeofday.c
> > @@ -9,6 +9,7 @@
> > #include <linux/hrtimer_defs.h>
> > #include <vdso/datapage.h>
> > #include <vdso/helpers.h>
> > +#include <vdso/vsyscall.h>
> >
> > /*
> > * The generic vDSO implementation requires that gettimeofday.h
> > @@ -50,7 +51,7 @@ static int do_hres(const struct vdso_data *vd, clockid_t clk,
> > cycles = __arch_get_hw_counter(vd->clock_mode);
> > ns = vdso_ts->nsec;
> > last = vd->cycle_last;
> > - if (unlikely((s64)cycles < 0))
> > + if (unlikely(cycles == U64_MAX))
> > return -1;
>
> I would actually prefer:
>
> if (unlikely(cycles < last))
>
> or perhaps:
>
> if (unlikely((s64)(cycles-last) < 0))
>
> which would have the nice side effect of getting rid of the annoying
> x86 special case in vdso_calc_delta(). The former version is
> compatible with U64_MAX, whereas the latter version would need the
> error case to return last-1 or similar. The benefit of the latter
> version is that it can survive wrap-around.
When you say if (unlikely(cycles < last)), do you mean if
(unlikely(cycles <= last))? If __arch_get_hw_counter() returns U64_MAX
every time, I don't think cycles can be less than last.
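
A small userspace sketch of the three comparisons under discussion, to
show how each behaves for a U64_MAX sentinel and for counter
wrap-around. The helper names (fallback_lt, fallback_le,
fallback_delta, compare_checks) and the sample values are hypothetical,
and the cast assumes the usual two's-complement conversion that (s64)
has in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "cycles < last": plain unsigned comparison. */
static bool fallback_lt(uint64_t cycles, uint64_t last)
{
	return cycles < last;
}

/* "cycles <= last": also fires when the counter did not advance. */
static bool fallback_le(uint64_t cycles, uint64_t last)
{
	return cycles <= last;
}

/* "(s64)(cycles - last) < 0": signed view of the wrapped difference. */
static bool fallback_delta(uint64_t cycles, uint64_t last)
{
	return (int64_t)(cycles - last) < 0;
}

static void compare_checks(void)
{
	const uint64_t last = 1000; /* arbitrary illustrative value */

	/* U64_MAX sentinel: "<" can never fire; "<=" fires only if last is U64_MAX too. */
	assert(!fallback_lt(UINT64_MAX, last));
	assert(!fallback_le(UINT64_MAX, last));
	assert(fallback_le(UINT64_MAX, UINT64_MAX));

	/* Delta form: U64_MAX - 1000 has the top bit set, so it reads as negative. */
	assert(fallback_delta(UINT64_MAX, last));
	/* An error value of last - 1 also reads as negative. */
	assert(fallback_delta(last - 1, last));

	/* Wrap-around: the counter moved forward by 11 and wrapped past zero. */
	assert(fallback_lt(5, UINT64_MAX - 5));     /* unsigned "<" misreads it as behind */
	assert(!fallback_delta(5, UINT64_MAX - 5)); /* delta form still sees +11 */
}
```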

Huacai
>
> >
> > ns += vdso_calc_delta(cycles, last, vd->mask, vd->mult);
> > @@ -91,6 +92,9 @@ __cvdso_clock_gettime_common(clockid_t clock, struct __kernel_timespec *ts)
> > if (unlikely((u32) clock >= MAX_CLOCKS))
> > return -1;
> >
> > + if (!__arch_use_vsyscall(vd))
> > + return -1;
> > +
>
> NAK. I don't think this is helpful or correct. It doesn't appear to
> do anything valid, and it's racy.
>
> > /*
> > * Convert the clockid to a bitmask and use it to check which
> > * clocks are handled in the VDSO directly.
> > @@ -145,6 +149,9 @@ __cvdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
> > {
> > const struct vdso_data *vd = __arch_get_vdso_data();
> >
> > + if (!__arch_use_vsyscall(vd))
> > + return gettimeofday_fallback(tv, tz);
> > +
>
> Ditto.
>
> > if (likely(tv != NULL)) {
> > struct __kernel_timespec ts;
> >
> > @@ -189,6 +196,9 @@ int __cvdso_clock_getres_common(clockid_t clock, struct __kernel_timespec *res)
> > if (unlikely((u32) clock >= MAX_CLOCKS))
> > return -1;
> >
> > + if (!__arch_use_vsyscall(vd))
> > + return -1;
> > +
>
> Ditto.