Re: [PATCH 2/3] soc: amazon: al-pos: Introduce Amazon's Annapurna Labs POS driver
From: Arnd Bergmann
Date: Mon Sep 09 2019 - 09:41:44 EST
On Mon, Sep 9, 2019 at 1:13 PM Shenhar, Talel <talel@xxxxxxxxxx> wrote:
> On 9/9/2019 12:44 PM, Arnd Bergmann wrote:
> > On Mon, Sep 9, 2019 at 11:14 AM Talel Shenhar <talel@xxxxxxxxxx> wrote:
> >> + writel_relaxed(0, pos->mmio_base + AL_POS_ERROR_LOG_1);
> > Why do you require _relaxed() accessors here? Please add a comment
> > explaining that, or use the regular readl()/writel().
>
> I don't think commenting is needed here as there is nothing special in
> this type of access.
>
> I don't see this is common to comment the use of the _relaxed accessors.
I usually mention it in driver reviews, but most authors revert
to the normal accessors when there is no difference.
> This driver is for SoC using arm64 cpu.
>
> If one uses the non-relaxed version of readl while running on arm64, he
> shall cause read barrier, which is then doing dsm(ld).. This barrier is
> not needed here, so we spare the use of the more heavy readl in favor of
> the less "harmful" one.
>
> Let me know what you think.
If the barrier causes no harm, just leave it in to keep the code more
readable. Most developers don't need to know the difference between
the two, so using the less common interface just makes the reader
curious about why it was picked.

Avoiding the barrier can make a huge performance difference in a
hot code path, but the downside is that it can behave in unexpected
ways if the same code is run on a different CPU architecture that
does not have the exact same rules about what _relaxed() means.

In fact, replacing a 'readl()' with 'readl_relaxed() + rmb()' can lead
to slower rather than faster code when the explicit barrier is heavier
than the implied one (e.g. on x86), or when readl_relaxed() does not
actually skip the barrier.
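
To make the comparison concrete, here is a rough userspace model of the
three shapes being discussed. It is only a sketch: the model_*() helpers
are hypothetical names, the register is an ordinary variable, and the C11
fences merely stand in for whatever barrier an architecture would really
use. It is not how any kernel accessor is implemented.

```c
/* Userspace model only: model_*() are hypothetical, not kernel accessors. */
#include <stdatomic.h>
#include <stdint.h>

/* Relaxed read: just the load, no ordering against later accesses. */
static inline uint32_t model_readl_relaxed(const volatile uint32_t *addr)
{
	return *addr;
}

/* Ordered read: the load plus the barrier the accessor implies. */
static inline uint32_t model_readl(const volatile uint32_t *addr)
{
	uint32_t val = *addr;

	/* Assumption: an acquire fence stands in for the implied barrier. */
	atomic_thread_fence(memory_order_acquire);
	return val;
}

/* Open-coded variant: relaxed load followed by an explicit barrier. */
static inline uint32_t model_readl_relaxed_rmb(const volatile uint32_t *addr)
{
	uint32_t val = model_readl_relaxed(addr);

	/* Assumption: a full fence stands in for a heavier explicit rmb(). */
	atomic_thread_fence(memory_order_seq_cst);
	return val;
}
```

All three return the same value; they differ only in the ordering they
promise and in what that costs. The open-coded variant is no cheaper than
model_readl() whenever its explicit fence is the heavier of the two.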

The general rule with kernel interfaces when you have two versions
that both do what you want is to pick the one with the shorter name.
See spin_lock()/spin_lock_irqsave(), ioremap()/ioremap_nocache(),
or ktime_get()/ktime_get_clocktai_ts64(). (yes, there are also
exceptions)

Arnd