Re: Loadavg accounting error on arm64

From: Mel Gorman
Date: Mon Nov 16 2020 - 07:58:15 EST


On Mon, Nov 16, 2020 at 01:46:57PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 16, 2020 at 09:10:54AM +0000, Mel Gorman wrote:
> > Similarly, it's not clear why the arm64 implementation
> > does not call smp_acquire__after_ctrl_dep in the smp_load_acquire
> > implementation. Even when it was introduced, the arm64 implementation
> > differed significantly from the arm implementation in terms of what
> > barriers it used for non-obvious reasons.
>
> This is because ARM64's smp_cond_load_acquire() implementation uses
> smp_load_aquire() directly, as opposed to the generic version that uses
> READ_ONCE().
>
> This is because ARM64 has a load-acquire instruction, which is highly
> optimized, and generally considered cheaper than the smp_rmb() from
> smp_acquire__after_ctrl_dep().
>
> Or so I've been led to believe.

Fair enough. Either way, barriering sched_contributes_to_load "works"
but it's clumsy and may not be guaranteed to be correct. The bits
should have been protected by the rq lock but sched_remote_wakeup
updates outside of the lock which might be leading to the adject fields
(like sched_contributes_to_load) getting corrupted as per the "anti
guarantees" in memory-barriers.txt. The rq lock could be conditionally
acquired __ttwu_queue_wakelist for WF_MIGRATED and explicitly cleared in
sched_ttwu_pending (not tested if this works) but it would also suck to
acquire a remote lock when that's what we're explicitly trying to avoid
in that path.

--
Mel Gorman
SUSE Labs