Re: [PATCH v2 1/1] psi: stop relying on timer_pending for poll_work rescheduling

From: Suren Baghdasaryan
Date: Tue Jul 06 2021 - 22:42:52 EST


On Fri, Jul 2, 2021 at 8:49 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Fri, Jul 2, 2021 at 2:28 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Jul 01, 2021 at 09:28:04AM -0700, Suren Baghdasaryan wrote:
> > > On Thu, Jul 1, 2021 at 9:12 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Jul 01, 2021 at 09:09:25AM -0700, Suren Baghdasaryan wrote:
> > > > > On Thu, Jul 1, 2021 at 1:59 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Wed, Jun 30, 2021 at 01:51:51PM -0700, Suren Baghdasaryan wrote:
> > > > > > > + /* cmpxchg should be called even when !force to set poll_scheduled */
> > > > > > > + if (atomic_cmpxchg(&group->poll_scheduled, 0, 1) && !force)
> > > > > > > return;
> > > > > >
> > > > > > Why is that a cmpxchg() ?
> > > > >
> > > > > We want to set poll_scheduled and proceed with rescheduling the timer
> > > > > unless it's already scheduled, so cmpxchg helps us to make that
> > > > > decision atomically. Or did I misunderstand your question?
> > > >
> > > > What's wrong with: atomic_xchg(&group->poll_scheduled, 1) ?
> > >
> > > Yes, since poll_scheduled can only be 0 or 1, atomic_xchg() should
> > > work fine here. Functionally equivalent, but I assume atomic_xchg()
> > > is more efficient since there is no comparison.
> >
> > Mostly conceptually simpler; the cmpxchg-on-0 means you have to
> > check whether there's ever any state outside of {0,1}. The xchg() is
> > the classical test-and-set pattern.
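
(For my own clarity, the two patterns side by side would look roughly
like this -- just a sketch with made-up names, not the actual patch:)

```
#include <linux/types.h>
#include <linux/atomic.h>

/* illustrative stand-in for the relevant part of psi_group */
struct group_sketch {
	atomic_t poll_scheduled;	/* 0 = idle, 1 = poll work scheduled */
};

/* cmpxchg variant: only the 0 -> 1 transition "wins" */
static bool try_schedule_cmpxchg(struct group_sketch *g)
{
	/* atomic_cmpxchg() returns the old value */
	return atomic_cmpxchg(&g->poll_scheduled, 0, 1) == 0;
}

/* xchg variant: classic test-and-set, the old value says whether we won */
static bool try_schedule_xchg(struct group_sketch *g)
{
	return atomic_xchg(&g->poll_scheduled, 1) == 0;
}
```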
> >
> > On top of all that, the cmpxchg() can fail, which brings ordering
> > issues.
>
> Oh, I see. That was my mistake. I was wrongly assuming that all RMW
> atomic operations are fully ordered, but indeed the documentation
> states:
> ```
> - RMW operations that have a return value are fully ordered;
> - RMW operations that are conditional are unordered on FAILURE,
> otherwise the above rules apply.
> ```
> So that's the actual functional difference here. Thanks for catching
> this and educating me!
>
> >
> > Typically, I think, you want to ensure that everything that happens
> > before psi_schedule_poll_work() is visible to the work when it runs
> > (also see Johannes' email).
>
> Correct and I think I understand now the concern Johannes expressed.
>
> > In case poll_scheduled is already 1, the
> > cmpxchg will fail and *NOT* provide that ordering. Meaning the work
> > might not observe the latest changes. xchg() doesn't have this subtlety.
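
(Spelling the window out for myself -- an illustrative sketch only, not
the psi code; names are made up:)

```
#include <linux/atomic.h>

static atomic_t scheduled = ATOMIC_INIT(0);
static int state;	/* stands in for the data the poll work consumes */

static void trigger_side(void)
{
	state = 1;					/* A */
	/*
	 * If this cmpxchg fails (scheduled was already 1), it provides
	 * no ordering, so nothing guarantees that A is visible to the
	 * work that is about to run or is already running.
	 */
	if (atomic_cmpxchg(&scheduled, 0, 1))
		return;
	/* ... kick the poll work ... */
}

static void work_side(void)
{
	atomic_set(&scheduled, 0);			/* B */
	/*
	 * Without a full barrier between B and the read below, the work
	 * can clear the flag, miss A, and go back to sleep while the
	 * trigger side has already bailed out above.
	 */
	if (state) {
		/* ... process the update ... */
	}
}
```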
>
> Got it.
> So I think the modifications needed for this patch are:
> 1. replacing atomic_cmpxchg(&group->poll_scheduled, 0, 1) with
> atomic_xchg(&group->poll_scheduled, 1)
> 2. adding an explicit smp_mb() barrier right after
> atomic_set(&group->poll_scheduled, 0) in psi_poll_work().
>
> I think that should ensure the correct ordering here.
> If you folks agree I'll respin v3 with these changes (or maybe I
> should just respin and we can continue the discussion on that version?).
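
For reference, the combination of the two changes proposed above would
look roughly like this (a sketch with illustrative names, not the
actual v3 patch):

```
#include <linux/atomic.h>
#include <linux/workqueue.h>

/* illustrative stand-in for the relevant part of psi_group */
struct group_sketch {
	atomic_t poll_scheduled;
	struct delayed_work poll_work;
};

static void schedule_poll_work_sketch(struct group_sketch *g,
				      unsigned long delay, bool force)
{
	/*
	 * xchg() is fully ordered, so everything written before this
	 * call is visible to the poll work once it observes
	 * poll_scheduled == 1.
	 */
	if (atomic_xchg(&g->poll_scheduled, 1) && !force)
		return;

	schedule_delayed_work(&g->poll_work, delay);
}

static void poll_work_sketch(struct group_sketch *g)
{
	atomic_set(&g->poll_scheduled, 0);
	/*
	 * Pairs with the xchg() above: make the clearing of
	 * poll_scheduled visible before re-reading the group state, so
	 * either the trigger side sees 0 and reschedules us, or we see
	 * its latest updates.
	 */
	smp_mb();

	/* ... read group state, process triggers, maybe reschedule ... */
}
```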

To keep things moving I posted v3
(https://lore.kernel.org/patchwork/patch/1454547) with the changes I
mentioned above. Let's keep discussing it there. Thanks!
