Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

From: Mike Galbraith
Date: Wed Apr 05 2017 - 21:09:21 EST


On Wed, 2017-04-05 at 16:55 -0700, Cong Wang wrote:
> On Tue, Apr 4, 2017 at 11:12 PM, Mike Galbraith <efault@xxxxxx> wrote:
> > On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote:
> > > On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith <efault@xxxxxx> wrote:
> > > > -       while (some_qdisc_is_busy(dev))
> > > > -               yield();
> > > > +       swait_event_timeout(swait, !some_qdisc_is_busy(dev), 1);
> > > >  }
> > >
> > > I don't see why this is an improvement even if I don't care about the
> > > hardcoded timeout for now... Why can the scheduler make a better
> > > decision with swait_event_timeout() than with cond_resched()?
> >
> > Because sleeping gets you out of the way? There is no other decision
> > the scheduler can make while a SCHED_FIFO task is trying to yield when
> > it is the one and only task at it's priority. The scheduler is doing
> > exactly what it is supposed to do, problem is people calling yield()
> > tend to think it does something it does not do, which is why it is
> > decorated with "if you think you want yield(), think again"
> >
> > Yes, yield semantics suck rocks, basically don't exist. Hop in your
> > time machine and slap whoever you find claiming responsibility :)
>
> I am not trying to defend yield(), I am trying to understand when
> cond_resched() is not the right solution to replace yield() and when it is.
> For me, the dev_deactivate_many() case is, because I interpret
> "be nice" differently.

Yeah, I know you weren't defending it, just as I know that the net-fu
masters don't need that comment held close to their noses in order to
be able to read it.. waving it about wasn't for their benefit ;-)

-Mike
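
The point about a SCHED_FIFO task yielding while it is the only task at
its priority can be illustrated with a toy model. What follows is a
minimal sketch in plain userspace C, not kernel code: it assumes one CPU,
strict priority scheduling, a high-priority waiter polling for a
low-priority worker to finish, and time counted in abstract ticks. The
names simulate() and wait_policy are hypothetical and exist only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one CPU, strict priority, time measured in ticks. */
enum wait_policy { WAIT_YIELD, WAIT_SLEEP };

/*
 * A high-priority waiter polls until a low-priority worker has received
 * work_ticks ticks of CPU time. Returns the tick at which the waiter sees
 * the work as done, or -1 if that does not happen within max_ticks.
 */
int simulate(enum wait_policy policy, int work_ticks, int max_ticks)
{
    int work_left = work_ticks;  /* CPU ticks the worker still needs */
    int sleep_left = 0;          /* ticks the waiter remains blocked */

    assert(work_ticks >= 0 && max_ticks >= 0);

    for (int tick = 0; tick < max_ticks; tick++) {
        bool waiter_runnable = (sleep_left == 0);

        if (waiter_runnable) {
            /* The highest-priority runnable task always gets the CPU. */
            if (work_left == 0)
                return tick;     /* condition is true, waiting ends */
            if (policy == WAIT_SLEEP)
                sleep_left = 1;  /* block for one tick */
            /*
             * WAIT_YIELD: the waiter is alone at its priority, so it
             * stays runnable and is simply picked again next tick.
             */
        } else {
            /* The waiter is blocked, so the worker finally runs. */
            if (work_left > 0)
                work_left--;
            sleep_left--;
        }
    }
    return -1;                   /* the worker was starved throughout */
}
```

In this model simulate(WAIT_YIELD, 3, 1000) returns -1, because the
worker never gets a tick, while simulate(WAIT_SLEEP, 3, 1000) returns 6:
the waiter and the worker alternate ticks until the three ticks of work
are done. The model says nothing about real scheduler latencies; it only
shows why a one-tick sleep gets the waiter out of the way and a yield
does not.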