Re: [RFC][PATCH 12/13] stop_machine: Remove lglock

From: Peter Zijlstra
Date: Wed Jun 24 2015 - 11:35:10 EST


On Wed, Jun 24, 2015 at 08:01:29AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 24, 2015 at 10:32:57AM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2015 at 07:23:44PM -0700, Paul E. McKenney wrote:
> > > And here is an untested patch that applies the gist of your approach,
> > > the series of stop_one_cpu() calls, but without undoing the rest.
> > > I forged your Signed-off-by, please let me know if that doesn't work
> > > for you. There are a number of simplifications that can be made, but
> > > the basic approach gets a good testing first.
> >
> > So I really do not get the point of the trylock. It doesn't make sense.
> >
> > Why would you poll the mutex instead of just waiting for it and then
> > rechecking whether someone did the work while you were waiting for it?
> >
> > What's wrong with the below?
>
> Various delays can cause tasks to queue on the mutex out of order.

If the mutex owner sleeps, mutexes are FIFO; otherwise things can get
iffy indeed.

> This can cause a given task not only to have been delayed between
> sampling ->expedited_start and the mutex_lock(), but be further delayed
> because tasks granted the mutex earlier will wait on grace periods that
> the delayed task doesn't need to wait on. These extra waits are simply
> not consistent with the "expedited" in synchronize_sched_expedited().

Feh, I really do not know if it's worth optimizing the concurrent
expedited case, but we could just make it an open-coded mutex that's
strictly FIFO. A waitqueue on the done variable might be sufficient.

That's still tons better than polling.
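
To make the idea concrete, here is a rough userspace sketch (pthreads, not
kernel code, and not the actual patch) of a strictly FIFO sleeping lock
combined with the "take the lock, then recheck" pattern. All names in it
(fifo_lock, exp_state, exp_snapshot(), exp_wait(), work_runs) are made up
for illustration, and the work_runs++ line merely stands in for the real
work, i.e. the series of stop_one_cpu() calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical strictly-FIFO sleeping lock: tickets are served in order. */
struct fifo_lock {
	pthread_mutex_t mtx;
	pthread_cond_t cv;
	unsigned long next_ticket;	/* next ticket to hand out */
	unsigned long now_serving;	/* ticket that currently owns the lock */
};

#define FIFO_LOCK_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

static void fifo_lock_acquire(struct fifo_lock *l)
{
	pthread_mutex_lock(&l->mtx);
	unsigned long me = l->next_ticket++;

	while (l->now_serving != me)	/* sleep until our turn, no polling */
		pthread_cond_wait(&l->cv, &l->mtx);
	pthread_mutex_unlock(&l->mtx);
}

static void fifo_lock_release(struct fifo_lock *l)
{
	pthread_mutex_lock(&l->mtx);
	l->now_serving++;		/* hand over to the next ticket */
	pthread_cond_broadcast(&l->cv);
	pthread_mutex_unlock(&l->mtx);
}

/* Hypothetical stand-in for the expedited grace-period state. */
struct exp_state {
	struct fifo_lock lock;
	atomic_ulong started;		/* requests issued so far */
	unsigned long done;		/* requests covered so far; lock-protected */
	unsigned long work_runs;	/* times the expensive work actually ran */
};

#define EXP_STATE_INIT { FIFO_LOCK_INIT, 0, 0, 0 }

/* Sample the request counter before queueing for the lock. */
static unsigned long exp_snapshot(struct exp_state *s)
{
	return atomic_fetch_add(&s->started, 1) + 1;
}

/* Wait for the lock, then recheck whether someone already did our work. */
static void exp_wait(struct exp_state *s, unsigned long snap)
{
	fifo_lock_acquire(&s->lock);
	if (s->done < snap) {
		/* Every request issued before this point is covered by this run. */
		unsigned long cover = atomic_load(&s->started);

		assert(s->done <= cover);
		s->work_runs++;		/* placeholder for the real work */
		s->done = cover;
	}
	fifo_lock_release(&s->lock);
}
```

A caller would do snap = exp_snapshot(&s) followed by exp_wait(&s, snap).
Because the lock is granted in ticket order, a late arrival cannot jump
ahead of an earlier one, and a waiter whose request was covered while it
slept returns without redoing the work.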