Re: softlockups in multi_cpu_stop

From: Jason Low
Date: Sat Mar 07 2015 - 01:57:30 EST


On Sat, 2015-03-07 at 13:54 +0800, Ming Lei wrote:
> On Sat, Mar 7, 2015 at 12:31 PM, Jason Low <jason.low2@xxxxxx> wrote:
> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote:
> > Cc: Ming Lei <ming.lei@xxxxxxxxxxxxx>
> > Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
> > Signed-off-by: Jason Low <jason.low2@xxxxxx>
>
> Reported-and-tested-by: Ming Lei <ming.lei@xxxxxxxxxxxxx>

Thanks!

> >  static noinline
> >  bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
> >  {
> >         long count;
> >
> >         rcu_read_lock();
> > -       while (owner_running(sem, owner)) {
> > -               /* abort spinning when need_resched */
> > -               if (need_resched()) {
> > +       while (sem->owner == owner) {
> > +               /*
> > +                * Ensure we emit the owner->on_cpu, dereference _after_
> > +                * checking sem->owner still matches owner, if that fails,
> > +                * owner might point to free()d memory, if it still matches,
> > +                * the rcu_read_lock() ensures the memory stays valid.
> > +                */
> > +               barrier();
> > +
> > +               /* abort spinning when need_resched or owner is not running */
> > +               if (!owner->on_cpu || need_resched()) {
>
> BTW, could the need_resched() be handled in loop of
> rwsem_optimistic_spin() directly? Then code may get
> simplified a bit.

We still need the need_resched() check here: if the thread needs to
reschedule, it should stop spinning for the lock immediately.
Otherwise, it could keep spinning for a long time, potentially until
the owner changes or stops running, before it checks whether it needs
to reschedule.
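
To illustrate the point, here is a rough userspace sketch (not kernel
code) of the difference. Everything in it is a hypothetical stand-in:
struct task, struct sem, struct sched_sim, sim_need_resched() and
spin_on_owner() are made up for the example, and max_iters exists only
so that the simulation terminates. The real loop also relies on
rcu_read_lock() and barrier(), which are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct task { volatile bool on_cpu; };
struct sem  { struct task *volatile owner; };

/* Simulated scheduler state: a reschedule is "requested" after N polls. */
struct sched_sim { long polls; long resched_after; };

/* Stand-in for need_resched(); counts how often it has been polled. */
static bool sim_need_resched(struct sched_sim *s)
{
	return ++s->polls > s->resched_after;
}

/*
 * Simplified inner spin loop. Returns the number of iterations spun.
 * check_resched selects whether the resched condition is polled inside
 * the loop; max_iters only bounds the simulation.
 */
static long spin_on_owner(struct sem *sem, struct task *owner,
			  struct sched_sim *s, bool check_resched,
			  long max_iters)
{
	long spins = 0;

	while (sem->owner == owner && spins < max_iters) {
		/* abort when the owner is not running or we must resched */
		if (!owner->on_cpu || (check_resched && sim_need_resched(s)))
			break;
		spins++;
	}
	return spins;
}

/* Small self-check showing both behaviours with a running owner. */
static void spin_demo_selfcheck(void)
{
	struct task t = { .on_cpu = true };
	struct sem sem = { .owner = &t };
	struct sched_sim with = { 0, 10 };
	struct sched_sim without = { 0, 10 };

	/* With the check in the loop, spinning stops once resched is due. */
	assert(spin_on_owner(&sem, &t, &with, true, 1000000) == 10);
	/* Without it, the loop runs until the simulation cap is hit. */
	assert(spin_on_owner(&sem, &t, &without, false, 1000000) == 1000000);
}
```

With the check inside the loop, the spinner gives up shortly after the
simulated reschedule request. Without it, nothing in the loop notices
the request while the owner keeps holding the lock and running.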
