Re: linux-next-20110923: warning kernel/rcutree.c:1833

From: Frederic Weisbecker
Date: Mon Oct 03 2011 - 13:12:07 EST


On Mon, Oct 03, 2011 at 09:22:21AM -0700, Paul E. McKenney wrote:
> On Mon, Oct 03, 2011 at 02:59:03PM +0200, Frederic Weisbecker wrote:
> > On Sun, Oct 02, 2011 at 05:28:32PM -0700, Paul E. McKenney wrote:
> > > On Mon, Oct 03, 2011 at 12:50:22AM +0200, Frederic Weisbecker wrote:
> > > > On Fri, Sep 30, 2011 at 12:24:38PM -0700, Paul E. McKenney wrote:
> > > > > @@ -328,11 +326,11 @@ static int rcu_implicit_offline_qs(struct rcu_data *rdp)
> > > > > return 1;
> > > > > }
> > > > >
> > > > > - /* If preemptible RCU, no point in sending reschedule IPI. */
> > > > > - if (rdp->preemptible)
> > > > > - return 0;
> > > > > -
> > > > > - /* The CPU is online, so send it a reschedule IPI. */
> > > > > + /*
> > > > > + * The CPU is online, so send it a reschedule IPI. This forces
> > > > > + * it through the scheduler, and (inefficiently) also handles cases
> > > > > + * where idle loops fail to inform RCU about the CPU being idle.
> > > > > + */
> > > >
> > > > If the idle loop forgets to call rcu_idle_enter() before going to
> > > > sleep, I don't know if it's a good idea to try to cure that situation
> > > > by forcing a quiescent state remotely. It may make the thing worse
> > > > because we actually won't notice the lack of call to rcu_idle_enter()
> > > > that the rcu stall detector would otherwise report to us.
> > > >
> > > > Also I don't think that works. If the task doesn't have
> > > > TIF_RESCHED, it won't go through the scheduler on irq exit.
> > > > smp_send_reschedule() doesn't set the flag. And also scheduler_ipi()
> > > > returns right away if no wake up is pending.
> > > >
> > > > So, other than resuming the idle loop to sleep again, nothing may happen.
> > > >
> > > > Or am I missing something?
> > >
> > > Hmmm... Seems like the IPIs aren't helping in any case, then?
> >
> > I thought it was there for !PREEMPT cases where the task has TIF_RESCHED
> > but takes too much time to find an opportunity to go to sleep.
>
> Indeed, and it might be worth leaving in for that.

Now I realize it's not even helpful in that case. If you're spending a
long time in the kernel without calling schedule(), an IPI won't help
with that.

No, the current call looks useless to me :)

> > > I suppose that I could do an smp_call_function_single(), which then
> > > did a set_need_resched()...
> > >
> > > But this is a separate issue that I need to deal with. That said, any
> > > suggestions are welcome!
> >
> > Note you can't call smp_call_function_*() while irqs are disabled.
>
> Sigh! This isn't the first time this year that I have forgotten that,
> is it?
>
> > Perhaps you need something like kernel/sched.c:resched_cpu()
> > This adds some rq->lock contention though.
>
> This would happen infrequently, and could be made to be even more
> infrequent. But I wonder what happens when you do this to a CPU
> that is running the idle task? Seems like it should work normally,
> but...

That should work as well. But I think we shouldn't set TIF_RESCHED and
send an IPI to a remote CPU that is running the idle task.

If there is a missing rcu_idle_enter() call, we should report it (RCU
stall) and fix it, rather than trying to cure the consequences. Sending
an IPI would make such bugs harder to find.