Re: migration thread and active_load_balance

From: Dan Upton
Date: Tue Apr 22 2008 - 15:19:45 EST


On Mon, Apr 21, 2008 at 4:39 PM, Dmitry Adamushko
<dmitry.adamushko@xxxxxxxxx> wrote:
> On 21/04/2008, Dan Upton <upton.dan.linux@xxxxxxxxx> wrote:
> > On Mon, Apr 21, 2008 at 7:03 AM, Dmitry Adamushko
> > <dmitry.adamushko@xxxxxxxxx> wrote:
> >
> > > On 21/04/2008, Dan Upton <upton.dan.linux@xxxxxxxxx> wrote:
> > > > [ ... ]
> > > >
> > > > kernel BUG at kernel/sched.c:2103
> > >
> > > and what's this line in your patched sched.c?
> > >
> > > is it -- BUG_ON(!irqs_disabled()); ?
> > >
> > > anything in your unposted code (e.g. find_coolest_cpu()) that might
> > > re-enable the interrupts before __migrate_task() is called?
> > >
> > > If you post your modifications as a patch
> > > (Documentation/applying-patches.txt) that contains _all_ relevant
> > > modifications, it'd be easier to guess what's wrong.
> >
> >
> > Yes, that's the line. I don't recall ever reenabling interrupts,
>
> migration_thread() -> find_coolest_cpu() -> get_temperature() ->
> rdmsr_on_cpu() -> [ if your configuration is SMP ] ->
> smp_call_function_single() ->
>
> (arch/x86/kernel/smpcommon.c)
> ...
>         if (cpu == me) {
>                 local_irq_disable();
>                 func(info);
>                 local_irq_enable(); <----------- REENABLES the interrupts
>                 put_cpu();
>                 return 0;
>         }
> ...
>
> as a result, __migrate_task() -> double_rq_lock() -> BUG_ON(!irqs_disabled())
> gives you an "oops".
>
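
The failure mode quoted above can be modeled in plain userspace C. This
is only a sketch under stated assumptions: a single boolean stands in
for the CPU's interrupt-enable state, every model_* function and both
call_local_* helpers are hypothetical names invented here, and
call_local_restore() is an illustrative save/restore variant, not a
claim about how the kernel code was or should be changed.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the interrupt-enable state. */
static bool irqs_enabled = true;

void model_irq_disable(void) { irqs_enabled = false; }
void model_irq_enable(void)  { irqs_enabled = true; }

/* Remember the current state, then disable (models local_irq_save()). */
bool model_irq_save(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Put back whatever state was remembered (models local_irq_restore()). */
void model_irq_restore(bool flags) { irqs_enabled = flags; }

/* Stand-in for the function run on the local CPU. */
void fake_read(void *info) { *(int *)info = 42; }

/* Same shape as the quoted "cpu == me" path: unconditionally re-enables. */
void call_local_enable(void (*func)(void *), void *info)
{
	model_irq_disable();
	func(info);
	model_irq_enable();
}

/* Hypothetical variant that hands back the caller's original state. */
void call_local_restore(void (*func)(void *), void *info)
{
	bool flags = model_irq_save();

	func(info);
	model_irq_restore(flags);
}

/*
 * Enter with interrupts disabled, as the migration path does, and report
 * the state seen after the call returns.
 */
bool irqs_after_call(void (*call)(void (*)(void *), void *))
{
	int value = 0;

	model_irq_disable();
	call(fake_read, &value);
	assert(value == 42);
	return irqs_enabled;
}
```

In this model, irqs_after_call(call_local_enable) reports interrupts
enabled, which is the state that a BUG_ON(!irqs_disabled()) check would
trip over, while irqs_after_call(call_local_restore) leaves them
disabled.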

Ah, how about that. Thanks. I've at least fixed the oops by caching the
return values from get_temperature() and using the cached values
instead of calling rdmsr_on_cpu() from migration_thread(). Everything
works until I uncomment the new call to active_load_balance(), which
again yields a deadlock. (Man, I love working in the scheduler...)
Anyway, I'll keep trying to debug that on my own, but did anybody
notice anything I'm doing that might lead to deadlock?
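
For reference, the caching workaround described above can be sketched
in userspace C roughly as follows. This is not the actual patch:
temp_cache, refresh_temperatures(), find_coolest_cpu_cached(), and
SKETCH_NR_CPUS are hypothetical names, and read_sensor() with its
fake_sensor table is a stand-in for the MSR-based temperature read.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4	/* assumption: small fixed CPU count */

/* Last temperatures seen; the migration path reads only this array. */
static int temp_cache[SKETCH_NR_CPUS];

/* Fake sensor values standing in for the per-CPU MSR read. */
static int fake_sensor[SKETCH_NR_CPUS] = { 55, 48, 61, 50 };

/* Stand-in for the slow read that may change interrupt state. */
int read_sensor(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return fake_sensor[cpu];
}

/* Run from a context where the slow read is safe. */
void refresh_temperatures(void)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		temp_cache[cpu] = read_sensor(cpu);
}

/* Run from the migration path: touches the cache, never the sensor. */
int find_coolest_cpu_cached(void)
{
	int best = 0;

	for (int cpu = 1; cpu < SKETCH_NR_CPUS; cpu++)
		if (temp_cache[cpu] < temp_cache[best])
			best = cpu;
	return best;
}
```

The trade-off in this shape is staleness: the migration path only sees
the temperatures from the last refresh, so the choice of CPU can lag
behind the real readings until the cache is refreshed again.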

-dan