Re: [RFC] "mustnotsleep"

From: Hugh Dickins
Date: Fri Jun 03 2011 - 23:04:25 EST


On Thu, Jun 2, 2011 at 5:00 PM, Dan Magenheimer
<dan.magenheimer@xxxxxxxxxx> wrote:
>> From: Luis Henriques [mailto:luis.henrix@xxxxxxxxx]
>> Subject: Re: [RFC] "mustnotsleep"
>>
>> Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> writes:
>>
>> > In development of RAMster, I have frequently been bitten
>> > by indirect use of existing kernel subsystems that
>> > unexpectedly sleep. ÂAs such, I have hacked the
>> > following "debug" code fragments for use where I need to
>> > ensure that doesn't happen.
>> >
>> > DEFINE_PER_CPU(int, mustnotsleep_count);
>> >
>> > void mustnotsleep_start(void)
>> > {
>> > Â Â int cpu = smp_processor_id();
>> > Â Â per_cpu(mustnotsleep_count, cpu)++;
>> > }
>> >
>> > void mustnotsleep_done(void)
>> > {
>> > Â Â int cpu = smp_processor_id();
>> > Â Â per_cpu(mustnotsleep_count, cpu)--;
>> > }
>> >
>> > and in schedule.c in schedule():
>> >
>> > if (per_cpu(mustnotsleep_count))
>> > Â Â panic("scheduler called in mustnotsleep code");
>> >
>> > This has enabled me to start identifying code that
>> > is causing me problems. Â(I know this is a horrible
>> > hack, but that's OK right now.)
>>
>> I'm pretty sure I'm missing something here but... what if you just use
>> CONFIG_DEBUG_PREEMPT? ÂIsn't that good enough?
>
> Thanks for the reply Luis.
>
> Looking at the code enabled by CONFIG_DEBUG_PREEMPT,
> I don't think it does what I'm looking for. ÂI need
> to ensure that no code called inside the boundaries
> of mustnotsleep_start/done ever calls the scheduler,
> e.g. cond_resched is never called etc.

Luis is surely on the right lines, but maybe CONFIG_DEBUG_PREEMPT was
not the crucial option.

You should be building with CONFIG_PREEMPT and
CONFIG_DEBUG_SPINLOCK_SLEEP for best testing, and it would be good to
throw in CONFIG_DEBUG_PREEMPT too.

You need to preempt_disable() instead of mustnotsleep_start(), and
preempt_enable() instead of mustnotsleep_done(). You cannot even
safely use their smp_processor_id() without disabling preemption -
otherwise that task might on a different cpu by the time you get to
use the cpu it told you. It's essential to disable preemption if you
cannot bear to be rescheduled.

Then, if I've read the ifdefs correctly (please verify),
cond_resched() will include a __might_sleep() to report "BUG: sleeping
function called from invalid context..." from cond_resched() and other
useful places.

But please don't disable preemption unnecessarily, nor for very long:
it's better to remain preemptible wherever possible.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/