Re: [Patch] jbd commit code deadloop when installing Linux

From: Andrew Morton
Date: Wed Jun 28 2006 - 04:03:40 EST


On 28 Jun 2006 14:02:57 +0800
Zou Nan hai <nanhai.zou@xxxxxxxxx> wrote:

> > > However I think cond_resched_lock and cond_resched_softirq also need fix
> > > to make the semantic consistent.
> > >
> > > Please check the following patch.
> > >
> >
> > Ah. I think the return value from these functions should mean "something
> > disruptive happened", if you like.
> >
> > See, the callers of cond_resched_lock() aren't interested in whether
> > cond_resched_lock() actually called schedule(). They want to know whether
> > cond_resched_lock() dropped the lock. Because if the lock was dropped, the
> > caller needs to take some special action, regardless of whether schedule()
> > was finally called.
> >
> > So I think the patch I queued is OK, agree?
>
> I am afraid the code like cond_resched_lock check in
> fs/jbd/checkpoint.c log_do_checkpoint may fall into endless retry in
> some condition, will it?

Oh crap, yes. If need_resched() and system_state==SYSTEM_BOOTING then
cond_resched_lock() will drop the lock but won't schedule. So it'll return
true but won't clear need_resched() and the caller will lock up.

So if cond_resched_foo() ends up dropping the lock it _must_ call
schedule() to clear need_resched().

So, how about this (it needs some code comments!)


diff -puN kernel/sched.c~cond_resched-fix kernel/sched.c
--- a/kernel/sched.c~cond_resched-fix
+++ a/kernel/sched.c
@@ -4386,6 +4386,15 @@ asmlinkage long sys_sched_yield(void)
return 0;
}

+static inline int __resched_legal(void)
+{
+ if (unlikely(preempt_count()))
+ return 0;
+ if (unlikely(system_state != SYSTEM_RUNNING))
+ return 0;
+ return 1;
+}
+
static inline void __cond_resched(void)
{
#ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
@@ -4396,10 +4405,6 @@ static inline void __cond_resched(void)
* PREEMPT_ACTIVE, which could trigger a second
* cond_resched() call.
*/
- if (unlikely(preempt_count()))
- return;
- if (unlikely(system_state != SYSTEM_RUNNING))
- return;
do {
add_preempt_count(PREEMPT_ACTIVE);
schedule();
@@ -4409,13 +4414,12 @@ static inline void __cond_resched(void)

int __sched cond_resched(void)
{
- if (need_resched()) {
+ if (need_resched() && __resched_legal()) {
__cond_resched();
return 1;
}
return 0;
}
-
EXPORT_SYMBOL(cond_resched);

/*
@@ -4436,7 +4440,7 @@ int cond_resched_lock(spinlock_t *lock)
ret = 1;
spin_lock(lock);
}
- if (need_resched()) {
+ if (need_resched() && __resched_legal()) {
_raw_spin_unlock(lock);
preempt_enable_no_resched();
__cond_resched();
@@ -4445,14 +4449,13 @@ int cond_resched_lock(spinlock_t *lock)
}
return ret;
}
-
EXPORT_SYMBOL(cond_resched_lock);

int __sched cond_resched_softirq(void)
{
BUG_ON(!in_softirq());

- if (need_resched()) {
+ if (need_resched() && __resched_legal()) {
__local_bh_enable();
__cond_resched();
local_bh_disable();
@@ -4460,10 +4463,8 @@ int __sched cond_resched_softirq(void)
}
return 0;
}
-
EXPORT_SYMBOL(cond_resched_softirq);

-
/**
* yield - yield the current processor to other threads.
*
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/