Re: problem with 114 sched.* changes (not the gcc one)

Linus Torvalds (torvalds@transmeta.com)
Fri, 21 Aug 1998 18:55:17 -0700 (PDT)


On Fri, 21 Aug 1998, Ion Badulescu wrote:
>
> 1. if an rpciod is already running at the time amd is started (i.e. after
> doing a remote-server nfs mount by hand), amd will _not_ hang, in fact it
> will run quite happily.
>
> 2. with the restoral of wchan in 2.1.117 I was able to trace the hang a
> little further. The mount hangs in rpciod_up (net/sunrpc/sched.c) in
> sleep_on(&rpciod_idle) whereas the rpciod itself is sleeping in
> interruptible_sleep_on(&rpciod_idle).
>
> My humble opinion is that, with the scheduling change in 2.1.114, rpciod
> will start running _before_ rpciod_up calls sleep_on() and therefore the
> wake_up() call at the beginning of rpciod becomes ineffective because
> nothing is sleeping on rpciod_idle yet. This is pure speculation, but it
> kind of makes sense. I'm not sure what the right fix is though...

This makes 100% sense, and explains why it was so timing-dependent and
dependent on a scheduler change that should have made no difference at
all.

How does this patch work for you?

Linus

-----
--- v2.1.117/linux/net/sunrpc/sched.c Thu Aug 20 17:05:19 1998
+++ linux/net/sunrpc/sched.c Fri Aug 21 18:52:18 1998
@@ -770,6 +763,8 @@
 	rpc_inhibit--;
 }

+static struct semaphore rpciod_running = MUTEX_LOCKED;
+
 /*
  * This is the rpciod kernel thread
  */
@@ -786,11 +781,16 @@
 	 * Let our maker know we're running ...
 	 */
 	rpciod_pid = current->pid;
-	wake_up(&rpciod_idle);
+	up(&rpciod_running);

 	exit_files(current);
 	exit_mm(current);
+
+	spin_lock_irq(&current->sigmask_lock);
 	siginitsetinv(&current->blocked, sigmask(SIGKILL));
+	recalc_sigpending(current);
+	spin_unlock_irq(&current->sigmask_lock);
+
 	current->session = 1;
 	current->pgrp = 1;
 	sprintf(current->comm, "rpciod");
@@ -887,7 +882,7 @@
 		printk("rpciod_up: create thread failed, error=%d\n", error);
 		goto out;
 	}
-	sleep_on(&rpciod_idle);
+	down(&rpciod_running);
 	error = 0;
 out:
 	up(&rpciod_sema);
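
For illustration only, here is a rough user-space analogue of why the semaphore closes the race: an up() that happens before the matching down() is remembered in the count, whereas a wake_up() on an empty wait queue is simply lost. This is a sketch using POSIX threads and semaphores, not kernel code. The names worker, worker_running, worker_started and start_worker_and_wait are hypothetical stand-ins for rpciod, rpciod_running and rpciod_up, and the pthread_join() is there only to force the worst-case ordering in which the child has finished announcing itself before the parent starts waiting.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical stand-in for rpciod_running: count starts at 0 ("locked"). */
static sem_t worker_running;
static int worker_started;

/* Stand-in for the rpciod thread: announce that we are running. */
static void *worker(void *arg)
{
	(void)arg;
	worker_started = 1;
	sem_post(&worker_running);	/* like up(): the count records the post */
	return NULL;
}

/*
 * Stand-in for rpciod_up(): start the worker, then wait for its
 * announcement. Returns 0 on success, -1 on failure.
 */
int start_worker_and_wait(void)
{
	pthread_t tid;

	if (sem_init(&worker_running, 0, 0) != 0)
		return -1;
	worker_started = 0;

	if (pthread_create(&tid, NULL, worker, NULL) != 0) {
		sem_destroy(&worker_running);
		return -1;
	}

	/* Force the bad ordering: the worker posts before we ever wait. */
	pthread_join(tid, NULL);

	/* like down(): returns at once because the count is already 1. */
	while (sem_wait(&worker_running) != 0) {
		if (errno != EINTR) {
			sem_destroy(&worker_running);
			return -1;
		}
	}

	sem_destroy(&worker_running);
	return worker_started ? 0 : -1;
}
```

Calling start_worker_and_wait() should return 0 even though the post always precedes the wait here. With a bare wait queue and the same ordering, the waiter would go to sleep after the only wakeup had already been delivered, which is the hang described above.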
