RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks

From: Perez-Gonzalez, Inaky (inaky.perez-gonzalez@intel.com)
Date: Thu Jun 19 2003 - 01:06:25 EST


Hi All

I have another test case that is showing this behavior.
It is very similar to George's, however, it is a simplification
of another one using threads that a co-worker, Adam Li found
a few days ago.

Parent (FIFO 5) forks child that sets itself to FIFO 4 and
busy loops, then it sleeps five seconds and kills the child.

Doing SysRq + T after a while shows the parent'd call trace
to be at sys_rt_sigaction+0xd1, that is just inside the final
copy_to_user() in signal.c:sys_rt_sigaction().

Reprioritizing events/0 to FIFO 5+ fixes the inversion.

If I call nanosleep directly (with system() instead of
glibc's sleep(), so I avoid all the rt_sig calls),
I get the parent process always stuck in work_resched+0x5,
in entry.S:work_resched, just after the call to the
scheduler - however, I cannot trace what is happening
inside the scheduler.

My point here is: I am trying to trace where this program
is making use of workqueues inside of the kernel, and I
can find none. The only place where I need to look some
more is inside the timer code, but in a quick glance,
it seems it is not being used, so why is it affected by
the reprioritization of the events/0 thread? George, can
you help me here?

kernel is 2.5.67, SMP and PREEMPT with maxcpus=1; tomorrow
I will try .72 ...

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jun 23 2003 - 22:00:28 EST