Of locks, spinlocks and printks in schedule()

Hugo Varotto (hvarotto@cs.pitt.edu)
Tue, 27 Jul 1999 12:31:01 -0400


Hello,

a couple of weeks ago I sent an e-mail to the list explaining that I
was trying to implement a resource allocation mechanism for
multiprocessors, and that I needed a way to find out which process is
current on a given CPU. Thanks to all who answered, in particular Rik
van Riel, who explained to me that the functionality is already in
2.2.10 ( I was using an older version, 2.2.1, so I wasn't aware of it ).

So, I started porting my work to 2.2.10 ( in preparation for the time
when I need to use the current process on a particular CPU ), but I
was constantly getting strange hangups ( after applying my
modifications, of course ). To trace the source of the error, I
started disabling my modifications one by one, and finally found some
strange behaviour. To follow what my modifications were doing, I had
modified the task struct and added a field ( is_rk ) so I could trace
particular tasks inside the schedule() routine and others ( if
is_rk == 1 then printk whatever ). Now, I don't know why, but it seems
that if I put a printk in the schedule() routine, between the pair of

spin_lock_irq(&runqueue_lock);
...
spin_unlock_irq(&runqueue_lock);

commands, the kernel crashes ( or actually, to be more precise, it
freezes ). The stranger thing is that sometimes it does it on the
first pass, and sometimes it needs a couple of passes through.
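To make it concrete, the change looks roughly like this ( a sketch
from memory; is_rk is the field I added, and the message text is just
an example ):

spin_lock_irq(&runqueue_lock);
...
/* my tracing hook: only fires for tasks I have flagged */
if (prev->is_rk == 1)
	printk("schedule: rk task pid %d\n", prev->pid);
...
spin_unlock_irq(&runqueue_lock);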

I checked the code of printk() and although it takes a couple of
spinlocks too, they are taken on a different variable ( &console_lock ).
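From my reading of kernel/printk.c, the locking there looks roughly
like this ( paraphrased from memory, so the details may be off ):

spin_lock_irqsave(&console_lock, flags);
/* ... format the message and copy it into the log buffer ... */
spin_unlock_irqrestore(&console_lock, flags);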
I had no problem doing this with 2.2.1, so it took me some time to
realize what was going on. Has anybody ever experienced this, or does
anyone have an idea why it could be happening ? I know that 2.2.10 has
"slab poisoning", but I don't think that should be affecting this.
Again, the only modifications I made in order to trace this were to
add a field to the task struct in sched.h, and to add a syscall to
modify its value.
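Concretely, something like this ( a sketch of my change; the field and
syscall names are mine, and I'm omitting the syscall table entry ):

/* include/linux/sched.h */
struct task_struct {
	...
	int is_rk;	/* 1 if this task should be traced in schedule() */
	...
};

/* new syscall: just a setter for the flag on the current task */
asmlinkage int sys_set_rk(int val)
{
	current->is_rk = val;
	return 0;
}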

Regarding the lock question, it seems that the schedule() routine
changed heavily from 2.2.1 to 2.2.10 ( yes, I should be following the
releases more closely, but my advisor was initially against the idea
of constantly porting modifications to successive kernels ). In
particular, the locking sequence has changed ( no more
scheduler_lock ), but there's something else that puzzles me, and that
is the use of the tasklist_lock variable. In the schedule() routine,
when the counters associated with the tasks need to be recalculated,
first the runqueue_lock is dropped, then the tasklist_lock is taken
( in read mode ) and later dropped, and then the runqueue_lock is
taken again.
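From my reading of kernel/sched.c in 2.2.10, the recalculation block
looks roughly like this ( quoted from memory, so take it as a
sketch ):

recalculate:
	{
		struct task_struct *p;
		spin_unlock_irq(&runqueue_lock);
		read_lock(&tasklist_lock);
		for_each_task(p)
			p->counter = (p->counter >> 1) + p->priority;
		read_unlock(&tasklist_lock);
		spin_lock_irq(&runqueue_lock);
	}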

I think that this is done to allow a sort of parallel execution inside
the schedule() routine ( while one CPU is recalculating counters,
another could be doing a task selection ). However, if I remember my
theory classes correctly, a lock is taken in "read" mode only if we
want to read the protected data, not modify it. The semantics are that
if a lock is held in read mode and another CPU wants to take it in
read mode too, it can do so; however, if a CPU wants to take it in
write mode while it is already held in read mode, it has to wait until
the lock is released ( and all the other reader-writer combinations
apply ). I checked the code, and it seems that the tasklist_lock is
only taken in write mode when a task is created or when it exits
( because we need to modify the task list structure ). Shouldn't it
also be taken in write mode in schedule() when it's updating the task
counters ? I see that by taking a read mode lock there's less overhead
inside the schedule() routine, which is desirable, but it seems not to
be completely correct.
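In other words, this is the usage I would have expected ( a sketch of
the rwlock semantics as I understand them, not the actual kernel
code ):

read_lock(&tasklist_lock);	/* several CPUs may hold this at once */
/* ... only read the tasks ... */
read_unlock(&tasklist_lock);

write_lock_irq(&tasklist_lock);	/* excludes readers and other writers */
/* ... modify the tasks, e.g. p->counter ... */
write_unlock_irq(&tasklist_lock);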

Again, as I said, there must be a reason for doing this, so please
correct me if I'm making a mistake in my analysis.

Uff, this is longer than I expected. Thanks for reading this mail, and
thanks in advance for your answers.

Hugo

--
Hugo Varotto
Computer Science Dept.
University of Pittsburgh
hvarotto@cs.pitt.edu
http://www.cs.pitt.edu/FORTS
