Re: question about preempt_disable()

From: George Anzinger
Date: Wed Dec 03 2003 - 18:16:10 EST

Next message: Daniel McNeil: "Re: [PATCH 2.6.0-test9-mm5] aio-dio-fallback-bio_count-race.patch"
Previous message: Chuck Lever: "[PATCH] 2.4 read ahead never reads the last page in a file"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Chris Peterson wrote:

I just bought Robert Love's new book "Linux Kernel Development". The book
has been very informative, but I have some unanswered questions about kernel
preemption.

From what I understand, SMP-safe code is also preempt-safe. The preempt
count is the number of spinlocks held by the current kernel thread. If the
preempt code is greater zero, then the kernel thread cannot be preempted.

My question is: if the code is already SMP-safe and holding the necessary
spinlocks, why is the preempt count necessary? Why must preemption be
disabled and re-enabled as spinlocks are acquired and released? Is it just
an optimization for accessing per-cpu data? Or is it necessary to prevent
priority inversion of kernel threads, when a low priority thread holds
spinlock X and is preempted by a high priority thread that hogs the CPU,
forever spinning in spin_lock(&X)?

There are a couple of things here. First, the preempt count includes other reasons not to preempt. One of these is used when working with per-cpu data and a cpu switch would NOT be good (a pointer would point to the wrong cpu's data). Also, at one time, the bh-lock was also cause to bump the preempt count. In 2.6, I think that is no longer true, BUT, the bh-count is in the same word. This is important as we want to test just one thing at interrupt return time to see if it is ok to preempt. (Well, we actually have to test the need_resched flag too...) From the interrupt codes point of view, the fact that the current task holds one or more spin locks is ONLY visible via the preempt count. Thus the count collects all these things in one, easy to check, place. This allows us to keep the interrupt path as fast as possible.

--
George Anzinger george@xxxxxxxxxx
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Daniel McNeil: "Re: [PATCH 2.6.0-test9-mm5] aio-dio-fallback-bio_count-race.patch"
Previous message: Chuck Lever: "[PATCH] 2.4 read ahead never reads the last page in a file"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]