[PATCH] Avoid calling down_read and down_write during startup

From: Alan Stern
Date: Thu Feb 23 2006 - 17:34:54 EST


This patch (as660) changes the registration and unregistration routines
for blocking notifier chains. During system startup, when task switching
is illegal, the routines will avoid calling down_write().

Signed-off-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>

---

On Thu, 23 Feb 2006, Andrew Morton wrote:

> Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > The calls to register_cpu_notifier are harder. That chain really does
> > need to be blocking
>
> Why?

Because its callouts invoke blocking functions. For example,
drivers/base/topology.c:topology_cpu_callback() calls
sysfs_create_group(). In drivers/cpufreq/cpufreq.c,
cpufreq_cpu_callback() calls cpufreq_add_dev(), which does kzalloc with
GFP_KERNEL.

> > which means we can't avoid calling down_write. The
> > only solution I can think of is to use down_write_trylock in the
> > blocking_notifier_chain_register and unregister routines, even though
> > doing that is a crock.
> >
> > Or else change __down_read and __down_write to use spin_lock_irqsave
> > instead of spin_lock_irq. What do you think would be best?
>
> Nothing's pretty. Perhaps look at system_state and not do any locking at all
> in early boot?

Okay. Here's a patch to do that. It only affects the registration
routines; the actual call-out function still acquires the lock. That's
because (1) I didn't want to add extra overhead to a frequently-used
routine, and (2) if this notifier chain gets called while interrupts are
disabled then there's something badly wrong.

Combined with the previous patch, maybe you'll find that everything works
perfectly now... :-)

Please revert those two debugging patches (the one I sent you and the one
you wrote) before applying this.

Alan Stern

Index: usb-2.6/kernel/sys.c
===================================================================
--- usb-2.6.orig/kernel/sys.c
+++ usb-2.6/kernel/sys.c
@@ -249,6 +249,14 @@ int blocking_notifier_chain_register(str
{
int ret;

+ /*
+ * This code gets used during boot-up, when task switching is
+ * not yet working and interrupts must remain disabled. At
+ * such times we must not call down_write().
+ */
+ if (unlikely(system_state == SYSTEM_BOOTING))
+ return notifier_chain_register(&nh->head, n);
+
down_write(&nh->rwsem);
ret = notifier_chain_register(&nh->head, n);
up_write(&nh->rwsem);
@@ -272,6 +280,14 @@ int blocking_notifier_chain_unregister(s
{
int ret;

+ /*
+ * This code gets used during boot-up, when task switching is
+ * not yet working and interrupts must remain disabled. At
+ * such times we must not call down_write().
+ */
+ if (unlikely(system_state == SYSTEM_BOOTING))
+ return notifier_chain_unregister(&nh->head, n);
+
down_write(&nh->rwsem);
ret = notifier_chain_unregister(&nh->head, n);
up_write(&nh->rwsem);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/