Possible bug with the keyboard_tasklet?

From: Ryan Bradetich (rbradetich@uswest.net)
Date: Fri Nov 30 2001 - 01:12:39 EST


Hello linux hackers,

I am a relatively new kernel hacker working on the parisc-linux port,
and I think I found a BUG in the arch independant code involving the
keyboard_tasklet. The problem represented itself as a system hang
when I enabled CONFIG_SMP on my C200+. I am posting to this list, so
hopefully someone can verify the problem and help devise a solution to
fix the problem properly.

Problem Description:
------------------------------------------------------------------------
I tracking the problem down to the following infinate loop in
tasklet_action() from kernel/softirq.c.

while (list) {
        struct tasklet_struct *t = list;

        list = list->next;

        if (tasklet_trylock(t)) {
                if (!atomic_read(&t->count)) {
                        if (!test_and_clear_bit(TASKLET_STATE_SCHED,
&t->state))
                                BUG();
                        t->func(t->data);
                        tasklet_unlock(t);
                        continue;
                }
                tasklet_unlock(t);
        }

        local_irq_disable();
        t->next = tasklet_vec[cpu].list;
        tasklet_vec[cpu].list = t;
        __cpu_raise_softirq(cpu, TASKLET_SOFTIRQ);
        local_irq_enable();
}

I eventually figured out that the if(!atomic_read(&t->count)) was
failing... and the task would be added back into the list via the
following two lines of code:

        t->next = tasklet_vec[cpu].list;
        tasklet_vec[cpu].list = t;

This loop would never end since the atomic_read(&t->count) was always
non-zero, and the task was always being re-added to the list.

Debugging the loop further showed me the keyboard_tasklet was
the culprit task causing the infinate loop. I eventually figured out
that the keyboard_tasklet->count was being initialized to 1 by the
following macro from include/linux/interrupt.h:

        #define DECLARE_TASKLET_DISABLED(name, func, data) \
        struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func,
data }

I also figurd out that the keyboard_tasklet was being scheduled via the
schedule_tasklet() before the enable_tasklet() was called it. (The
enable_tasklet() provides a memory barrior, then calls atomic_dec()
on the ->count of the tasklet, making it 0).

The following traces shows the paths to the first call of
schedule_tasklet(), and the first call of enable_tasklet() for the
keyboard_tasklet.

schedule_tasklet(keyboard_tasklet)
-------------------------
1. start_kernel()
2. console_init()
3. con_init()
4. vc_init()
5. reset_terminal()
6. set_leds()
7. schedule_tasklet()

enable_tasklet(keyboard_tasklet)
--------------------------------
1. start_kernel()
2. rest_init()
3. init() via kernel_thread.
4. do_basic_setup()
5. do_init_calls()
6. chr_dev_init()
7. tty_init()
8. kbd_init()
9. enable_tasklet()

Looking in the start_kernel(), console_init() is the 9th function
called, where as rest_init() is the last function called.

Questions about the code:
------------------------------------------------------------------------
1. Why is the console code scheduling the keyboard_tasklet? This
doesn't make much sense to me.

2. Is the member variable count in the tasklet_struct
(include/linux/interrupt.h) only used for flagging the enabled/disabled
state of the tasklet? If so, is it possible to rename the member
variable to something more intuative like disabled?

3. What is the best way to fix this problem, assuming I actually did
find a bug (I know this fixes the problem I was seeing on my C200+).

Currently I have the following patch in my local tree for this problem:

--- drivers/char/console.c 2001/11/09 23:35:36 1.15
+++ drivers/char/console.c 2001/11/30 04:44:38
@@ -1420,7 +1420,10 @@ static void reset_terminal(int currcons,
        kbd_table[currcons].slockstate = 0;
        kbd_table[currcons].ledmode = LED_SHOW_FLAGS;
        kbd_table[currcons].ledflagstate =
kbd_table[currcons].default_ledflagstate;
- set_leds();
+
+ /* Only schedule the keyboard_tasklet if it is enabled. */
+ if (!atomic_read(&keyboard_tasklet.count))
+ set_leds();
 
        cursor_type = CUR_DEFAULT;
        complement_mask = s_complement_mask;

Thanks for your time,

- Ryan
parisc-linux newbie kernel hacker.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Nov 30 2001 - 21:00:36 EST