Re: [PATCH] 2.6.17-rt1 : fix x86_64 oops

From: Paul E. McKenney
Date: Thu Jul 06 2006 - 16:04:24 EST


On Tue, Jul 04, 2006 at 08:50:24AM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@xxxxxxx> wrote:
>
> > > Ingo, do you have a suspect ?
> >
> > I suspect it's the patch below. That patch (from John) relaxes the
> > affinities of IRQ threads: if there are /proc/irq/*/smp_affinity
> > entries that have multiple bits set an IRQ thread is allowed to jump
> > from one CPU to another while it is executing a IRQ-handler. It
> > _should_ be fine but i'd not be surprised if that caused breakage ...
>
> the patch below is against 2.6.17-rt5, does this solve the crashes?

And I also still get a segmentation fault when modprobing rcutorture
with this patch. :-/

The segfault is bizarre -- it is in the module symbol-lookup code. I get
the segfault regardless of what module parameters I specify, in fact,
it even shows up even if I comment out all the module parameters in
rcutorture.c. Tiny modules, such as hello.c from "Linux Device Drivers",
work just fine -- but the size of rcutorture.ko is -way- below the
64k limit, and, as I mentioned, I get the oops even if I comment out
all of rcutorture's module parameters.

See oops below.

Any enlightenment available?

Thanx, Paul

BUG: unable to handle kernel paging request at virtual address 75010000
printing eip:
c0133941
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c0133941>] Not tainted VLI
EFLAGS: 00010297 (2.6.17-rt5-autokern1 #1)
EIP is at lookup_symbol+0x16/0x3b
eax: ffffffff ebx: c0345a5c ecx: c0343e90 edx: c0343e98
esi: 75010000 edi: f8ded2a6 ebp: f1a61edc esp: f1a61e98
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process modprobe (pid: 3663, threadinfo=f1a60000 task=f12160d0 stack_left=7780 worst_left=-1)
Stack: f8df02d0 f8df1380 f8ded2a6 c013398c f8ded2a6 c03419f0 c0345a5c f8df02d0
f8df1380 0000008e 00000062 c01345b5 f8ded2a6 f1a61ed8 f1a61edc 00000001
00000000 f8dc2e31 f8df02d0 00000620 c0134b49 f8de2214 00000000 f8ded2a6
Call Trace:
[<c013398c>] __find_symbol+0x26/0x158 (16)
[<c01345b5>] resolve_symbol+0x21/0x46 (32)
[<c0134b49>] simplify_symbols+0x88/0x101 (36)
[<c0135707>] load_module+0x5f0/0x913 (40)
[<c0135a8e>] sys_init_module+0x41/0x1c5 (144)
[<c0102a03>] syscall_call+0x7/0xb (24)
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c02f17f7>] .... _raw_spin_lock_irqsave+0xe/0x35
.....[<00000000>] .. ( <= _stext+0x3feffd64/0x41)

Code: 43 83 c1 28 0f b7 42 30 39 c3 72 c7 31 d2 5b 89 d0 5e 5f 5d c3 57 56 53 8b 5c 24 18 8b 54 24 14 39 da 73 24 8b 72 04 8b 7c 24 10 <ac> ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 89 d1 74
EIP: [<c0133941>] lookup_symbol+0x16/0x3b SS:ESP 0068:f1a61e98

> Ingo
>
> Index: linux-rt.q/kernel/irq/manage.c
> ===================================================================
> --- linux-rt.q.orig/kernel/irq/manage.c
> +++ linux-rt.q/kernel/irq/manage.c
> @@ -645,17 +645,24 @@ extern asmlinkage void __do_softirq(void
>
> static int curr_irq_prio = 49;
>
> -static int do_irqd(void * __desc)
> +static void follow_irq_affinity(struct irq_desc *desc)
> {
> - struct sched_param param = { 0, };
> - struct irq_desc *desc = __desc;
> #ifdef CONFIG_SMP
> - int irq = desc - irq_desc;
> cpumask_t mask;
>
> - mask = cpumask_of_cpu(any_online_cpu(irq_desc[irq].affinity));
> + if (cpus_equal(current->cpus_allowed, desc->affinity))
> + return;
> + mask = cpumask_of_cpu(any_online_cpu(desc->affinity));
> set_cpus_allowed(current, mask);
> #endif
> +}
> +
> +static int do_irqd(void * __desc)
> +{
> + struct sched_param param = { 0, };
> + struct irq_desc *desc = __desc;
> +
> + follow_irq_affinity(desc);
> current->flags |= PF_NOFREEZE | PF_HARDIRQ;
>
> /*
> @@ -674,13 +681,7 @@ static int do_irqd(void * __desc)
> local_irq_disable();
> __do_softirq();
> local_irq_enable();
> -#ifdef CONFIG_SMP
> - /*
> - * Did IRQ affinities change?
> - */
> - if (!cpus_equal(current->cpus_allowed, irq_desc[irq].affinity))
> - set_cpus_allowed(current, irq_desc[irq].affinity);
> -#endif
> + follow_irq_affinity(desc);
> schedule();
> }
> __set_current_state(TASK_RUNNING);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/