Re: 2.6.26.3-rt3

From: John Kacur
Date: Fri Aug 22 2008 - 19:40:08 EST


On Fri, Aug 22, 2008 at 11:39 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> We are pleased to announce the 2.6.26.3-rt3 tree, which can be
> downloaded from the location:
>
> http://rt.et.redhat.com/download/
>
> Information on the RT patch can be found at:
>
> http://rt.wiki.kernel.org/index.php/Main_Page
>
> Changes since 2.6.26-rt2
>
> - patch merge fix (Steven Rostedt)
>
> - fix net core sock locking (Chirag Jog)

Actually, that fix is from Peter Zijlstra. (Chirag was just first in the email thread.)

>
> - namespace lock fixes (Chirag Jog)
>
> - hrtimers stuck in waitqueue fix (Thomas Gleixner)
>
>
> to build a 2.6.26.3-rt3 tree, the following patches should be applied:
>
> http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.26.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/patch-2.6.26.3.bz2
> http://rt.et.redhat.com/download/patch-2.6.26.3-rt3.bz2
>
>
>
> And like always, my RT version of Matt Mackall's ketchup will get this
> for you nicely:
>
> http://people.redhat.com/srostedt/rt/tools/ketchup-0.9.8-rt3
>
>
> The broken out patches are also available.
>

One more patch was missed. It was discussed here:
http://marc.info/?l=linux-rt-users&m=121846031913931&w=2

I am resending it; please consider it for -rt4.
Without it I continue to get messages like the following.

BUG: using smp_processor_id() in preemptible [00000000] code: firefox-bin/3912
caller is __qdisc_run+0x160/0x1e9
Pid: 3912, comm: firefox-bin Tainted: G W 2.6.26.3-rt2 #6

Call Trace:
[<ffffffff8033cc96>] debug_smp_processor_id+0xde/0xec
[<ffffffff803f9a87>] __qdisc_run+0x160/0x1e9
[<ffffffff803e8777>] dev_queue_xmit+0x1b3/0x2ee
[<ffffffff8040df1e>] ip_finish_output+0x29b/0x2e4
[<ffffffff8040e04a>] ip_output+0xe3/0xec
[<ffffffff8040cf9c>] ip_local_out+0x25/0x29
[<ffffffff8040d80e>] ip_queue_xmit+0x2ce/0x35e
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8027f844>] ? trace_preempt_on+0x1f/0xf9
[<ffffffff8041e9de>] ? tcp_transmit_skb+0x72a/0x78f
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8041ea04>] tcp_transmit_skb+0x750/0x78f
[<ffffffff802ac050>] ? kmem_cache_alloc_node+0x11e/0x145
[<ffffffff804215a6>] __tcp_push_pending_frames+0x74a/0x860
[<ffffffff803e2c55>] ? __alloc_skb+0x70/0x136
[<ffffffff804158f1>] tcp_sendmsg+0x941/0xa5f
[<ffffffff802c01f3>] ? __pollwait+0x0/0xe5
[<ffffffff803dbd3e>] sock_sendmsg+0x102/0x125
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff80281a97>] ? tracing_hist_preempt_stop+0x2cb/0x2f5
[<ffffffff80272624>] ? __rcu_read_unlock+0x93/0xa7
[<ffffffff802b250e>] ? fget_light+0x97/0xad
[<ffffffff803dc8ee>] sys_sendto+0xe4/0x10c
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
[<ffffffff803dc92a>] sys_send+0x14/0x16
[<ffffffff803f7dbb>] compat_sys_socketcall+0xd2/0x16c
[<ffffffff80224a17>] sysenter_do_call+0x8c/0x149
[<ffffffff8045f4ec>] ? trace_hardirqs_on_thunk+0x3a/0x3c

---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8033cc43>] .... debug_smp_processor_id+0x8b/0xec
.....[<ffffffff803f9a87>] .. ( <= __qdisc_run+0x160/0x1e9)

BUG: firefox-bin:3912 task might have lost a preemption check!
Pid: 3912, comm: firefox-bin Tainted: G W 2.6.26.3-rt2 #6

Call Trace:
[<ffffffff80462e79>] ? sub_preempt_count+0xd1/0xe6
[<ffffffff80233bb1>] preempt_enable_no_resched+0x5c/0x5e
[<ffffffff8033cc9b>] debug_smp_processor_id+0xe3/0xec
[<ffffffff803f9a87>] __qdisc_run+0x160/0x1e9
[<ffffffff803e8777>] dev_queue_xmit+0x1b3/0x2ee
[<ffffffff8040df1e>] ip_finish_output+0x29b/0x2e4
[<ffffffff8040e04a>] ip_output+0xe3/0xec
[<ffffffff8040cf9c>] ip_local_out+0x25/0x29
[<ffffffff8040d80e>] ip_queue_xmit+0x2ce/0x35e
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8027f844>] ? trace_preempt_on+0x1f/0xf9
[<ffffffff8041e9de>] ? tcp_transmit_skb+0x72a/0x78f
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8041ea04>] tcp_transmit_skb+0x750/0x78f
[<ffffffff802ac050>] ? kmem_cache_alloc_node+0x11e/0x145
[<ffffffff804215a6>] __tcp_push_pending_frames+0x74a/0x860
[<ffffffff803e2c55>] ? __alloc_skb+0x70/0x136
[<ffffffff804158f1>] tcp_sendmsg+0x941/0xa5f
[<ffffffff802c01f3>] ? __pollwait+0x0/0xe5
[<ffffffff803dbd3e>] sock_sendmsg+0x102/0x125
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff80281a97>] ? tracing_hist_preempt_stop+0x2cb/0x2f5
[<ffffffff80272624>] ? __rcu_read_unlock+0x93/0xa7
[<ffffffff802b250e>] ? fget_light+0x97/0xad
[<ffffffff803dc8ee>] sys_sendto+0xe4/0x10c
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
[<ffffffff803dc92a>] sys_send+0x14/0x16
[<ffffffff803f7dbb>] compat_sys_socketcall+0xd2/0x16c
[<ffffffff80224a17>] sysenter_do_call+0x8c/0x149
[<ffffffff8045f4ec>] ? trace_hardirqs_on_thunk+0x3a/0x3c

---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------


Thank you.

Subject: fix for BUG: using smp_processor_id() in preemptible code

Fixes the use of smp_processor_id() in preemptible code, seen when
__qdisc_run calls qdisc_restart, which calls handle_dev_cpu_collision.

This is fixed by disabling irqs (and thereby preemption) around the
cpu_collision++ increment in handle_dev_cpu_collision.

Signed-off-by: John Kacur <jkacur at gmail dot com>

Index: linux-2.6.26.1-rt1.jk/net/sched/sch_generic.c
===================================================================
--- linux-2.6.26.1-rt1.jk.orig/net/sched/sch_generic.c
+++ linux-2.6.26.1-rt1.jk/net/sched/sch_generic.c
@@ -94,6 +94,7 @@ static inline int handle_dev_cpu_collisi
 					   struct Qdisc *q)
 {
 	int ret;
+	unsigned long flags;

 	if (unlikely(dev->xmit_lock_owner == (void *)current)) {
 		/*
@@ -112,7 +113,9 @@ static inline int handle_dev_cpu_collisi
 		 * Another cpu is holding lock, requeue & delay xmits for
 		 * some time.
 		 */
+		local_irq_save(flags);
 		__get_cpu_var(netdev_rx_stat).cpu_collision++;
+		local_irq_restore(flags);
 		ret = dev_requeue_skb(skb, dev, q);
 	}
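
Not part of the patch: a minimal userspace sketch of why an unprotected
per-CPU increment is a problem in preemptible code. Everything here is an
assumption made for illustration. The array and the function names are
hypothetical stand-ins (only the cpu_collision name is borrowed from the
patch), and the preemption is simulated inline rather than produced by a
real scheduler. It models the increment as a separate load and store and
shows that an update made in between is lost.

```c
#include <assert.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the per-CPU netdev_rx_stat.cpu_collision. */
unsigned long cpu_collision[NR_CPUS];

void reset_counters(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_collision[cpu] = 0;
}

unsigned long total_collisions(void)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += cpu_collision[cpu];
	return sum;
}

/*
 * Unprotected increment: the task loads the counter of old_cpu, is
 * (notionally) preempted, another task increments the same counter,
 * and the stale value is then written back over that update.
 */
void unprotected_increment(int old_cpu)
{
	unsigned long *slot;
	unsigned long stale;

	assert(old_cpu >= 0 && old_cpu < NR_CPUS);
	slot = &cpu_collision[old_cpu];	/* plays the role of __get_cpu_var() */
	stale = *slot;			/* load */

	cpu_collision[old_cpu]++;	/* simulated: other task runs here */

	*slot = stale + 1;		/* store clobbers the other update */
}

/*
 * Protected increment: with the window between load and store closed,
 * the other task's increment can only happen before or after ours.
 */
void protected_increment(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_collision[cpu]++;		/* our increment, no gap */
	cpu_collision[cpu]++;		/* simulated: other task, afterwards */
}
```

After reset_counters(), a call to unprotected_increment(0) leaves
total_collisions() at 1 although two increments ran, while
protected_increment(0) leaves it at 2.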