Re: [uClinux-dev] Re: dm9000 patch

From: Greg Ungerer
Date: Wed Dec 29 2010 - 19:49:47 EST


Hi Angelo,

On 30/12/10 06:57, Angelo Dureghello wrote:
Hi all,
thanks for the help,
the kernel is a main line kernel. Then yes, i am still using uclinux
tree for libc/tools.

How is the DM9000 hardware connected to the 5307?
I am wondering how you connected the interrupt (and to
which interrupt) and the addressing (direct of a chip select)?

(For example NETtel based 5307 platform support of the SMC91x code is
in mainline as arch/m68knommu/platform/5307/nettel.c). Can you show
the code you used to setup your dm9000 hardware?
(Specifically I guess I want to know if you use the "auto-vectored"
interrupt mode?)

Thanks
Greg


I collected another spinlock recursion with a slightly different call
stack trace, as always, the spinlock recursion issue happen on a high
tx/rx traffic of the dm9000e, in this case just asking an index.html
with some images and texts:

[ 1108.930000] BUG: spinlock recursion on CPU#0, httpd/29
[ 1108.930000] lock: 00c42c06, .magic: dead4ead, .owner: httpd/29,
.owner_cpu: 0
[ 1108.930000] Stack from 00d7a688:
[ 1108.930000] 00d7a6b4 000ad988 001840ca 00c42c06 dead4ead 00d641d4
0000001d 00000000
[ 1108.930000] 00c42c06 000064f0 00c42800 00d7a6e8 000adb5a 00c42c06
00184130 00002704
[ 1108.930000] 00000000 0000001f 0014d17e 00159912 00c42b60 000064f0
00c42800 0002cb16
[ 1108.930000] 00d7a6f8 0014d24e 00c42c06 00000000 00d7a738 000e485c
00c42c06 00000000
[ 1108.930000] 00000000 0000001f 0014d17e 00159912 0000004a 00cfc600
000064f0 00009a74
[ 1108.930000] 0002cb16 00191204 00d7a760 0002b6f2 00d7a760 0002b514
0000001f 00c42800
[ 1108.930000] Call Trace:
[ 1108.930000] [000ad988] spin_bug+0x86/0x11a
[ 1108.930000] [000adb5a] do_raw_spin_lock+0x58/0x120
[ 1108.930000] [0014d24e] _raw_spin_lock_irqsave+0x28/0x32
[ 1108.930000] [000e485c] dm9000_interrupt+0x1a/0x2e0
[ 1108.930000] [0002b514] handle_IRQ_event+0x2a/0xec
[ 1108.930000] [0002b680] __do_IRQ+0xaa/0x128
[ 1108.930000] [00000bb6] do_IRQ+0x48/0x62
[ 1108.930000] [000033c6] inthandler+0x6a/0x74
[ 1108.930000] [000fb626] dev_hard_start_xmit+0x170/0x4c4
[ 1108.930000] [0010b80e] sch_direct_xmit+0xc0/0x1bc
[ 1108.930000] [000fe9de] dev_queue_xmit+0x160/0x3e6
[ 1108.930000] [001195c4] ip_finish_output+0xec/0x320
[ 1108.930000] [0011a768] ip_output+0x9e/0xa8
[ 1108.930000] [00119856] ip_local_out+0x26/0x30
[ 1108.930000] [0011a56e] ip_build_and_send_pkt+0x16e/0x178
[ 1108.930000] [0012fc96] tcp_v4_send_synack+0x52/0x90
[ 1108.930000] [00130f86] tcp_v4_conn_request+0x3fa/0x57c
[ 1108.930000] [0012a1c6] tcp_rcv_state_process+0x25e/0xa66
[ 1108.930000] [001309a4] tcp_v4_do_rcv+0x7c/0x1c8
[ 1108.930000] [00132854] tcp_v4_rcv+0x546/0x6d2
[ 1108.930000] [001153a8] ip_local_deliver+0x9c/0x1b0
[ 1108.930000] [001158e8] ip_rcv+0x42c/0x5f0
[ 1108.930000] [000fa74e] __netif_receive_skb+0x196/0x2ec
[ 1108.930000] [000fe142] process_backlog+0x72/0x11e
[ 1108.930000] [000fe290] net_rx_action+0xa2/0x150
[ 1108.930000] [0000e13c] __do_softirq+0x74/0xe4
[ 1108.930000] [0000e1e2] do_softirq+0x36/0x40
[ 1108.930000] [0000e6c6] local_bh_enable+0x7a/0xa4
[ 1108.930000] [000fe972] dev_queue_xmit+0xf4/0x3e6
[ 1108.930000] [001195c4] ip_finish_output+0xec/0x320
[ 1108.930000] [0011a768] ip_output+0x9e/0xa8
[ 1108.930000] [00119856] ip_local_out+0x26/0x30
[ 1108.930000] [0011a90a] ip_queue_xmit+0x198/0x426
[ 1108.930000] [0012bcc8] tcp_transmit_skb+0x3f0/0x76c
[ 1108.930000] [0012cfda] tcp_write_xmit+0x178/0x868
[ 1108.930000] [0012d6f8] __tcp_push_pending_frames+0x2e/0x9a
[ 1108.930000] [001222be] tcp_sendmsg+0x82e/0x98c
[ 1108.930000] [0013d9c0] inet_sendmsg+0x32/0x54
[ 1108.930000] [000ec25e] sock_aio_write+0xc8/0x138
[ 1108.930000] [00043e7e] do_sync_write+0x9e/0xfe
[ 1108.930000] [00043f56] vfs_write+0x78/0x84
[ 1108.930000] [0004446c] sys_write+0x40/0x7a
[ 1108.930000] [00003244] system_call+0x84/0xc2
[ 1108.930000]

seems like while i transmit a packet, dm9000_interrupt try to acquire
the spinlock owned from the same task.

Compiling the kernel i am getting:
CC kernel/irq/handle.o
kernel/irq/handle.c:432:3: warning: #warning __do_IRQ is deprecated.
Please convert to proper flow handlers

Could the usage of __do_IRQ super-handler be a cause of the issue ?


many thanks,
angelo

On 29/12/2010 19:45, Geert Uytterhoeven wrote:
On Wed, Dec 29, 2010 at 19:06, Baruch Siach<baruch@xxxxxxxxxx> wrote:
Hi Angelo,

On Wed, Dec 29, 2010 at 02:13:22PM +0100, Angelo Dureghello wrote:
just FYI, i tested kernel 2.6.36.2, unfortunately the issue is still
there, below the call stack trace.
Help from the m68k experts seems to be needed. Adding the relevant
list to Cc.
This is uClinux? Added Cc...

[ 4.620000] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 39.390000] BUG: spinlock recursion on CPU#0, httpd/29
[ 39.390000] lock: 00189c44, .magic: dead4ead, .owner: httpd/29,
.owner_cpu: 0
[ 39.390000] Stack from 00d6a990:
[ 39.390000] 00d6a9bc 000a9710 0017cac7 00189c44 dead4ead
00de48f4 0000001d 00000000
[ 39.390000] 00189c44 0002a646 00145f70 00d6a9f0 000a98e2
00189c44 0017cb2d 00189c44
[ 39.390000] 00d6aad8 0000001f 00145f5c 001523f6 00189c08
0002a646 00145f70 0002bc52
[ 39.390000] 00d6a9fc 00145f7e 00189c44 00d6aa28 0002a75e
00189c44 0000001f 00d6aad8
[ 39.390000] 0000001f 00145f5c 00189c08 0002a646 00145f70
0002bc52 00d6aa3c 00000bb6
[ 39.390000] 0000001f 00189c44 00cfc780 00d6aa84 0000337a
0000001f 00d6aa4c 00000001
[ 39.390000] Call Trace:
[ 39.390000] [000a9710] spin_bug+0x86/0x11a
[ 39.390000] [000a98e2] do_raw_spin_lock+0x58/0x120
[ 39.390000] [00145f7e] _raw_spin_lock+0xe/0x14
[ 39.390000] [0002a75e] __do_IRQ+0x2c/0x108
[ 39.390000] [00000bb6] do_IRQ+0x48/0x62
[ 39.390000] [0000337a] inthandler+0x6a/0x74
[ 39.390000] [0002a82e] __do_IRQ+0xfc/0x108
[ 39.390000] [00000bb6] do_IRQ+0x48/0x62
[ 39.390000] [0000337a] inthandler+0x6a/0x74
[ 39.390000] [000ef0ce] skb_release_all+0x10/0x20
[ 39.390000] [000ee6bc] __kfree_skb+0x10/0x92
[ 39.390000] [000ee75e] consume_skb+0x20/0x34
[ 39.390000] [000e004e] dm9000_start_xmit+0xdc/0xec
[ 39.390000] [000f67a2] dev_hard_start_xmit+0x146/0x472
[ 39.390000] [00106506] sch_direct_xmit+0xc0/0x1bc
[ 39.390000] [000f9914] dev_queue_xmit+0x160/0x3e4
[ 39.390000] [00113b3e] ip_finish_output+0xee/0x318
[ 39.390000] [001142b4] ip_output+0x7c/0x88
[ 39.390000] [00113dc6] ip_local_out+0x26/0x30
[ 39.390000] [00114d9a] ip_queue_xmit+0x152/0x374
[ 39.390000] [00125c8c] tcp_transmit_skb+0x3f0/0x732
[ 39.390000] [00126f26] tcp_write_xmit+0x178/0x868
[ 39.390000] [00127644] __tcp_push_pending_frames+0x2e/0x9a
[ 39.390000] [0011c3d6] tcp_sendmsg+0x82e/0x98c
[ 39.390000] [00137544] inet_sendmsg+0x32/0x54
[ 39.390000] [000e79a6] sock_aio_write+0xc8/0x138
[ 39.390000] [00042590] do_sync_write+0x9e/0xfe
[ 39.390000] [00042668] vfs_write+0x78/0x84
[ 39.390000] [00042a92] sys_write+0x40/0x7a
[ 39.390000] [00003224] system_call+0x84/0xc2
[ 39.390000]

dm9000e is as default not visible/selectable in menuconfig for
Coldfire architectures, so this probably cannot be considered as a
kernel bug.

I going forward in investigations, every help is appreciated,

regards,
angelo



On 29/12/2010 07:06, Baruch Siach wrote:
Hi Angelo,

On Tue, Dec 28, 2010 at 10:52:42PM +0100, Angelo Dureghello wrote:
sorry to contact you directly but i couldn't get any help from the
kernel.org mailing list, since i am not a developer my mails are
generally skipped.
The best way to get the contact info for a piece of kernel code, is
using the
get_maintainer.pl script. Running 'scripts/get_maintainer.pl -f
drivers/net/dm9000.c' gives the following output:

netdev@xxxxxxxxxxxxxxx
linux-kernel@xxxxxxxxxxxxxxx

I added both to Cc.

I am very near to have a custom board working with MCF5307 cpu and
dm9000.
I am using kernel 2.6.36-rc3 with your last patch about
spinlock-recursion already included.
You should try to update to the latest .36 kernel, which is currently
2.6.36.2. The problem that you experience might be unrelated to the
dm9000
driver (or to networking at all), so it might have been fixed in
this version.

I have "ping" and "telnet" to the embedded board fully working.
If i try to get a sample web page with some images from the board
httpd with a browser, in 80% of cases i get a trap/oops:
Try to enable KALLSYMS in your kernel .config to make your stack
trace more
meaningful. This is under 'General setup -> Configure standard
kernel features
(for small systems) -> Load all symbols for debugging/ksymoops'.

I hope this helps.

baruch

[ 4.590000] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 67.630000] BUG: spinlock recursion on CPU#0, httpd/29
[ 67.630000] lock: 00c42c06, .magic: dead4ead, .owner: httpd/29,
.owner_cpu: 0
[ 67.630000] Stack from 00d7b914:
[ 67.630000] 00d7b940 000a8cf0 0015f693 00c42c06 dead4ead
00dec1d4 0000001d 00000000
[ 67.630000] 00c42c06 00006188 00c42800 00d7b974 000a8ec2
00c42c06 0015f6f9 00002704
[ 67.630000] 00000000 0000001f 00146fa4 00152f0c 00c42b60
00006188 00c42800 0002b312
[ 67.630000] 00d7b984 0014701e 00c42c06 00000000 00d7b9c4
000df21c 00c42c06 00000000
[ 67.630000] 00000000 0000001f 00146fa4 00152f0c 000005ea
00cfc640 00006188 000096e8
[ 67.630000] 0002b312 00146fa4 00c42b60 00002704 00d7b9ec
00029d3a 0000001f 00c42800
[ 67.630000] Call Trace:
[ 67.630000] [000a8cf0] [000a8ec2] [0014701e] [000df21c] [00029d3a]
[ 67.630000] [00029e84] [00000bb6] [0000336e] [000df162] [000effd6]
[ 67.630000] [00100482] [000f312e] [000f9ebc] [0010dd2a] [0010e4a0]
[ 67.630000] [0010dfb2] [0010ef80] [0011fed6] [00121170] [0012188e]
[ 67.630000] [0011ecc6] [001249fe] [000e4084] [0011621c] [00131a44]
[ 67.630000] [000e11ee] [00041944] [00041a1c] [00041e46] [00003218]
[ 67.630000] BUG: spinlock lockup on CPU#0, httpd/29, 00c42c06
[ 67.630000] Stack from 00d7b934:
[ 67.630000] 00d7b974 000a8f66 0015f703 00000000 00dec1d4
0000001d 00c42c06 00002704
[ 67.630000] 00000000 0000001f 00146fa4 00152f0c 00c42b60
00006188 00c42800 0002b312
[ 67.630000] 00d7b984 0014701e 00c42c06 00000000 00d7b9c4
000df21c 00c42c06 00000000
[ 67.630000] 00000000 0000001f 00146fa4 00152f0c 000005ea
00cfc640 00006188 000096e8
[ 67.630000] 0002b312 00146fa4 00c42b60 00002704 00d7b9ec
00029d3a 0000001f 00c42800
[ 67.630000] 0016c1b4 00cfc640 0000001f 0016c178 00029d10
00146fb8 00d7ba20 00029e84
[ 67.630000] Call Trace:
[ 67.630000] [000a8f66] [0014701e] [000df21c] [00029d3a] [00029e84]
[ 67.630000] [00000bb6] [0000336e] [000df162] [000effd6] [00100482]
[ 67.630000] [000f312e] [000f9ebc] [0010dd2a] [0010e4a0] [0010dfb2]
[ 67.630000] [0010ef80] [0011fed6] [00121170] [0012188e] [0011ecc6]
[ 67.630000] [001249fe] [000e4084] [0011621c] [00131a44] [000e11ee]
[ 67.630000] [00041944] [00041a1c] [00041e46] [00003218]

As i said, i was hoping in your patch but i sadly discovered it is
already included in this kernel version.
Hope you can give me some help or can forward me to an appropriate
mailing list.
--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=

- baruch@xxxxxxxxxx - tel: +972.2.679.5364, http://www.tkos.co.il -
--
To unsubscribe from this list: send the line "unsubscribe
linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/




_______________________________________________
uClinux-dev mailing list
uClinux-dev@xxxxxxxxxxx
http://mailman.uclinux.org/mailman/listinfo/uclinux-dev
This message was resent by uclinux-dev@xxxxxxxxxxx
To unsubscribe see:
http://mailman.uclinux.org/mailman/options/uclinux-dev



--
------------------------------------------------------------------------
Greg Ungerer -- Principal Engineer EMAIL: gerg@xxxxxxxxxxxx
SnapGear Group, McAfee PHONE: +61 7 3435 2888
8 Gardner Close FAX: +61 7 3217 5323
Milton, QLD, 4064, Australia WEB: http://www.SnapGear.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/