I am finding that raid0 generates an oops (repeatably)
when I try to copy a large (>1 GB) file off the md device.
I originally found this problem in the postgresql backend while
trying to back up a large database, but found that cp will do
the same.
This occurs for 2.2.11 and 2.2.12. I am running a Debian 2.1
system on a dual Xeon box (Intel board with 450 MHz processors,
256 MB, Adaptec 294XU2).
I have built a second raid0 file system with different
disks and found the same problem there as well. It is not
file dependent, in the sense that the failure occurs for different
files over several hundred MB. Smaller files are ok.
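For anyone who wants to exercise the same sequential read path without cp, here is a minimal sketch. It only reads a file from start to finish in fixed-size chunks and counts the bytes. The function name, the chunk size, and the command-line usage are my own placeholders, and I have not confirmed that this triggers the oops the way cp does.

```python
import sys


def read_sequentially(path, chunk_size=64 * 1024):
    """Read a file from start to end in fixed-size chunks; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            total += len(chunk)
    return total


if __name__ == "__main__":
    # Hypothetical usage: pass the path of a large file on the md device.
    if len(sys.argv) == 2:
        print(read_sequentially(sys.argv[1]))
```

Since smaller files are ok, the file passed in would need to be several hundred MB or more to be a fair comparison.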
------------------------------------------------------------------------------
The ksymoops output for the failure in the cp command under 2.2.11 is as follows:
Options used: -V (default)
-o /lib/modules/2.2.10/ (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-m /boot/System.map-2.2.10 (specified)
-c 1 (default)
Sep 29 13:07:04 hawk kernel: Oops: 0000
Sep 29 13:07:04 hawk kernel: CPU: 0
Sep 29 13:07:04 hawk kernel: EIP: 0010:[raid0_map+166/316]
Sep 29 13:07:04 hawk kernel: EFLAGS: 00010212
Sep 29 13:07:04 hawk kernel: eax: 0171b189 ebx: d08a0050 ecx: 00000040 edx: 00000002
Sep 29 13:07:04 hawk kernel: esi: 24448b53 edi: 2e363120 ebp: 00000020 esp: c60e7dd8
Sep 29 13:07:04 hawk kernel: ds: 0018 es: 0018 ss: 0018
Sep 29 13:07:04 hawk kernel: Process cp (pid: 274, process nr: 45, stackpage=c60e7000)
Sep 29 13:07:04 hawk kernel: Stack: ce4a0a2e 00000009 00000286 d089e000 5c6c6240 d089c000 00000080 00000009
Sep 29 13:07:04 hawk kernel: 00000003 c018d376 c026d844 ce4a0a2e ce4a0a30 00000002 00000002 c60e7e80
Sep 29 13:07:04 hawk kernel: 00000004 c018845d 00000000 ce4a0a2e ce4a0a30 00000002 00000000 20363838
Sep 29 13:07:04 hawk kernel: Call Trace: [eepro100:eepro100_init+-41036/8500] [eepro100:eepro100_init+-49228/8500] [md_map+94/104] [ll_rw_block+233/524] [brw_page+700/944] [generic_readpage+127/140] [try_to_read_ahead+271/296]
Sep 29 13:07:04 hawk kernel: Code: 8b 46 08 03 06 39 c7 7c 27 8b 5b 04 85 db 75 1e 57 68 eb ef
Code: 00000000 Before first symbol 00000000 <_IP>: <===
Code: 00000000 Before first symbol 0: 8b 46 08 movl 0x8(%esi),%eax <===
Code: 00000003 Before first symbol 3: 03 06 addl (%esi),%eax
Code: 00000005 Before first symbol 5: 39 c7 cmpl %eax,%edi
Code: 00000007 Before first symbol 7: 7c 27 jl 00000030 Before first symbol
Code: 00000009 Before first symbol 9: 8b 5b 04 movl 0x4(%ebx),%ebx
Code: 0000000c Before first symbol c: 85 db testl %ebx,%ebx
Code: 0000000e Before first symbol e: 75 1e jne 0000002e Before first symbol
Code: 00000010 Before first symbol 10: 57 pushl %edi
Code: 00000011 Before first symbol 11: 68 eb ef 00 00 pushl $0xefeb
------------------------------------------------------------------------------
Before the oops, dmesg also reports:
Unable to handle kernel paging request at virtual address 24448b5b
current->tss.cr3 = 05e88000, %cr3 = 05e88000
*pde = 00000000
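As a sanity check that the paging request and the oops describe the same fault, the faulting address should equal esi plus the 8-byte displacement in the flagged instruction, movl 0x8(%esi),%eax. The short sketch below does that arithmetic with the values copied from the output above. The function and constant names are mine, not anything from the kernel.

```python
# Values copied from the oops and dmesg output above.
ESI = 0x24448B53         # esi at the time of the fault
FAULT_ADDR = 0x24448B5B  # address in "Unable to handle kernel paging request"
DISPLACEMENT = 0x8       # from "movl 0x8(%esi),%eax"


def effective_address(base, disp):
    """Effective address of a base+displacement operand, wrapped to 32 bits."""
    return (base + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    ea = effective_address(ESI, DISPLACEMENT)
    print(hex(ea), ea == FAULT_ADDR)  # prints "0x24448b5b True"
```

The two agree, so the fault is the load through esi. That esi value also does not resemble the other pointers in the dump, which start with c or d, so it may be a corrupted pointer, though that is only a guess on my part.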
------------------------------------------------------------------------------
I will be happy to supply any additional information that is needed.
If this is a known problem (and especially if there is a fix), please let me know.
Thanks,
--Martin
===========================================================================
Martin Weinberg Phone: (413) 545-3821
Dept. of Physics and Astronomy FAX: (413) 545-2117/0648
530 Graduate Research Tower
University of Massachusetts
Amherst, MA 01003-4525