Re: ext3 / reiserfs data corruption, 2.5-bk

From: Nathan Conrad (conrad@bungled.net)
Date: Tue Jun 10 2003 - 16:44:36 EST


I've been noticing a similar problem on my laptop. This may, or may
not be related, but it did start somewhere within the past week (maybe
the IDE taskfile conversion???, to throw out a guess). I wonder if
Dave Jones is using IDE or SCSI. CONFIG_SMP and CONFIG_PREEMPT are
disabled on my machine (Sony Vaio PCG-FXA49 laptop, Athlon4). I'm
compiling the kernel with gcc 3.3 (Debian version).

Anyway, certain directories get locked up on occasion and when I try
to execute 'ls' or read from the directory, the process gets into a
locked up state; ^C does not work to kill the process. The only way to
make a directory "readable" is to restart the machine. I have not
noticed any FS corruption, just the lack of being able to enter the
directory.

 At the same time, a kernel bug will be displayed:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c016781a
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[find_inode_fast+26/96] Not tainted
EFLAGS: 00010286
EIP is at find_inode_fast+0x1a/0x60
eax: db0355c4 ebx: 0001859f ecx: c3a69844 edx: 00000000
esi: dfd60c00 edi: dff99340 ebp: dff99340 esp: cc6dde50
ds: 007b es: 007b ss: 0068
Process emacs20 (pid: 16508, threadinfo=cc6dc000 task=c6d0adc0)
Stack: c4bca5b8 0001859f 0001859f dfd60c00 c0167d2e dfd60c00 dff99340 0001859f
       0001859f da191d40 dfd60c00 da191d40 c018e45b dfd60c00 0001859f db666130
       fffffff4 dca22aac dca22a44 c015cd60 dca22a44 da191d40 00000000 cc6ddf48
Call Trace:
 [iget_locked+78/160] iget_locked+0x4e/0xa0
 [ext3_lookup+107/208] ext3_lookup+0x6b/0xd0
 [real_lookup+192/240] real_lookup+0xc0/0xf0
 [do_lookup+158/176] do_lookup+0x9e/0xb0
 [link_path_walk+1066/2000] link_path_walk+0x42a/0x7d0
 [__user_walk+73/96] __user_walk+0x49/0x60
 [vfs_stat+31/96] vfs_stat+0x1f/0x60
 [sys_stat64+27/64] sys_stat64+0x1b/0x40
 [syscall_call+7/11] syscall_call+0x7/0xb

Code: 0f 18 02 90 39 59 18 89 c8 74 0f 85 d2 89 d1 75 ed 31 c0 83

On Tue, Jun 10, 2003 at 12:43:23PM +0400, Oleg Drokin wrote:
> Hello!
>
> On Mon, Jun 09, 2003 at 08:35:55PM +0100, Dave Jones wrote:
>
> > 2.5 Bitkeeper tree as of last 24 hrs. Running a lot
> > of disk IO stress (multiple fsstress, over 100 fsx instances,
> > and random sync calling) produced failures on both reiserfs
> > and ext3.
> > Tests were done on seperate disks, but concurrently.
>
> Do you have smp or preempt enabled?
>
> Bye,
> Oleg

-Nathan Conrad

-- 
Nathan J. Conrad
GPG: F4FC 7E25 9308 ECE1 735C  0798 CE86 DA45 9170 3112


- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jun 15 2003 - 22:00:24 EST