Re: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

From: Vegard Nossum
Date: Fri Jul 18 2008 - 16:28:26 EST

On Fri, Jul 18, 2008 at 1:58 PM, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
> On Fri, Jul 18, 2008 at 1:20 PM, Josef Bacik <jbacik@xxxxxxxxxx> wrote:
>>> You can see the full log at
>>> which shows that
>>> it already survived a lot of failures, so I'm guessing your patch was
>>> correct and we just hit a different case. What do you think?
>> Yeah you are right, it's like a shitty game of whack-a-mole. Here's another patch,
>> same thing as last time: pull the other one out, put this one on. Thanks,
> It seems to hold up -- no stacktraces, but lots of IO failures.
> I would leave it in testing for a bit more, but I've got to run; I'll
> give it another go when I get home.

Ok, we still got this:

BUG: unable to handle kernel NULL pointer dereference at 0000000c
IP: [<c025ea28>] journal_dirty_metadata+0xb8/0x1b0
*pde = 00000000
Pid: 4770, comm: rm Not tainted (2.6.26-03421-g253a722 #49)
EIP: 0060:[<c025ea28>] EFLAGS: 00210246 CPU: 1
EIP is at journal_dirty_metadata+0xb8/0x1b0
EAX: 00000000 EBX: f3d70c90 ECX: 00000001 EDX: f3e12000
ESI: 00000000 EDI: f21118f0 EBP: f3e13d94 ESP: f3e13d6c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rm (pid: 4770, ti=f3e12000 task=f62cdfa0 task.ti=f3e12000)
Stack: f3d70430 f578047c f578047c f3e13d94 c0222cdb f779c000 f6ff2e70 f21118f0
f779c000 f21118f0 f3e13db4 c02345ef 0000001c 00001499 c0760bc4 f21118f0
00000000 ef36d004 f3e13de4 c0228e6f 0000147e 0000001c ef36d004 ef36d400
Call Trace:
[<c0222cdb>] ? ext3_free_blocks+0x6b/0xa0
[<c02345ef>] ? __ext3_journal_dirty_metadata+0x1f/0x50
[<c0228e6f>] ? ext3_free_data+0x9f/0x100
[<c02290e3>] ? ext3_free_branches+0x213/0x220
[<c0222cdb>] ? ext3_free_blocks+0x6b/0xa0
[<c0228f7e>] ? ext3_free_branches+0xae/0x220
[<c022967c>] ? ext3_truncate+0x58c/0x940
[<c015ad96>] ? trace_hardirqs_on_caller+0x116/0x170
[<c0260733>] ? journal_start+0xd3/0x110
[<c0260710>] ? journal_start+0xb0/0x110
[<c0229b07>] ? ext3_delete_inode+0xd7/0xe0
[<c0229a30>] ? ext3_delete_inode+0x0/0xe0
[<c01b9bc1>] ? generic_delete_inode+0x81/0x120
[<c01b9d87>] ? generic_drop_inode+0x127/0x180
[<c01b8c07>] ? iput+0x47/0x50
[<c01af1dc>] ? do_unlinkat+0xec/0x170
[<c01b187b>] ? vfs_readdir+0x6b/0xa0
[<c01b1560>] ? filldir64+0x0/0xf0
[<c0430a08>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c015ad96>] ? trace_hardirqs_on_caller+0x116/0x170
[<c01af3a3>] ? sys_unlinkat+0x23/0x50
[<c010407f>] ? sysenter_past_esp+0x78/0xc5
Code: b8 01 00 00 00 e8 c9 3f ed ff 89 e0 25 00 e0 ff ff f6 40 08 08
74 05 e8 47 98 4e 00 83 c4 1c 31 c0 5b 5e 5f 5d c3 90 8d 74 26 00 <8b>
46 0c 85 c0 0f 84 8d 00 00 00 8b 45 f0 39 46 18 74 66 8d 47
EIP: [<c025ea28>] journal_dirty_metadata+0xb8/0x1b0 SS:ESP 0068:f3e13d6c
Kernel panic - not syncing: Fatal exception
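
For what it's worth, the bytes at the marked position in the Code: line (8b 46 0c) decode to a 32-bit load from offset 0xc off %esi, and ESI is 0 in the register dump, which matches the faulting address 0000000c. Below is a minimal Python sketch of that decode. It handles only this one encoding (opcode 8b with a register base and an 8-bit displacement), and the helper name decode_mov_disp8 is made up for illustration.

```python
# Sketch: decode the faulting instruction from the oops "Code:" line.
# Only the single form seen here is handled: 8b /r with mod=01 (base + disp8).

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_disp8(code):
    """Decode `8b modrm disp8` into (destination, base register, displacement)."""
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not a mov r32, r/m32")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        # rm == 4 would mean a SIB byte follows, which this sketch ignores.
        raise ValueError("only the [reg+disp8] form without SIB is handled")
    if disp >= 0x80:
        disp -= 0x100  # disp8 is sign-extended
    return REG32[reg], REG32[rm], disp


# Register values copied from the oops above.
regs = {
    "eax": 0x00000000, "ebx": 0xF3D70C90, "ecx": 0x00000001, "edx": 0xF3E12000,
    "esi": 0x00000000, "edi": 0xF21118F0, "ebp": 0xF3E13D94, "esp": 0xF3E13D6C,
}

# The bytes starting at the <8b> marker in the Code: line.
faulting = bytes.fromhex("8b460c")

dst, base, disp = decode_mov_disp8(faulting)
fault_addr = (regs[base] + disp) & 0xFFFFFFFF
print(f"mov {disp:#x}(%{base}),%{dst} -> address {fault_addr:08x}")
```

So whatever pointer journal_dirty_metadata holds in %esi at that point is NULL, and the read of the field at offset 0xc is what faults.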

It looks similar to one of the others we saw. Are you sure I should
back out all your previous patches? My stack looks like this:

Duane Griffin (1):
ext3: validate directory entry

Josef Bacik (1):
ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference

And I am using errors=continue.

Now I've modified my scripts to also save the bad image, so I (or
whoever) can re-test a specific crash easily. For instance, this one
can be downloaded from and mounted.
Then you run rm -rf mnt/* and it should crash.
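
The re-test steps can be sketched roughly as follows, in Python. The image file name bad-image.img and the helper names are placeholders for illustration, not what my scripts actually use. The sketch assumes a loop mount with errors=continue, needs root for the real run, and will take the machine down if the bug triggers, so it defaults to a dry run that only prints the commands.

```python
import glob
import subprocess


def mount_command(image, mountpoint="mnt"):
    """Build the mount invocation for a saved ext3 image (loop device)."""
    return ["mount", "-t", "ext3", "-o", "loop,errors=continue", image, mountpoint]


def rm_command(mountpoint="mnt"):
    """Build `rm -rf mnt/*`, expanding the glob the way the shell would."""
    return ["rm", "-rf", "--"] + sorted(glob.glob(f"{mountpoint}/*"))


def reproduce(image, mountpoint="mnt", dry_run=True):
    """Mount the image and remove everything on it; print only when dry_run."""
    mount = mount_command(image, mountpoint)
    if dry_run:
        print(" ".join(mount))
        print(f"rm -rf {mountpoint}/*")
        return
    subprocess.run(mount, check=True)
    # The glob must be expanded after the mount, so build the command here.
    subprocess.run(rm_command(mountpoint), check=True)


# Dry run with a placeholder image name: prints the two commands.
reproduce("bad-image.img")
```

Passing dry_run=False performs the actual mount and rm, which is best done in a throwaway VM.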

Log is also available at


"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
