[next-20101038] Call trace in ext4

From: Sedat Dilek
Date: Thu Oct 28 2010 - 06:05:14 EST


Hi,

I have built today linux-next as of next-20101028 in a non-BKL config
(kernel-config attached) on a Debian i386/sid host.

When I start my quassel IRC-client I get reproducibly this call-trace:

# tail -40 kern.log
Oct 28 11:42:54 tbox kernel: [ 32.872957] EXT3-fs (sdb5): warning:
maximal mount count reached, running e2fsck is recommended
Oct 28 11:42:54 tbox kernel: [ 32.873621] EXT3-fs (sdb5): using
internal journal
Oct 28 11:42:54 tbox kernel: [ 32.873635] EXT3-fs (sdb5): mounted
filesystem with ordered data mode
Oct 28 11:44:16 tbox kernel: [ 115.480401] ------------[ cut here ]------------
Oct 28 11:44:16 tbox kernel: [ 115.480598] kernel BUG at
/home/sd/src/linux-2.6/linux-2.6.36/debian/build/source_i386_none/fs/ext4/inode.c:2721!
Oct 28 11:44:16 tbox kernel: [ 115.480979] invalid opcode: 0000 [#1] SMP
Oct 28 11:44:16 tbox kernel: [ 115.481155] last sysfs file:
/sys/devices/LNXSYSTM:00/device:00/PNP0A03:00/device:01/PNP0C09:00/PNP0C0A:00/power_supply/BAT0/energy_full
Oct 28 11:44:16 tbox kernel: [ 115.481610] Modules linked in: ext3
jbd sco aes_i586 rfcomm bnep aes_generic l2cap bluetooth acpi_cpufreq
mperf cpufreq_powersave cpufreq_userspace cpufreq_stats ppdev
cpufreq_conservative lp dm_crypt binfmt_misc arc4 snd_intel8x0 ecb
snd_intel8x0m radeon thinkpad_acpi snd_ac97_codec ath5k snd_seq_midi
ac97_bus pcmcia snd_pcm_oss joydev snd_rawmidi snd_seq_midi_event ath
mac80211 snd_mixer_oss ttm cfg80211 snd_seq snd_pcm yenta_socket
pcmcia_rsrc drm_kms_helper drm snd_seq_device snd_timer pcmcia_core
i2c_algo_bit nsc_ircc rfkill snd_page_alloc irda i2c_i801 shpchp snd
i2c_core tpm_tis rng_core pci_hotplug psmouse processor tpm soundcore
parport_pc video ac tpm_bios battery button led_class pcspkr evdev
parport power_supply serio_raw nvram output crc_ccitt fuse autofs4
ext4 mbcache jbd2 crc16 dm_mod usbhid hid usb_storage sg sd_mod sr_mod
crc_t10dif cdrom ata_generic ata_piix libata uhci_hcd ehci_hcd usbcore
scsi_mod thermal e1000 thermal_sys floppy nls_base [last unloaded:
scsi
Oct 28 11:44:16 tbox kernel: _wait_scan]
Oct 28 11:44:16 tbox kernel: [ 115.484020]
Oct 28 11:44:16 tbox kernel: [ 115.484020] Pid: 237, comm:
jbd2/sda5-8 Not tainted 2.6.36-next-20101028.1-686 #1 2374SG6/2374SG6
Oct 28 11:44:16 tbox kernel: [ 115.484020] EIP: 0060:[<f86282fb>]
EFLAGS: 00010246 CPU: 0
Oct 28 11:44:16 tbox kernel: [ 115.484020] EIP is at
ext4_writepage+0x8d/0x1f1 [ext4]
Oct 28 11:44:16 tbox kernel: [ 115.484020] EAX: 40020029 EBX:
f76934a0 ECX: 05050030 EDX: 00000000
Oct 28 11:44:16 tbox kernel: [ 115.484020] ESI: 00005050 EDI:
00001000 EBP: f5f3c548 ESP: ef973dbc
Oct 28 11:44:16 tbox kernel: [ 115.484020] DS: 007b ES: 007b FS:
00d8 GS: 00e0 SS: 0068
Oct 28 11:44:16 tbox kernel: [ 115.484020] Process jbd2/sda5-8 (pid:
237, ti=ef972000 task=ef8c9080 task.ti=ef972000)
Oct 28 11:44:16 tbox kernel: [ 115.484020] Stack:
Oct 28 11:44:16 tbox kernel: [ 115.484020] 00000005 00000000
ef973e98 00000000 f5f3c600 ef973e98 00000000 00005050
Oct 28 11:44:16 tbox kernel: [ 115.484020] c108fe6e f76934a0
c1090ea4 00000001 f5f3c600 00000004 00000002 00000000
Oct 28 11:44:16 tbox kernel: [ 115.484020] 00000006 0000000e
ef9a6180 c108fe66 e74172ff 0000000e 00000000 f7693400
Oct 28 11:44:16 tbox kernel: [ 115.484020] Call Trace:
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c108fe6e>] ? __writepage+0x8/0x1f
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c1090ea4>] ?
write_cache_pages+0x1cc/0x281
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c108fe66>] ? __writepage+0x0/0x1f
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c1090f6f>] ?
generic_writepages+0x16/0x1d
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<f85ab8d9>] ?
journal_submit_data_buffers+0xf5/0x150 [jbd2]
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<f85abd1e>] ?
jbd2_journal_commit_transaction+0x2d1/0xda3 [jbd2]
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c1025bdb>] ?
dequeue_task_fair+0x1b/0x57
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c10392df>] ?
lock_timer_base+0x19/0x34
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c1039356>] ?
try_to_del_timer_sync+0x5c/0x63
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<f85afdf2>] ?
kjournald2+0x9e/0x1c7 [jbd2]
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c1043ffe>] ?
autoremove_wake_function+0x0/0x29
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<f85afd54>] ?
kjournald2+0x0/0x1c7 [jbd2]
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c1043cb2>] ? kthread+0x63/0x68
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c1043c4f>] ? kthread+0x0/0x68
Oct 28 11:44:16 tbox kernel: [ 115.484020] [<c10034fe>] ?
kernel_thread_helper+0x6/0x10
Oct 28 11:44:16 tbox kernel: [ 115.484020] Code: 0c 89 34 24 89 ce 0f
ac d6 0c 39 7c 24 04 75 05 39 34 24 74 07 bf 00 10 00 00 eb 08 89 cf
81 e7 ff 0f 00 00 8b 03 f6 c4 08 75 04 <0f> 0b eb fe c7 04 24 00 00 00
00 83 7b 0c 00 75 37 68 46 ac 62
Oct 28 11:44:16 tbox kernel: [ 115.484020] EIP: [<f86282fb>]
ext4_writepage+0x8d/0x1f1 [ext4] SS:ESP 0068:ef973dbc
Oct 28 11:44:16 tbox kernel: [ 115.564589] ---[ end trace 1b8c420fb1d1ae45 ]---

The code section looks like:
[ fs/ext4/inode.c ]
...
/*
* If the page does not have buffers (for whatever reason),
* try to create them using __block_write_begin. If this
* fails, redirty the page and move on.
*/
if (!page_buffers(page)) { <--- LINE #2721
if (__block_write_begin(page, 0, len,
noalloc_get_block_write)) {
redirty_page:
redirty_page_for_writepage(wbc, page);
unlock_page(page);
return 0;
}
commit_write = 1;
}
...

I am not sure if this is a general ext4 problem or only in a non-BKL setup.

Hope this helps to track down the problem.

Kind Regards,
- Sedat -

P.S.:
I also have tried with patches from [1] and [2], v2 versions attached
to make linux-next happy.

[1] http://lkml.org/lkml/2010/10/28/36
[2] http://lkml.org/lkml/2010/10/28/40

Attachment: config-2.6.36-next-20101028.1-686
Description: Binary data

diff -rdup source_i386_none.orig/fs/ext4/super.c source_i386_none/fs/ext4/super.c
--- source_i386_none.orig/fs/ext4/super.c 2010-10-28 08:24:17.869596252 +0200
+++ source_i386_none/fs/ext4/super.c 2010-10-28 11:31:06.409098833 +0200
@@ -4779,9 +4779,12 @@ static int __init init_ext4_fs(void)
if (err)
goto out3;

+#ifdef CONFIG_EXT4_FS_XATTR
err = init_ext4_xattr();
+#endif
if (err)
goto out2;
+
err = init_inodecache();
if (err)
goto out1;
diff -rdup source_i386_none.orig/fs/ext4/xattr.h source_i386_none/fs/ext4/xattr.h
--- source_i386_none.orig/fs/ext4/xattr.h 2010-10-20 22:30:22.000000000 +0200
+++ source_i386_none/fs/ext4/xattr.h 2010-10-28 11:37:34.644218057 +0200
@@ -122,7 +122,7 @@ ext4_xattr_put_super(struct super_block
}

static inline int
-init_ext4_xattr(void)
+ext4_init_xattr(void)
{
return 0;
}
Nur in source_i386_none/fs/ext4: xattr.h.rej.