Re: [linux-next][EXT4][Oops]kernel panics when running fsfuzzer

From: Abdul Haleem
Date: Fri Sep 22 2017 - 06:16:38 EST


On Wed, 2017-09-20 at 16:44 +1000, Michael Ellerman wrote:
> Abdul Haleem <abdhalee@xxxxxxxxxxxxxxxxxx> writes:
>
> > Hi,
> >
> > next kernel panics when running fsfuzzer test on ext4 file system.
> >
> > Machine Type: Power 7 PowerVM LPAR
> > kernel : 4.13.0-next-20170915
> > config : attached
> > Test: fsfuzzer
> >
> > dmesg:
> > -----
> > EXT4-fs (loop1): couldn't mount RDWR because of unsupported optional
> > features (e4000000)
> > JBD2: no valid journal superblock found
> > EXT4-fs (loop1): error loading journal
> > EXT4-fs (loop1): mounted filesystem with ordered data mode. Opts: (null)
> > EXT4-fs error (device loop1): ext4_mb_generate_buddy:756: group 0, block
> > bitmap and bg descriptor inconsistent: 1077 vs 1045 free clusters
> > Unable to handle kernel paging request for data at address 0x00000008
> > Faulting instruction address: 0xd0000000084b0424
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > BE SMP NR_CPUS=2048 NUMA pSeries
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Modules linked in: cramfs(E) dlci(E) 8021q(E) garp(E) mrp(E) af_key(E)
> > ieee802154_socket(E) ieee802154(E) rpcrdma(E) ib_isert(E)
> > iscsi_target_mod(E) ib_iser(E) libiscsi(E) ib_srpt(E) target_core_mod(E)
> > ib_srp(E) hidp(E) ib_ipoib(E) cmtp(E) kernelcapi(E) rdma_ucm(E)
> > ib_ucm(E) bnep(E) ib_uverbs(E) rfcomm(E) bluetooth(E) ib_umad(E)
> > rdma_cm(E) ecdh_generic(E) rfkill(E) ib_cm(E) iw_cm(E) pptp(E) gre(E)
> > l2tp_ppp(E) l2tp_netlink(E) l2tp_core(E) ip6_udp_tunnel(E) udp_tunnel(E)
> > pppoe(E) pppox(E) ppp_generic(E) slhc(E) crypto_user(E) ib_core(E)
> > nfnetlink(E) scsi_transport_iscsi(E) atm(E) sctp(E) dccp_ipv4(E)
> > netlink_diag(E) dccp_diag(E) ip6table_filter(E) af_packet_diag(E)
> > unix_diag(E) tcp_diag(E) udp_diag(E) ebtable_filter(E) bridge(E) sg(E)
> > ibmveth(E) rpadlpar_io(E) loop(E) xt_CHECKSUM(E) iptable_mangle(E)
> > ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E)
> > nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E)
> > xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) tun(E)
> > stp(E) llc(E) rpaphp(E) kvm_pr(E) kvm(E) ebtables(E) ip6_tables(E)
> > dccp(E) inet_diag(E) iptable_filter(E) nfsd(E) auth_rpcgss(E) nfs_acl(E)
> > lockd(E) grace(E) sunrpc(E) binfmt_misc(E) ip_tables(E) ext4(E)
> > mbcache(E) jbd2(E) sd_mod(E) ibmvscsi(E) scsi_transport_srp(E) [last
> > unloaded: netlink_diag]
> > CPU: 15 PID: 32523 Comm: jbd2/loop1-8 Tainted: G W E
> > 4.13.0-next-20170915-autotest #1
>
> These would be easier to read if they weren't word-wrapped.

Yes, I will keep it preformatted next time.
>
> > task: c000000189664900 task.stack: c0000009f122c000
> > NIP: d0000000084b0424 LR: d0000000084a8040 CTR: c000000000373fe0
> > REGS: c0000009f122f590 TRAP: 0300 Tainted: G W E (4.13.0-next-20170915-autotest)
> > MSR: 800000000280b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 28002022
> > XER: 20000000
> > CFAR: c000000000008718 DAR: 0000000000000008 DSISR: 40000000 SOFTE: 1
> > GPR00: d0000000084a8040 c0000009f122f810 d0000000084bf010 c00000093b66a7d0
> > GPR04: f0000000024ed980 c00000093b66a578 c00000093b66a578 0000000000000001
> > GPR08: 0000000000000000 0000000000000000 0000000000000000 d0000000084b2738
> > GPR12: c000000000373fe0 c00000000e939d80 c000000000118350 c0000013f0bc22c0
> > GPR16: c0000009f65e4000 c0000009f122fb20 0000000000000000 c00000108a749824
> > GPR20: c0000009f791cf9c 0000000000000000 c0000009f791cf60 d0000000084b6cb0
> > GPR24: 0000000000000000 c00000108a749880 0000000000000000 c0000009f791d300
> > GPR28: c00000093b66a7d0 c0000009f122c000 0000000000000000 c0000013f8f389b8
> > NIP [d0000000084b0424] .jbd2_journal_put_journal_head+0x94/0x22c [jbd2]
> > LR [d0000000084a8040] .__jbd2_journal_remove_checkpoint+0x70/0x200 [jbd2]
> > Call Trace:
> > [c0000009f122f810] [d0000000084b04ec] .jbd2_journal_put_journal_head+0x15c/0x22c [jbd2] (unreliable)
> > [c0000009f122f890] [d0000000084a8040] .__jbd2_journal_remove_checkpoint+0x70/0x200 [jbd2]
> > [c0000009f122f930] [d0000000084a8b98] .__jbd2_log_wait_for_space+0x368/0x3d0 [jbd2]
> > [c0000009f122f9c0] [d0000000084a8c5c] .__jbd2_journal_clean_checkpoint_list+0x5c/0xc0 [jbd2]
> > [c0000009f122fa60] [d0000000084a4e2c] .jbd2_journal_commit_transaction+0x33c/0x2f80 [jbd2]
> > [c0000009f122fc90] [d0000000084ad160] .jbd2_journal_init_inode+0x390/0x5a0 [jbd2]
> > [c0000009f122fd70] [c0000000001184ac] .kthread+0x15c/0x1a0
> > [c0000009f122fe30] [c00000000000b4e4] .ret_from_kernel_thread+0x58/0x74
> > Instruction dump:
> > 7c2004ac 7d40f8a8 7d4a4878 7d40f9ad 40c2fff4 38210080 e8010010 ebc1fff0
> > ebe1fff8 7c0803a6 4e800020 ebdf0040 <e93e000a> 79290fe0 0b090000 e93e0028
> > ---[ end trace bd674540a2bf235c ]---
>
> The instruction dump is:
>
> lwsync
> ldarx r10,0,r31 # probably bit_spin_unlock() in jbd_unlock_bh_journal_head()
> andc r10,r10,r9
> stdcx. r10,0,r31
> bne- 0xfffffffffffffff4
> addi r1,r1,128
> ld r0,16(r1)
> ld r30,-16(r1)
> ld r31,-8(r1)
> mtlr r0
> blr # return
>
> ld r30,64(r31)
> lwa r9,8(r30) # <-- POP, r30 is NULL
> rldicl r9,r9,1,63
> tdnei r9,0 # Most likely a BUG_ON()
> ld r9,40(r30)
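
For anyone who wants to double-check the marked word by hand, here is a
minimal sketch of how a DS-form word from the instruction dump splits into
its fields. The struct and function names (ds_form, decode_ds_form,
ds_form_mnemonic) are made up for this mail, and only primary opcode 58
(ld/ldu/lwa) is handled; it is an illustration, not a disassembler.

```c
#include <stdint.h>

/* Fields of a Power DS-form instruction word (hypothetical helper type). */
struct ds_form {
    unsigned opcode;  /* primary opcode, top 6 bits */
    unsigned rt;      /* target register */
    unsigned ra;      /* base register */
    unsigned xo;      /* low 2 bits, selects the variant */
    int32_t  disp;    /* signed byte displacement, low 2 bits cleared */
};

static struct ds_form decode_ds_form(uint32_t insn)
{
    struct ds_form d;

    d.opcode = insn >> 26;
    d.rt     = (insn >> 21) & 0x1f;
    d.ra     = (insn >> 16) & 0x1f;
    d.xo     = insn & 0x3;
    /* Sign-extend the 16-bit field after masking off the XO bits. */
    d.disp   = (int16_t)(uint16_t)(insn & 0xfffc);
    return d;
}

/* Mnemonic for primary opcode 58 only; anything else is reported as "?". */
static const char *ds_form_mnemonic(struct ds_form d)
{
    static const char *const names[4] = { "ld", "ldu", "lwa", "?" };

    return d.opcode == 58 ? names[d.xo] : "?";
}
```

Feeding it 0xe93e000a (the word in angle brackets) gives opcode 58, rt 9,
ra 30, xo 2 and displacement 8, i.e. lwa r9,8(r30), which agrees with the
decode above and with DAR = 0x8 when r30 is 0.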
>
> Perhaps:
>
> J_ASSERT_JH(jh, jh->b_jcount >= 0);
>
> And given:
>
> struct journal_head {
> /*
> * Points back to our buffer_head. [jbd_lock_bh_journal_head()]
> */
> struct buffer_head *b_bh;
>
> /*
> * Reference count - see description in journal.c
> * [jbd_lock_bh_journal_head()]
> */
> int b_jcount;
>
>
> ie. b_jcount is 8 bytes into the struct, that looks likely.
>
> So we had a jh that was NULL.
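
The offset argument can be checked outside the kernel with a small sketch.
The struct below is a hypothetical stand-in that copies only the first two
fields quoted above (it is not the real struct journal_head), and
b_jcount_fault_address() is a name invented for this mail.

```c
#include <stddef.h>
#include <stdint.h>

struct buffer_head;  /* opaque here; only the pointer size matters */

/* Hypothetical mirror of the first two fields of struct journal_head. */
struct jh_layout_sketch {
    struct buffer_head *b_bh;  /* offset 0 */
    int b_jcount;              /* directly after the pointer */
};

/* Address that a load of jh->b_jcount would touch for a given jh value. */
static uintptr_t b_jcount_fault_address(uintptr_t jh)
{
    return jh + offsetof(struct jh_layout_sketch, b_jcount);
}
```

With 8-byte pointers, b_jcount_fault_address(0) comes out as 0x8, which is
the faulting data address reported in the oops (DAR: 0000000000000008).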
>
>
> Is it reproducible?

No, the bug has not been seen again in 3 re-runs since yesterday.

I hope to hit it again in future CI runs.

--
Regards,

Abdul Haleem
IBM Linux Technology Centre