Bad Page State In File System Related Operations (3.10.35)

From: Kai
Date: Sun Feb 15 2015 - 11:42:23 EST


While transferring data between several hosts that use different file systems and protocols, I consistently get errors like the one below:


[ 5257.087865] BUG: Bad page state in process ftpd pfn:2150f
[ 5257.093524] page:ffffea0000749b48 count:0 mapcount:0 mapping: (null) index:0x2
[ 5257.101547] page flags: 0x4000000000000004(referenced)

[ 5257.107003] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi iscsi_target_mod(O) target_core_mod(O) iscsi_extent_pool(PO) iscsi_rodsp(PO) configfs nfsd exportfs rpcsec_gss_krb5 cifs udf isofs loop hid_generic usbhid hid usblp usb_storage btrfs zlib_deflate libcrc32c hfsplus md4 hmac tn40xx(O) ixgbe(O) be2net igb(O) i2c_algo_bit e1000e(O) dca fuse vfat fat crc32c_intel cryptd ecryptfs sha512_generic sha256_generic sha1_generic ecb aes_x86_64 authenc des_generic ansi_cprng cts md5 cbc cpufreq_conservative cpufreq_powersave cpufreq_performance cpufreq_ondemand acpi_cpufreq mperf processor thermal_sys cpufreq_stats freq_table dm_snapshot crc_itu_t crc_ccitt quota_v2 quota_tree psnap p8022 llc tunnel4 ipv6 zram(C) sg etxhci_hcd xhci_hcd ehci_pci ehci_hcd uhci_hcd usbcore usb_common [last unloaded: configfs]

[ 5257.184511] CPU: 0 PID: 17627 Comm: ftpd Tainted: P WC O 3.10.35 #1
[ 5257.191668] Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M., BIOS 4.6.4 09/28/2012
[ 5257.202404] ffffffff8149bbc3 0000000000000000 ffffffff8149a6d1 0000000000000001
[ 5257.209964] ffffffff810b2cac 000000000000001f 0000000000000246 000213da00000000
[ 5257.217425] 0000000000000000 ffff880073171fd8 0000000000000030 0000000000000001
[ 5257.224967] Call Trace:
[ 5257.227426] [<ffffffff8149bbc3>] ? dump_stack+0xd/0x17
[ 5257.232701] [<ffffffff8149a6d1>] ? bad_page+0xd4/0xeb
[ 5257.237940] [<ffffffff810b2cac>] ? get_page_from_freelist+0x57c/0x660
[ 5257.244635] [<ffffffff8133b315>] ? ata_qc_issue+0x195/0x3e0
[ 5257.250509] [<ffffffff810b2ee1>] ? __alloc_pages_nodemask+0x151/0x7f0
[ 5257.257237] [<ffffffff81000fa0>] ? __switch_to+0x3d0/0x460
[ 5257.262976] [<ffffffff8149f67e>] ? __schedule+0x25e/0x5c0
[ 5257.268655] [<ffffffff810b590c>] ? __do_page_cache_readahead+0xec/0x250
[ 5257.275611] [<ffffffff813b11d3>] ? dm_any_congested+0x63/0x80
[ 5257.281676] [<ffffffff810b5cdc>] ? ra_submit+0x1c/0x30
[ 5257.287086] [<ffffffff810ac8c1>] ? generic_file_aio_read+0x481/0x6f0
[ 5257.293756] [<ffffffff810e775a>] ? do_sync_read+0x6a/0xa0
[ 5257.299719] [<ffffffff810e8940>] ? vfs_read+0xa0/0x160
[ 5257.305335] [<ffffffff810ee921>] ? kernel_read+0x41/0x60
[ 5257.311060] [<ffffffffa01bbca9>] ? ecryptfs_decrypt_page+0x1f9/0x390 [ecryptfs]
[ 5257.318731] [<ffffffff810587b5>] ? __wake_up_common+0x55/0x90
[ 5257.324998] [<ffffffff810591c3>] ? __wake_up+0x43/0x70
[ 5257.330498] [<ffffffffa01b9968>] ? ecryptfs_readpage+0xd8/0x120 [ecryptfs]
[ 5257.337953] [<ffffffff81115878>] ? __generic_file_splice_read+0x3c8/0x540
[ 5257.345346] [<ffffffff81113e60>] ? page_cache_pipe_buf_release+0x20/0x20
[ 5257.352143] [<ffffffff813c9a46>] ? kernel_sendpage+0x16/0x30
[ 5257.358282] [<ffffffff813c9a82>] ? sock_sendpage+0x22/0x30
[ 5257.364188] [<ffffffff81113cfd>] ? pipe_to_sendpage+0x4d/0x80
[ 5257.370355] [<ffffffff81113e4c>] ? page_cache_pipe_buf_release+0xc/0x20
[ 5257.377369] [<ffffffff81113de1>] ? splice_from_pipe_feed+0xb1/0x110
[ 5257.384019] [<ffffffff81113cb0>] ? splice_from_pipe_begin+0x10/0x10
[ 5257.390536] [<ffffffff8111400e>] ? __splice_from_pipe+0x3e/0x80
[ 5257.396793] [<ffffffff81115ab8>] ? splice_from_pipe+0x58/0x70
[ 5257.402884] [<ffffffff81115a24>] ? generic_file_splice_read+0x34/0x70
[ 5257.409627] [<ffffffff811149eb>] ? splice_direct_to_actor+0x9b/0x1c0
[ 5257.416250] [<ffffffff81114390>] ? do_splice_from+0x130/0x130
[ 5257.422227] [<ffffffff81115b53>] ? do_splice_direct+0x53/0x70
[ 5257.428234] [<ffffffff810e7d1f>] ? do_sendfile+0x1bf/0x560
[ 5257.433901] [<ffffffff8102c5cf>] ? sys32_fstat64+0x1f/0x30
[ 5257.439750] [<ffffffff810e9b45>] ? SyS_sendfile64+0x55/0xa0
[ 5257.445602] [<ffffffff814a2391>] ? sysenter_dispatch+0x7/0x1e


The call traces differ from one occurrence to the next. The process that hits the bad page is usually cp, and sometimes another process that accesses files, such as ftpd in this case.
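
Since the traces vary, it may help to tabulate the reports rather than read them one by one. The following is a rough Python sketch, not a tested tool: the regular expressions are written only against the three header lines quoted above, and the function name parse_bad_page_reports and the SAMPLE string are hypothetical names chosen for illustration.

```python
import re

# These patterns are written against the report format quoted in this
# message; other kernel versions may print the fields differently.
BUG_RE = re.compile(r"BUG: Bad page state in process (\S+)\s+pfn:([0-9a-f]+)")
PAGE_RE = re.compile(
    r"page:([0-9a-f]+) count:(-?\d+) mapcount:(-?\d+) "
    r"mapping:\s*(\(null\)|[0-9a-f]+) index:(0x[0-9a-f]+)"
)
FLAGS_RE = re.compile(r"page flags: (0x[0-9a-f]+)\(([^)]*)\)")


def parse_bad_page_reports(log_text):
    """Return one dict per 'Bad page state' report found in log_text."""
    reports = []
    current = None
    for line in log_text.splitlines():
        m = BUG_RE.search(line)
        if m:
            # A new report starts here; later lines fill in its fields.
            current = {"comm": m.group(1), "pfn": int(m.group(2), 16)}
            reports.append(current)
            continue
        if current is None:
            continue
        m = PAGE_RE.search(line)
        if m:
            current["page"] = int(m.group(1), 16)
            current["count"] = int(m.group(2))
            current["mapcount"] = int(m.group(3))
            current["mapping"] = (
                None if m.group(4) == "(null)" else int(m.group(4), 16)
            )
            current["index"] = int(m.group(5), 16)
            continue
        m = FLAGS_RE.search(line)
        if m:
            current["flags"] = int(m.group(1), 16)
            current["flag_names"] = m.group(2)
    return reports


# The header lines of the report quoted in this message.
SAMPLE = """\
[ 5257.087865] BUG: Bad page state in process ftpd pfn:2150f
[ 5257.093524] page:ffffea0000749b48 count:0 mapcount:0 mapping: (null) index:0x2
[ 5257.101547] page flags: 0x4000000000000004(referenced)
"""

if __name__ == "__main__":
    for r in parse_bad_page_reports(SAMPLE):
        print(r["comm"], hex(r["pfn"]), r["count"], r["mapcount"],
              hex(r["flags"]), r["flag_names"])
```

Feeding a full dmesg capture through this and grouping by comm, pfn, or flag_names would show whether the same pages or the same flags keep recurring.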

When the count, mapcount, and mapping fields of a page are all 0 or NULL, is it valid for the page to still have the referenced flag set? I am wondering whether this is a use-after-free.
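
To make the question concrete, here is a small Python sketch of the state in the report. It rests on two assumptions that should be checked against the actual 3.10.35 tree: first, that the low bits of page->flags follow the order of enum pageflags in include/linux/page-flags.h as I recall it (locked, error, referenced, ...); second, that the allocator's check on pages leaving the free list (check_new_page() in mm/page_alloc.c, as far as I can tell) complains when count, mapcount, or mapping is non-zero or when any ordinary page flag is still set. The helper names decode_low_flags and stale_state are hypothetical.

```python
# Bit positions for the first few page flags, as recalled from
# enum pageflags in include/linux/page-flags.h (3.10). This table is an
# assumption; verify it against your own tree.
LOW_FLAG_BITS = {
    0: "locked",
    1: "error",
    2: "referenced",
    3: "uptodate",
    4: "dirty",
    5: "lru",
    6: "active",
    7: "slab",
}


def decode_low_flags(flags):
    """Name the flags from LOW_FLAG_BITS that are set in the flags word.

    Higher bits are ignored here; the top bits of the word are assumed
    to carry zone/node/section information rather than page flags.
    """
    return [name for bit, name in sorted(LOW_FLAG_BITS.items())
            if flags & (1 << bit)]


def stale_state(count, mapcount, mapping, flags):
    """List the fields that are not in the state expected of a free page."""
    problems = []
    if count != 0:
        problems.append("count")
    if mapcount != 0:
        problems.append("mapcount")
    if mapping is not None:
        problems.append("mapping")
    for name in decode_low_flags(flags):
        problems.append("flag:" + name)
    return problems


if __name__ == "__main__":
    # Values taken from the report above; mapping was printed as (null).
    print(stale_state(0, 0, None, 0x4000000000000004))
```

Under these assumptions, the only field out of place in the report is the referenced bit (0x4), which matches the kernel's own decoding in the "page flags:" line. The call trace is also consistent with this, since bad_page() is reached from get_page_from_freelist().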

I also found that the issue disappears once I remove the hfsplus volumes.

I have traced through the kernel code looking for a potential race with the function mark_page_accessed(), but found nothing.

Could anyone offer suggestions on this issue? All suggestions and comments are appreciated.

Kai Gnep
