Re: Zram writeback feature unstable with heavy swap utilization - BUG: Bad page state in process...

From: Minchan Kim
Date: Fri Jul 27 2018 - 05:14:41 EST


I tried to reproduce with KVM but was not successful and I don't have
real mahcine to reproduce it. I am asking one device for it.

Anyway, I want to try this patch.
Could you apply attached two patches?

On Thu, Jul 26, 2018 at 02:35:15PM +0200, Tino Lehnig wrote:
> On 07/26/2018 12:30 PM, Minchan Kim wrote:
> > Huh, you see it without writeback? It's weird. Without writeback feature,
> > zram operaion is always synchronous on memory compression/decompression
> > so you shouldn't see below io_schedule logic which happens only for
> > asynchronous IO operation.
> > Could you check one more time that it happens without writeback?
>
> Confirmed. More kernel logs below. backing_dev was not set in this run. This
> does not happen with 4.15-rc9 and newer.

I am confusing. You mean after 4.15-rc9, you are not seeing *hung* problem?

>
> > So, you mean you couldn't se bad page state bug until 4.15-rc8?
> > You just see below hung message until 4.15-rc8, not bad page bug?
> >
>
> I did see the bad page state bug first in 4.15-rc2, but only very rarely. I
> have attached the log below the other one. Most test runs from commit
> 0bcac06f27d75 through 4.15-rc8 just resulted in hung task errors.

So you mean you see page state bug with recent kernel right?
It seems there are two problems now.

1. Hung and 2. bad page

What bugs between them happens against what kernel version?
Could you clarify it?

>
> --
>
> [ 363.116153] INFO: task kworker/0:1:130 blocked for more than 120 seconds.
> [ 363.116237] Not tainted 4.15.0-rc8-zram #22
> [ 363.116311] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.116392] kworker/0:1 D 0 130 2 0x80000000
> [ 363.116405] Workqueue: events async_pf_execute
> [ 363.116407] Call Trace:
> [ 363.116416] ? __schedule+0x3d0/0x850
> [ 363.116419] schedule+0x32/0x80
> [ 363.116426] io_schedule+0x12/0x40
> [ 363.116433] __lock_page_or_retry+0x302/0x320
> [ 363.116437] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.116442] do_swap_page+0x4dd/0x870
> [ 363.116446] __handle_mm_fault+0x790/0x10c0
> [ 363.116450] handle_mm_fault+0xc6/0x1b0
> [ 363.116453] __get_user_pages+0xf9/0x620
> [ 363.116457] get_user_pages_remote+0x137/0x1f0
> [ 363.116462] async_pf_execute+0x62/0x180
> [ 363.116469] process_one_work+0x189/0x380
> [ 363.116474] worker_thread+0x4d/0x3c0
> [ 363.116478] kthread+0xf8/0x130
> [ 363.116482] ? process_one_work+0x380/0x380
> [ 363.116486] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.116492] ret_from_fork+0x32/0x40
> [ 363.116498] INFO: task kswapd0:159 blocked for more than 120 seconds.
> [ 363.116576] Not tainted 4.15.0-rc8-zram #22
> [ 363.116650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.116731] kswapd0 D 0 159 2 0x80000000
> [ 363.116734] Call Trace:
> [ 363.116738] ? __schedule+0x3d0/0x850
> [ 363.116741] schedule+0x32/0x80
> [ 363.116745] io_schedule+0x12/0x40
> [ 363.116749] __lock_page+0x109/0x130
> [ 363.116753] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.116759] deferred_split_scan+0x1e9/0x2a0
> [ 363.116762] shrink_slab.part.49+0x1e6/0x3d0
> [ 363.116768] shrink_node+0x2e7/0x2f0
> [ 363.116771] kswapd+0x35b/0x6f0
> [ 363.116776] kthread+0xf8/0x130
> [ 363.116779] ? mem_cgroup_shrink_node+0x150/0x150
> [ 363.116782] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.116786] ret_from_fork+0x32/0x40
> [ 363.116790] INFO: task kworker/9:1:196 blocked for more than 120 seconds.
> [ 363.116869] Not tainted 4.15.0-rc8-zram #22
> [ 363.116942] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.117023] kworker/9:1 D 0 196 2 0x80000000
> [ 363.117029] Workqueue: events async_pf_execute
> [ 363.117031] Call Trace:
> [ 363.117034] ? __schedule+0x3d0/0x850
> [ 363.117037] schedule+0x32/0x80
> [ 363.117041] io_schedule+0x12/0x40
> [ 363.117046] __lock_page_or_retry+0x302/0x320
> [ 363.117050] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.117053] do_swap_page+0x4dd/0x870
> [ 363.117057] __handle_mm_fault+0x790/0x10c0
> [ 363.117061] handle_mm_fault+0xc6/0x1b0
> [ 363.117063] __get_user_pages+0xf9/0x620
> [ 363.117068] ? update_load_avg+0x5c0/0x6f0
> [ 363.117071] get_user_pages_remote+0x137/0x1f0
> [ 363.117076] async_pf_execute+0x62/0x180
> [ 363.117081] process_one_work+0x189/0x380
> [ 363.117085] worker_thread+0x4d/0x3c0
> [ 363.117089] kthread+0xf8/0x130
> [ 363.117093] ? process_one_work+0x380/0x380
> [ 363.117096] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.117100] ret_from_fork+0x32/0x40
> [ 363.117104] INFO: task kworker/19:1:201 blocked for more than 120
> seconds.
> [ 363.117183] Not tainted 4.15.0-rc8-zram #22
> [ 363.117256] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.117337] kworker/19:1 D 0 201 2 0x80000000
> [ 363.117343] Workqueue: events async_pf_execute
> [ 363.117345] Call Trace:
> [ 363.117349] ? __schedule+0x3d0/0x850
> [ 363.117352] schedule+0x32/0x80
> [ 363.117356] io_schedule+0x12/0x40
> [ 363.117360] __lock_page_or_retry+0x302/0x320
> [ 363.117364] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.117367] do_swap_page+0x4dd/0x870
> [ 363.117370] __handle_mm_fault+0x790/0x10c0
> [ 363.117374] handle_mm_fault+0xc6/0x1b0
> [ 363.117377] __get_user_pages+0xf9/0x620
> [ 363.117380] ? update_load_avg+0x5c0/0x6f0
> [ 363.117383] get_user_pages_remote+0x137/0x1f0
> [ 363.117388] async_pf_execute+0x62/0x180
> [ 363.117393] process_one_work+0x189/0x380
> [ 363.117397] worker_thread+0x4d/0x3c0
> [ 363.117401] kthread+0xf8/0x130
> [ 363.117404] ? process_one_work+0x380/0x380
> [ 363.117407] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.117411] ret_from_fork+0x32/0x40
> [ 363.117415] INFO: task kworker/5:1:207 blocked for more than 120 seconds.
> [ 363.117494] Not tainted 4.15.0-rc8-zram #22
> [ 363.117567] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.117648] kworker/5:1 D 0 207 2 0x80000000
> [ 363.117654] Workqueue: events async_pf_execute
> [ 363.117655] Call Trace:
> [ 363.117659] ? __schedule+0x3d0/0x850
> [ 363.117662] schedule+0x32/0x80
> [ 363.117665] io_schedule+0x12/0x40
> [ 363.117669] __lock_page_or_retry+0x302/0x320
> [ 363.117673] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.117676] do_swap_page+0x4dd/0x870
> [ 363.117680] ? check_preempt_curr+0x83/0x90
> [ 363.117683] __handle_mm_fault+0x790/0x10c0
> [ 363.117687] handle_mm_fault+0xc6/0x1b0
> [ 363.117689] __get_user_pages+0xf9/0x620
> [ 363.117693] ? update_load_avg+0x5c0/0x6f0
> [ 363.117695] get_user_pages_remote+0x137/0x1f0
> [ 363.117700] async_pf_execute+0x62/0x180
> [ 363.117705] process_one_work+0x189/0x380
> [ 363.117709] worker_thread+0x4d/0x3c0
> [ 363.117713] kthread+0xf8/0x130
> [ 363.117717] ? process_one_work+0x380/0x380
> [ 363.117720] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.117723] ret_from_fork+0x32/0x40
> [ 363.117731] INFO: task kworker/5:2:437 blocked for more than 120 seconds.
> [ 363.117828] Not tainted 4.15.0-rc8-zram #22
> [ 363.117920] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.118039] kworker/5:2 D 0 437 2 0x80000000
> [ 363.118044] Workqueue: events async_pf_execute
> [ 363.118046] Call Trace:
> [ 363.118049] ? __schedule+0x3d0/0x850
> [ 363.118052] schedule+0x32/0x80
> [ 363.118056] io_schedule+0x12/0x40
> [ 363.118060] __lock_page_or_retry+0x302/0x320
> [ 363.118064] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.118067] do_swap_page+0x4dd/0x870
> [ 363.118070] __handle_mm_fault+0x790/0x10c0
> [ 363.118074] handle_mm_fault+0xc6/0x1b0
> [ 363.118077] __get_user_pages+0xf9/0x620
> [ 363.118080] ? update_load_avg+0x5c0/0x6f0
> [ 363.118083] get_user_pages_remote+0x137/0x1f0
> [ 363.118087] async_pf_execute+0x62/0x180
> [ 363.118092] process_one_work+0x189/0x380
> [ 363.118096] worker_thread+0x4d/0x3c0
> [ 363.118100] kthread+0xf8/0x130
> [ 363.118104] ? process_one_work+0x380/0x380
> [ 363.118107] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.118113] ? do_group_exit+0x3a/0xa0
> [ 363.118116] ret_from_fork+0x32/0x40
> [ 363.118120] INFO: task kworker/8:2:501 blocked for more than 120 seconds.
> [ 363.118217] Not tainted 4.15.0-rc8-zram #22
> [ 363.118309] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.118427] kworker/8:2 D 0 501 2 0x80000000
> [ 363.118433] Workqueue: events async_pf_execute
> [ 363.118435] Call Trace:
> [ 363.118438] ? __schedule+0x3d0/0x850
> [ 363.118441] schedule+0x32/0x80
> [ 363.118445] io_schedule+0x12/0x40
> [ 363.118448] __lock_page_or_retry+0x302/0x320
> [ 363.118452] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.118455] do_swap_page+0x4dd/0x870
> [ 363.118459] __handle_mm_fault+0x790/0x10c0
> [ 363.118463] handle_mm_fault+0xc6/0x1b0
> [ 363.118465] __get_user_pages+0xf9/0x620
> [ 363.118469] ? update_load_avg+0x5c0/0x6f0
> [ 363.118471] get_user_pages_remote+0x137/0x1f0
> [ 363.118476] async_pf_execute+0x62/0x180
> [ 363.118481] process_one_work+0x189/0x380
> [ 363.118485] worker_thread+0x4d/0x3c0
> [ 363.118489] kthread+0xf8/0x130
> [ 363.118492] ? process_one_work+0x380/0x380
> [ 363.118495] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.118498] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 363.118502] ret_from_fork+0x32/0x40
> [ 363.118511] INFO: task qemu-system-x86:943 blocked for more than 120
> seconds.
> [ 363.118609] Not tainted 4.15.0-rc8-zram #22
> [ 363.118701] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [ 363.118820] qemu-system-x86 D 0 943 931 0x00000000
> [ 363.118822] Call Trace:
> [ 363.118826] ? __schedule+0x3d0/0x850
> [ 363.118829] schedule+0x32/0x80
> [ 363.118833] io_schedule+0x12/0x40
> [ 363.118837] __lock_page+0x109/0x130
> [ 363.118840] ? page_cache_tree_insert+0xb0/0xb0
> [ 363.118844] deferred_split_scan+0x1e9/0x2a0
> [ 363.118848] shrink_slab.part.49+0x1e6/0x3d0
> [ 363.118852] shrink_node+0x2e7/0x2f0
> [ 363.118856] do_try_to_free_pages+0xd5/0x320
> [ 363.118859] try_to_free_pages+0xd6/0x190
> [ 363.118864] __alloc_pages_slowpath+0x34d/0xdc0
> [ 363.118869] __alloc_pages_nodemask+0x214/0x230
> [ 363.118873] do_huge_pmd_anonymous_page+0x13a/0x610
> [ 363.118877] __handle_mm_fault+0xe04/0x10c0
> [ 363.118879] ? kvm_vcpu_mmap+0x20/0x20
> [ 363.118884] ? bsearch+0x52/0x90
> [ 363.118887] handle_mm_fault+0xc6/0x1b0
> [ 363.118890] __get_user_pages+0xf9/0x620
> [ 363.118893] get_user_pages+0x3e/0x50
> [ 363.118898] __gfn_to_pfn_memslot+0xff/0x3d0
> [ 363.118903] try_async_pf+0x53/0x1d0
> [ 363.118907] tdp_page_fault+0x10e/0x260
> [ 363.118912] ? vmexit_fill_RSB+0x11/0x30
> [ 363.118915] kvm_mmu_page_fault+0x59/0x130
> [ 363.118920] vmx_handle_exit+0x9f/0x1530
> [ 363.118924] ? vmexit_fill_RSB+0x11/0x30
> [ 363.118927] ? vmx_vcpu_run+0x32f/0x4d0
> [ 363.118932] kvm_arch_vcpu_ioctl_run+0x90c/0x1670
> [ 363.118936] ? handle_machine_check+0x10/0x10
> [ 363.118938] ? kvm_arch_vcpu_load+0x68/0x250
> [ 363.118943] ? kvm_vcpu_ioctl+0x2e8/0x580
> [ 363.118946] kvm_vcpu_ioctl+0x2e8/0x580
> [ 363.118952] do_vfs_ioctl+0x92/0x5f0
> [ 363.118955] ? handle_mm_fault+0xc6/0x1b0
> [ 363.118960] ? kvm_on_user_return+0x68/0xa0
> [ 363.118964] SyS_ioctl+0x74/0x80
> [ 363.118969] entry_SYSCALL_64_fastpath+0x24/0x87
> [ 363.118972] RIP: 0033:0x7fcc51943dd7
>
> --
>
> [ 967.078557] BUG: Bad page state in process qemu-system-x86 pfn:3f6d853
> [ 967.078637] page:000000004d26db38 count:0 mapcount:0 mapping: (null)
> index:0x1
> [ 967.078715] flags: 0x17fffc000000008(uptodate)
> [ 967.078786] raw: 017fffc000000008 0000000000000000 0000000000000001
> 00000000ffffffff
> [ 967.078863] raw: dead000000000100 dead000000000200 0000000000000000
> 0000000000000000
> [ 967.078939] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
> [ 967.079012] bad because of flags: 0x8(uptodate)
> [ 967.079081] Modules linked in: lz4 lz4_compress zram
> [ 967.079085] CPU: 2 PID: 946 Comm: qemu-system-x86 Not tainted
> 4.15.0-rc2-zram #4
> [ 967.079086] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b
> 05/02/2017
> [ 967.079087] Call Trace:
> [ 967.079093] dump_stack+0x5c/0x84
> [ 967.079097] bad_page+0xba/0x120
> [ 967.079098] get_page_from_freelist+0xfe7/0x1230
> [ 967.079101] __alloc_pages_nodemask+0xea/0x230
> [ 967.079104] alloc_pages_vma+0x7c/0x1c0
> [ 967.079106] do_swap_page+0x474/0x870
> [ 967.079109] ? do_huge_pmd_anonymous_page+0x417/0x610
> [ 967.079110] __handle_mm_fault+0xa53/0x1160
> [ 967.079112] handle_mm_fault+0xc6/0x1b0
> [ 967.079114] __get_user_pages+0xf9/0x620
> [ 967.079117] get_user_pages+0x3e/0x50
> [ 967.079119] __gfn_to_pfn_memslot+0xff/0x3d0
> [ 967.079122] try_async_pf+0x53/0x1c0
> [ 967.079124] tdp_page_fault+0x10e/0x260
> [ 967.079125] kvm_mmu_page_fault+0x53/0x130
> [ 967.079128] vmx_handle_exit+0x9c/0x1500
> [ 967.079129] ? atomic_switch_perf_msrs+0x5f/0x80
> [ 967.079130] ? vmx_vcpu_run+0x30a/0x4b0
> [ 967.079133] kvm_arch_vcpu_ioctl_run+0x8dc/0x15e0
> [ 967.079135] ? kvm_arch_vcpu_load+0x62/0x230
> [ 967.079136] ? kvm_vcpu_ioctl+0x2e8/0x580
> [ 967.079137] kvm_vcpu_ioctl+0x2e8/0x580
> [ 967.079141] ? wake_up_q+0x70/0x70
> [ 967.079144] do_vfs_ioctl+0x8f/0x5e0
> [ 967.079147] ? kvm_on_user_return+0x68/0xa0
> [ 967.079148] SyS_ioctl+0x74/0x80
> [ 967.079152] entry_SYSCALL_64_fastpath+0x1e/0x81
> [ 967.079154] RIP: 0033:0x7f4410161dd7
> [ 967.079154] RSP: 002b:00007f43eeffc8b8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [ 967.079156] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX:
> 00007f4410161dd7
> [ 967.079156] RDX: 0000000000000000 RSI: 000000000000ae80 RDI:
> 0000000000000013
> [ 967.079157] RBP: 000055c5a36f8e50 R08: 000055c5a0fa73d0 R09:
> 00000000ffffffff
> [ 967.079158] R10: 003b83305f306bdf R11: 0000000000000246 R12:
> 0000000000000000
> [ 967.079158] R13: 00007f44157ad000 R14: 0000000000000000 R15:
> 000055c5a36f8e50
> [ 967.079160] Disabling lock debugging due to kernel taint
>
> --
> Kind regards,
>
> Tino Lehnig