Re: Zram writeback feature unstable with heavy swap utilization - BUG: Bad page state in process...
From: Tino Lehnig
Date: Wed Jul 25 2018 - 11:12:18 EST
Hi,
On 07/25/2018 03:21 PM, Minchan Kim wrote:
> It would be very helpful if you could check more versions with git-bisect.
I started bisecting today, but my results are not conclusive yet. It is
certain that the problem started with 4.15 though. I have not
encountered the bug message in 4.15-rc1 so far, but the kvm processes
always became unresponsive after hitting swap and could not be killed
there. I saw the same behavior in rc2, rc3, and other builds in between,
but the bad state bug would only trigger occasionally there. The
behavior in 4.15.18 is the same as in newer kernels.
> I also want to reproduce it.
> Today, I downloaded a Windows ISO and ran it as a CD-ROM with my own
> compiled kernel on KVM, but I couldn't reproduce the problem.
> I also tested some heavy swap workloads (a kernel build with multiple
> CPUs on small memory), but I failed to reproduce it there, too.
> Could you please tell me your method in more detail?
I found that running Windows in KVM really is the only reliable method,
maybe because the zero pages are easily compressible. There is actually
not a lot of disk utilization on the backing device when running this test.
My operating system is a minimal install of Debian 9. I took the kernel
configuration from the default Debian kernel and built my own kernel
with "make oldconfig" leaving all settings at their defaults. The only
thing I changed in the configuration was enabling the zram writeback
feature.
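For anyone repeating this setup, a minimal sketch of checking that the writeback option actually ended up in the generated configuration might look like the following. It assumes the option is the Kconfig symbol CONFIG_ZRAM_WRITEBACK (the message above does not name it), and the helper names config_value and zram_writeback_enabled are made up for illustration.

```python
def config_value(config_text, symbol):
    """Return the value assigned to a Kconfig symbol, or None if it is unset."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set" comments,
        # so comment lines and lines without "=" carry no assignment.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == symbol:
            return value
    return None


def zram_writeback_enabled(config_text):
    # Assumption: the writeback feature is controlled by CONFIG_ZRAM_WRITEBACK.
    return config_value(config_text, "CONFIG_ZRAM_WRITEBACK") == "y"
```

Passing the contents of the .config file in the kernel build tree to zram_writeback_enabled() returns True only when the option is built in.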
All my tests were done on bare-metal hardware with Xeon processors and
lots of RAM. I encounter the bug quite quickly, but it still takes
several GBs of swap usage. Below is my /proc/meminfo with enough KVM
instances running (3 in my case) to trigger the bug on my test machine.
I will also try to reproduce the problem on some different hardware next.
--
MemTotal: 264033384 kB
MemFree: 1232968 kB
MemAvailable: 0 kB
Buffers: 1152 kB
Cached: 5036 kB
SwapCached: 49200 kB
Active: 249955744 kB
Inactive: 5096148 kB
Active(anon): 249953396 kB
Inactive(anon): 5093084 kB
Active(file): 2348 kB
Inactive(file): 3064 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 1073741820 kB
SwapFree: 938603260 kB
Dirty: 68 kB
Writeback: 0 kB
AnonPages: 255007752 kB
Mapped: 4708 kB
Shmem: 1212 kB
Slab: 88500 kB
SReclaimable: 16096 kB
SUnreclaim: 72404 kB
KernelStack: 5040 kB
PageTables: 765560 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1205758512 kB
Committed_AS: 403586176 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 254799872 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 75136 kB
DirectMap2M: 10295296 kB
DirectMap1G: 260046848 kB
--
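The figures most relevant to this report can be pulled out of the dump above with a short script. This is only a sketch: the function names parse_meminfo and summarize are illustrative, and SAMPLE copies four lines from the dump. On these numbers it works out to roughly 129 GiB of swap in use, with almost all anonymous memory backed by transparent huge pages.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB or a count)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip separators such as "--" and anything that is not "Name: value".
        if not sep or not parts:
            continue
        fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Derive swap usage and the share of anonymous memory held in huge pages."""
    swap_used_kb = fields["SwapTotal"] - fields["SwapFree"]
    return {
        "swap_used_gib": swap_used_kb / (1024 * 1024),
        "thp_share_of_anon": fields["AnonHugePages"] / fields["AnonPages"],
    }


# Four lines copied from the dump above.
SAMPLE = """\
SwapTotal: 1073741820 kB
SwapFree: 938603260 kB
AnonPages: 255007752 kB
AnonHugePages: 254799872 kB
"""

if __name__ == "__main__":
    print(summarize(parse_meminfo(SAMPLE)))
```

On a live system the same functions can be fed the contents of /proc/meminfo instead of SAMPLE.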
Kind regards,
Tino Lehnig