Re: [PATCH v7 00/10] per lruvec lru_lock for memcg

From: Alex Shi
Date: Tue Jan 14 2020 - 04:16:29 EST



> I tried with the mods you had appended, from [PATCH v7 02/10]
> discussion with Konstantin: no, still crashes in a similar way.
>
> Does your github tree have other changes too? I see it says "Latest
> commit e05d0dd 22 days ago", which doesn't seem to fit. Afraid I
> don't have time to test many variations.

Thanks a lot for testing! The github version is the same as the one you
tested. The github branches page has a bug: it doesn't show the correct
update time,
https://github.com/alexshi/linux/branches
while the detailed page does:
https://github.com/alexshi/linux/tree/lru-next
>
> It looks like, in my case, systemd was usually jumping in and doing
> something with shmem (perhaps via memfd) that read back from swap
> and triggered the crash without any further intervention from me.
>
> So please try booting with mem=700M and 1.5G swap,
> mount -t tmpfs -o size=470M tmpfs /tst
> cp /dev/zero /tst; cp /tst/zero /dev/null
>
> That's enough to crash it for me, without getting into any losetup or
> systemd complications. But you might have to adjust the numbers to be
> sure of writing out and reading back from swap.
>
> It's swap to SSD in my case, don't think that matters. I happen to
> run with swappiness 100 (precisely to help generate swap problems),
> but swappiness 60 is good enough to get these crashes.
>

I did use 700M of memory and 1.5G of swap in my qemu, but with a swapfile
rather than a swap disk:
qemu-system-x86_64 -smp 4 -enable-kvm -cpu SandyBridge \
-m 700M -kernel /home/kuiliang.as/linux/qemulru/arch/x86/boot/bzImage \
-append "earlyprintk=ttyS0 root=/dev/sda1 console=ttyS0 debug crashkernel=128M printk.devkmsg=on " \
-hda /home/kuiliang.as/rootimages/CentOS-7-x86_64-Azure-1703.qcow2 \
-hdb /home/kuiliang.as/rootimages/hdb.qcow2 \
--nographic

Anyway, although I couldn't reproduce the crash, I did find a bug in my
debug function:
VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != page->mem_cgroup, page);

If page->mem_cgroup is NULL, this check can trigger, so it seems to be a
bug in the debug function rather than a real issue. The 9th patch should
be replaced by the following new patch.
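
To illustrate the false positive, here is a minimal user-space sketch. It
is not the replacement patch itself. The struct definitions are simplified
stand-ins for the kernel types, and strict_mismatch(), relaxed_mismatch()
and run_example() are hypothetical names made up for this illustration. It
assumes that a page with no memcg ends up on the root memcg's lruvec, so
the plain pointer comparison reports a mismatch; skipping pages with a NULL
mem_cgroup is one possible way to avoid that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct mem_cgroup { int id; };
struct lruvec { struct mem_cgroup *memcg; };
struct page { struct mem_cgroup *mem_cgroup; };

/* Mirrors the condition in the original VM_BUG_ON_PAGE():
 * true whenever the lruvec's memcg differs from the page's memcg. */
bool strict_mismatch(const struct lruvec *lruvec, const struct page *page)
{
	return lruvec->memcg != page->mem_cgroup;
}

/* Hypothetical relaxed condition: a page without a memcg is never
 * reported; only a charged page on the wrong lruvec is. */
bool relaxed_mismatch(const struct lruvec *lruvec, const struct page *page)
{
	return page->mem_cgroup != NULL && lruvec->memcg != page->mem_cgroup;
}

/* Small self-check covering the three cases of interest. */
void run_example(void)
{
	struct mem_cgroup root = { 0 }, child = { 1 };
	struct lruvec root_lruvec = { &root };
	struct page uncharged = { NULL };   /* no memcg, sits on root lruvec */
	struct page on_root = { &root };    /* charged to root, correct lruvec */
	struct page misplaced = { &child }; /* charged to child, wrong lruvec */

	assert(strict_mismatch(&root_lruvec, &uncharged));   /* false positive */
	assert(!relaxed_mismatch(&root_lruvec, &uncharged)); /* suppressed */
	assert(!relaxed_mismatch(&root_lruvec, &on_root));   /* consistent */
	assert(relaxed_mismatch(&root_lruvec, &misplaced));  /* still caught */
}
```

In this model the relaxed condition still catches a charged page that sits
on the wrong lruvec, which is what the debug check is meant to detect.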

Many thanks for testing!
Alex