Re: next-20241001: WARNING: at mm/list_lru.c:77 list_lru_del (mm/list_lru.c:212 mm/list_lru.c:200)
From: Dan Carpenter
Date: Wed Oct 02 2024 - 07:24:35 EST
Let's add Kairui Song to the CC list.
One simple thing is that we should add a READ_ONCE() to the comparison. Naresh,
could you test the attached diff? I don't know that it will fix it but it's
worth checking the easy stuff first.
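To illustrate what the change does, here is a minimal userspace sketch. It is not the kernel code: READ_ONCE_SKETCH, struct lru_one_sketch and lru_check_sketch() are made-up names, and the macro is a simplified stand-in for the kernel's READ_ONCE() that assumes GCC or clang for __typeof__. It only shows both comparisons going through a volatile load, as in the attached diff.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Simplified stand-in for the kernel's READ_ONCE(): the volatile-qualified
 * access stops the compiler from caching, merging or re-fetching the load.
 * (Assumption: GCC/clang, for __typeof__.)
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical cut-down version of the per-list structure. */
struct lru_one_sketch {
	long nr_items;
};

/*
 * Mirrors the shape of the check in the diff: LONG_MIN is treated as the
 * "list is gone" sentinel, and any other negative count is flagged.
 * Returns true if the list is usable; *warn reports a negative count.
 */
static bool lru_check_sketch(struct lru_one_sketch *l, bool *warn)
{
	*warn = false;
	if (READ_ONCE_SKETCH(l->nr_items) != LONG_MIN) {
		*warn = READ_ONCE_SKETCH(l->nr_items) < 0;
		return true;
	}
	return false;
}
```

Whether the marked access matters here depends on whether nr_items can change under the reader, which is the open question in this thread.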
regards,
dan carpenter
On Wed, Oct 02, 2024 at 04:40:36PM +0530, Naresh Kamboju wrote:
> The following kernel warnings have been occurring on arm64 DUT and qemu-arm64
> running Linux next-20240930, next-20241001 and next-20241002 while
> booting the kernel.
>
> This is an intermittent warning noticed on arm64
> - Juno-r2
> - Dragonboard-410c
> - Qemu-arm64
>
> First seen on next-20240930
>
> Good: next-20240927
> BAD: next-20240930..next-20241002
>
> Since this is an intermittent problem, it is hard to bisect.
>
> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
>
> Warning log:
> ----------
> <4>[ 26.293906] ------------[ cut here ]------------
> <4>[ 26.295948] WARNING: CPU: 1 PID: 1 at mm/list_lru.c:77
> list_lru_del (mm/list_lru.c:212 mm/list_lru.c:200)
> <4>[ 26.299608] Modules linked in: fuse drm backlight ip_tables x_tables
> <4>[ 26.308212] CPU: 1 UID: 0 PID: 1 Comm: systemd Not tainted
> 6.12.0-rc1-next-20241001 #1
> <4>[ 26.310552] Hardware name: linux,dummy-virt (DT)
> <4>[ 26.313304] pstate: 23400009 (nzCv daif +PAN -UAO +TCO +DIT
> -SSBS BTYPE=--)
> <4>[ 26.315519] pc : list_lru_del (mm/list_lru.c:212 mm/list_lru.c:200)
> <4>[ 26.316457] lr : list_lru_del (mm/list_lru.c:76 mm/list_lru.c:200)
> <4>[ 26.317603] sp : ffff80008002b950
> <4>[ 26.319015] x29: ffff80008002b950 x28: fff00000c0540240 x27:
> 0000000000000000
> <4>[ 26.321155] x26: fff00000c2dce690 x25: 8000000000000000 x24:
> 0000000000000000
> <4>[ 26.322653] x23: fff00000c0c4e900 x22: fff00000c12f4478 x21:
> fff00000c12f4458
> <4>[ 26.324697] x20: fff00000c1b14800 x19: fff00000c0542088 x18:
> 0000000000000000
> <4>[ 26.326121] x17: 0000000000000000 x16: 0000000000000000 x15:
> 0000000000000000
> <4>[ 26.327590] x14: 0000000000000000 x13: fff00000c146b940 x12:
> 0000000000000005
> <4>[ 26.329087] x11: 0000000000000000 x10: 0000000000000402 x9 :
> 0000000000000003
> <4>[ 26.330650] x8 : ffffffffffffffff x7 : 0000000023d53570 x6 :
> 0000000023d53570
> <4>[ 26.332484] x5 : 00000000000f000c x4 : ffffc1ffc3032e20 x3 :
> fff00000c2f70800
> <4>[ 26.334759] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
> 0000000000000001
> <4>[ 26.338095] Call trace:
> <4>[ 26.339907] list_lru_del (mm/list_lru.c:212 mm/list_lru.c:200)
> <4>[ 26.340990] list_lru_del_obj (mm/list_lru.c:221)
> <4>[ 26.341972] d_lru_del (fs/dcache.c:463)
> <4>[ 26.342794] to_shrink_list (fs/dcache.c:477 fs/dcache.c:887)
> <4>[ 26.343615] select_collect (fs/dcache.c:0)
> <4>[ 26.344524] d_walk (fs/dcache.c:1278)
> <4>[ 26.345384] shrink_dcache_parent (include/linux/list.h:373 fs/dcache.c:1511)
> <4>[ 26.346512] d_invalidate (fs/dcache.c:1617)
> <4>[ 26.347451] proc_invalidate_siblings_dcache (fs/proc/inode.c:143)
> <4>[ 26.348744] proc_flush_pid (fs/proc/base.c:3480)
> <4>[ 26.349747] release_task (kernel/exit.c:281)
> <4>[ 26.350810] wait_consider_task (kernel/exit.c:1253 kernel/exit.c:1477)
> <4>[ 26.352093] __do_wait (kernel/exit.c:1617 kernel/exit.c:1651)
> <4>[ 26.353151] do_wait (kernel/exit.c:1693)
> <4>[ 26.353958] __arm64_sys_waitid (kernel/exit.c:1775
> kernel/exit.c:1788 kernel/exit.c:1783 kernel/exit.c:1783)
> <4>[ 26.359772] invoke_syscall (arch/arm64/kernel/syscall.c:50)
> <4>[ 26.360706] el0_svc_common (include/linux/thread_info.h:127
> arch/arm64/kernel/syscall.c:140)
> <4>[ 26.361477] do_el0_svc (arch/arm64/kernel/syscall.c:152)
> <4>[ 26.362218] el0_svc (arch/arm64/kernel/entry-common.c:165
> arch/arm64/kernel/entry-common.c:178
> arch/arm64/kernel/entry-common.c:713)
> <4>[ 26.363014] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:765)
> <4>[ 26.364138] el0t_64_sync (arch/arm64/kernel/entry.S:598)
> <4>[ 26.365321] ---[ end trace 0000000000000000 ]---
>
> boot Log links,
> --------
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241001/testrun/25235075/suite/log-parser-boot/test/check-kernel-exception-warning-cpu-pid-at-mmlist_lruc-list_lru_del/log
> - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2mp2m5m4PnjJgdix32h7pIGe63Y/logs?format=html
>
> Test results history:
> ----------
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241002/testrun/25242215/suite/log-parser-boot/test/check-kernel-exception-warning-cpu-pid-at-mmlist_lruc-list_lru_del/history/
>
> metadata:
> ----
> git describe: next-20241001
> git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
> git sha: 77df9e4bb2224d8ffbddec04c333a9d7965dad6c
> kernel config:
> - https://storage.tuxsuite.com/public/linaro/lkft/builds/2mp2jhmSKhlF6c0x1SBsJFyBbTq/config
> build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2mp2jhmSKhlF6c0x1SBsJFyBbTq/
> toolchain: clang-19 and gcc-13
>
> Steps to reproduce:
> ---------
> - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2mp2m5m4PnjJgdix32h7pIGe63Y/reproducer
> - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2mp2m5m4PnjJgdix32h7pIGe63Y/tux_plan
>
> --
> Linaro LKFT
> https://lkft.linaro.org
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 79c2d21504a2..a9a8b02e056a 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -74,7 +74,7 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 		else
 			spin_lock(&l->lock);
 		if (likely(READ_ONCE(l->nr_items) != LONG_MIN)) {
-			WARN_ON(l->nr_items < 0);
+			WARN_ON(READ_ONCE(l->nr_items) < 0);
 			rcu_read_unlock();
 			return l;
 		}