Re: SIOCGIWRATE (and others) leaking SLAB (kmalloc-2048)
From: Stefan Seyfried
Date: Sun Sep 30 2018 - 05:43:43 EST
On 30.09.18 at 11:24, Stefan Seyfried wrote:
> Hi all,
>
> I'm running the latest rc kernels provided by openSUSE and found that
> after 2 weeks, about 4GB of memory had leaked into kmalloc-2048.
> Investigating with trace-cmd led me to the gkrellm-wifi plugin, which
> periodically polls via wireless extension ioctls.
>
> This is an easy-to-use reproducer:
> [...]
>
> http://paste.opensuse.org/75377254
>
> # grep ^kmalloc-2048 /proc/slabinfo
> kmalloc-2048 168090 168106 2048 2 1 : tunables 24 12 8 : slabdata 84053 84053 0
> # ./leak air
> # grep ^kmalloc-2048 /proc/slabinfo
> kmalloc-2048 178086 178086 2048 2 1 : tunables 24 12 8 : slabdata 89043 89043 0
>
> The same happens with SIOCGIWSTATS and when reading /proc/net/wireless:
>
> #!/bin/bash
> for ((i=0; i<10000; i++)); do
> while read line; do :; done < /proc/net/wireless
> done
>
> This makes me suspect an issue with rdev_get_station, but OTOH "iw air
> station dump", which I think also calls rdev_get_station via
> nl80211_get_station(), does not seem to leak.
trace-cmd gives this for the above script:
%99.28 (15289) leak.sh kmalloc #10001
|
--- *kmalloc*
kmem_cache_alloc_trace
cfg80211_sinfo_alloc_tid_stats
sta_set_sinfo
ieee80211_get_station
cfg80211_wireless_stats
wireless_dev_seq_show
seq_read
proc_reg_read
__vfs_read
vfs_read
ksys_read
do_syscall_64
entry_SYSCALL_64_after_hwframe
("trace-cmd record -T -e kmalloc -f bytes_alloc==2048", then run the
script, then "trace-cmd hist trace.dat")
This points me to sta_set_sinfo and then to this commit:
commit 0fdf1493b41eb64fc7e8c8e1b8830a4bd8c4bbca
Author: Johannes Berg <johannes.berg@xxxxxxxxx>
Date: Fri May 18 11:40:44 2018 +0200
mac80211: allocate and fill tidstats only when needed
This fixes memory leaks in the case where we just have the
station info on the stack for internal usage without sending
it to cfg80211.
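which seems to be incomplete, since the allocation reached via
cfg80211_wireless_stats in the trace above is apparently never freed.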
> I did not see this with 4.18; I noticed it with 4.19-rc3 after 13
> days and still see it with -rc5.
Looking at the commit, the bug might already have been present in 4.18,
which I'm trying to confirm right now.
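
To put a number on the growth between two /proc/slabinfo samples like
the ones quoted above, a small helper along these lines could be used.
This is only a sketch: slab_delta is a hypothetical name, not part of
the original reproducer, and it assumes the usual slabinfo column order
(name, active_objs, num_objs, objsize, ...).

```bash
#!/bin/bash
# Hypothetical helper, not part of the original reproducer.
# Takes a "before" and an "after" /proc/slabinfo line for the same cache
# and prints how many active objects (and bytes) were added in between.
# Assumed column order: name active_objs num_objs objsize ...
slab_delta() {
    local _name before_objs after_objs _num size _rest
    read -r _name before_objs _rest <<<"$1"
    read -r _name after_objs _num size _rest <<<"$2"
    local delta=$(( after_objs - before_objs ))
    echo "$delta objects, $(( delta * size )) bytes"
}

# Example use around a test run (reading /proc/slabinfo usually needs root):
#   before=$(grep '^kmalloc-2048 ' /proc/slabinfo)
#   ... run the reproducer ...
#   after=$(grep '^kmalloc-2048 ' /proc/slabinfo)
#   slab_delta "$before" "$after"
```

Fed the two kmalloc-2048 lines from the report, this prints
"9996 objects, 20471808 bytes", i.e. roughly 20 MB of growth in that
cache across one run of ./leak. Other users of the cache can add noise,
so the figure is an estimate rather than an exact leak count.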
--
Stefan Seyfried
"For a successful technology, reality must take precedence over
public relations, for nature cannot be fooled." -- Richard Feynman