[BUG] 4.10-rc8 - ping spinning?

From: lkml
Date: Thu Feb 16 2017 - 06:03:05 EST


Hello list,

Some rtl8192cu bugs of old got me in the habit of running ping in a shelved (i.e. forgotten) xterm, a harmless practice which seemed to prevent the rtl8192cu device from dying.

This evening the system started getting very slow and to my surprise I found
this in `top`:
5115 swivel 30 10 14772 1928 1756 R 90.9 0.0 1351:41 ping
9005 swivel 30 10 14772 1892 1724 R 90.9 0.0 1354:26 ping

This is a dual-core machine (X61s, Core 2 Duo 1.8GHz), and those processes are
burning all the free CPU in system context. They're identical commands,
just plain `ping domain.com`, to the same host. It appears I accidentally
(fortuitously?) had two running, which made this event more interesting.

I can assert that these did not begin spinning simultaneously - as you can see
by the cumulative time in `top` there's a small delta. I also use a window
manager with built-in continuous process monitoring, and when I noticed this
was happening I was able to see that one of the processes had only recently
begun spinning, while the other had been spinning long enough for its start to
fall off the chart (at least ~17 minutes ago).
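
For what it's worth, here is a quick sketch of that delta. It assumes the
cumulative time column in the `top` lines above reads as minutes:seconds of
CPU time (my reading of the format, not something verified here), and the
`cpu_time_seconds` helper is just a name made up for illustration:

```python
def cpu_time_seconds(field: str) -> int:
    """Convert a 'minutes:seconds' cumulative-time field to seconds.

    The minutes:seconds interpretation is an assumption about top's format.
    """
    minutes, seconds = field.split(":")
    return int(minutes) * 60 + int(seconds)


# Values copied from the two ping lines in the top output above.
pid_5115 = cpu_time_seconds("1351:41")
pid_9005 = cpu_time_seconds("1354:26")

delta = pid_9005 - pid_5115
print(f"delta: {delta // 60}m{delta % 60}s")
```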

This hasn't occurred before AFAIK, but I haven't spent much time in 4.10 yet.
I'm pretty confident this didn't happen in 4.9 which I ran for quite a while.

`strace` of one of the aforementioned processes:

1487241315.073568 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000022>
1487241315.073665 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
1487241315.073747 gettimeofday({1487241315, 73774}, NULL) = 0 <0.000021>
1487241315.073829 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000025>
1487241315.073927 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
1487241315.074024 gettimeofday({1487241315, 74050}, NULL) = 0 <0.000256>
1487241315.074352 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000026>
1487241315.076241 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000022>
1487241315.076337 gettimeofday({1487241315, 76366}, NULL) = 0 <0.000020>
1487241315.076422 poll([{fd=3, events=POLLIN|POLLERR}], 1, 924) = 1 ([{fd=3, revents=POLLERR}]) <0.000025>
1487241315.076523 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000025>
1487241315.079770 gettimeofday({1487241315, 79799}, NULL) = 0 <0.000019>
1487241315.079855 poll([{fd=3, events=POLLIN|POLLERR}], 1, 921) = 1 ([{fd=3, revents=POLLERR}]) <0.000024>
1487241315.079956 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000021>
1487241315.080057 gettimeofday({1487241315, 80084}, NULL) = 0 <0.000020>
1487241315.080140 poll([{fd=3, events=POLLIN|POLLERR}], 1, 921) = 1 ([{fd=3, revents=POLLERR}]) <0.000024>
1487241315.080238 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000021>
1487241315.080322 gettimeofday({1487241315, 80350}, NULL) = 0 <0.000020>
1487241315.080406 poll([{fd=3, events=POLLIN|POLLERR}], 1, 920) = 1 ([{fd=3, revents=POLLERR}]) <0.000023>
1487241315.080502 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000019>
1487241315.080583 gettimeofday({1487241315, 80610}, NULL) = 0 <0.000018>
1487241315.080663 poll([{fd=3, events=POLLIN|POLLERR}], 1, 920) = 1 ([{fd=3, revents=POLLERR}]) <0.000024>
1487241315.080761 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
1487241315.080843 gettimeofday({1487241315, 80870}, NULL) = 0 <0.000020>
1487241315.080925 poll([{fd=3, events=POLLIN|POLLERR}], 1, 920) = 1 ([{fd=3, revents=POLLERR}]) <0.000037>
1487241315.081037 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
1487241315.081119 gettimeofday({1487241315, 81147}, NULL) = 0 <0.000020>
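
Note that each poll() above returns immediately with POLLERR instead of
waiting out its ~920 ms timeout, and the following recvmsg() gets EAGAIN, so
the loop never sleeps. To put a rough number on how tight the loop is, here is
a small sketch that counts the POLLERR polls in a trace like this one and
estimates the iteration rate. The regex and the `spin_rate` helper are my own
illustration and assume the timestamped line format shown above; strace itself
slows the process down, so treat the result as a ballpark figure only.

```python
import re
from decimal import Decimal

# Matches lines such as:
# 1487241315.073568 poll([...], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000022>
POLL_ERR = re.compile(r"^(\d+\.\d+) poll\(.*revents=POLLERR")


def spin_rate(trace: str):
    """Return (count, polls_per_second) for polls that reported POLLERR."""
    # Decimal keeps the microsecond digits exact for epoch-sized timestamps.
    stamps = [
        Decimal(m.group(1))
        for line in trace.splitlines()
        if (m := POLL_ERR.match(line.strip()))
    ]
    if len(stamps) < 2:
        return len(stamps), 0.0
    elapsed = stamps[-1] - stamps[0]
    if elapsed <= 0:
        return len(stamps), 0.0
    return len(stamps), float((len(stamps) - 1) / elapsed)


# First four lines of the trace above, used as sample input.
sample = """\
1487241315.073568 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000022>
1487241315.073665 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
1487241315.073747 gettimeofday({1487241315, 73774}, NULL) = 0 <0.000021>
1487241315.073829 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000025>
"""

count, rate = spin_rate(sample)
print(f"{count} POLLERR polls, roughly {rate:.0f} iterations/s")
```

Feeding it a longer capture (e.g. the full `strace` output saved to a file)
should give a steadier estimate than these few lines.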


Additionally, while writing this email on an unrelated machine that was just
upgraded to rc8, the `vim` I was using to write it entered a seemingly endless
loop of select() returning 1 followed by a dozen brk() calls. That may or may
not be related. If it is, it's worth noting that machine is a uniprocessor box
running a non-SMP kernel, an old dedicated server.

I don't know how reproducible this is... if there's anything in particular
anyone wants me to do, let me know! I'm not sure how long I can tolerate these
spinning processes, since the X61s tends to overheat.

Regards,
Vito Caputo