Re: [BUG] 4.10-rc8 - ping spinning?

From: lkml
Date: Thu Feb 16 2017 - 06:17:50 EST


On Thu, Feb 16, 2017 at 03:03:03AM -0800, lkml@xxxxxxxxxxx wrote:
> Hello list,
>
> Some rtl8192cu bugs of old got me in the habit of running ping in a shelved (i.e. forgotten) xterm, a harmless practice which seemed to prevent the rtl8192cu device from dying.
>
> This evening the system started getting very slow and to my surprise I found
> this in `top`:
> 5115 swivel 30 10 14772 1928 1756 R 90.9 0.0 1351:41 ping
> 9005 swivel 30 10 14772 1892 1724 R 90.9 0.0 1354:26 ping
>
> This is a dual core machine (X61s, core2duo 1.8Ghz), those processes are
> burning all the free CPU in the system context. They're identical commands,
> just plain `ping domain.com`, to the same host. It appears I accidentally
> (fortuitously?) had two running, which made this event more interesting.
>
> I can assert that these did not begin spinning simultaneously - as you can see
> by the cumulative time in `top` there's a small delta. I also use a window
> manager with builtin continuous process monitoring, and when I noticed this was
> happening I was able to see that one of the processes had only recently begun
> spinning, the other was spinning long enough to have its start fall off the
> chart (at least ~17 minutes ago).
>
> This hasn't occurred before AFAIK, but I haven't spent much time in 4.10 yet.
> I'm pretty confident this didn't happen in 4.9 which I ran for quite a while.
>
> `strace` of one of the aforementioned processes:
>
> 1487241315.073568 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000022>
> 1487241315.073665 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
> 1487241315.073747 gettimeofday({1487241315, 73774}, NULL) = 0 <0.000021>
> 1487241315.073829 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000025>
> 1487241315.073927 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
> 1487241315.074024 gettimeofday({1487241315, 74050}, NULL) = 0 <0.000256>
> 1487241315.074352 poll([{fd=3, events=POLLIN|POLLERR}], 1, 927) = 1 ([{fd=3, revents=POLLERR}]) <0.000026>
> 1487241315.076241 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000022>
> 1487241315.076337 gettimeofday({1487241315, 76366}, NULL) = 0 <0.000020>
> 1487241315.076422 poll([{fd=3, events=POLLIN|POLLERR}], 1, 924) = 1 ([{fd=3, revents=POLLERR}]) <0.000025>
> 1487241315.076523 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000025>
> 1487241315.079770 gettimeofday({1487241315, 79799}, NULL) = 0 <0.000019>
> 1487241315.079855 poll([{fd=3, events=POLLIN|POLLERR}], 1, 921) = 1 ([{fd=3, revents=POLLERR}]) <0.000024>
> 1487241315.079956 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000021>
> 1487241315.080057 gettimeofday({1487241315, 80084}, NULL) = 0 <0.000020>
> 1487241315.080140 poll([{fd=3, events=POLLIN|POLLERR}], 1, 921) = 1 ([{fd=3, revents=POLLERR}]) <0.000024>
> 1487241315.080238 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000021>
> 1487241315.080322 gettimeofday({1487241315, 80350}, NULL) = 0 <0.000020>
> 1487241315.080406 poll([{fd=3, events=POLLIN|POLLERR}], 1, 920) = 1 ([{fd=3, revents=POLLERR}]) <0.000023>
> 1487241315.080502 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000019>
> 1487241315.080583 gettimeofday({1487241315, 80610}, NULL) = 0 <0.000018>
> 1487241315.080663 poll([{fd=3, events=POLLIN|POLLERR}], 1, 920) = 1 ([{fd=3, revents=POLLERR}]) <0.000024>
> 1487241315.080761 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
> 1487241315.080843 gettimeofday({1487241315, 80870}, NULL) = 0 <0.000020>
> 1487241315.080925 poll([{fd=3, events=POLLIN|POLLERR}], 1, 920) = 1 ([{fd=3, revents=POLLERR}]) <0.000037>
> 1487241315.081037 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
> 1487241315.081119 gettimeofday({1487241315, 81147}, NULL) = 0 <0.000020>
>

It's worth noting that ping is still otherwise functioning correctly, despite
the POLLERR:

1487242826.169502 sendmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("xx.xxx.xxx.xxx")}, msg_iov(1)=[{"\10\0\245G\23\373\243uJ\206\245X\0\0\0\0\352\225\2\0\0\0\0\0\20\21\22\23\24\25\26\27"..., 64}], msg_controllen=0, msg_flags=0}, MSG_CONFIRM) = 64 <0.000133>
1487242826.169757 recvmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("66.240.222.126")}, msg_iov(1)=[{"E \0T\345\364\0\0002\1w\23B\360\336~\n\0\0\23\0\0\255G\23\373\243uJ\206\245X"..., 192}], msg_controllen=32, {cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=0x1d /* SCM_??? */, ...}, msg_flags=0}, 0) = 84 <0.028825>
1487242826.198697 write(1, "64 bytes from xxxxxxx.com (xx.xx"..., 79) = 79 <0.000639>
1487242826.199405 gettimeofday({1487242826, 199430}, NULL) = 0 <0.000023>
1487242826.199486 poll([{fd=3, events=POLLIN|POLLERR}], 1, 970) = 1 ([{fd=3, revents=POLLERR}]) <0.000026>
1487242826.199578 recvmsg(3, 0x7ffc8e05e260, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable) <0.000024>

Surprisingly ping doesn't seem to be reacting to the POLLERR though it
requested it. Maybe this is just a ping bug? Though I haven't seen this
before rc8.

Regards,
Vito Caputo