Re: BUG: rcu_sched detected stalls on CPUs

From: mitko
Date: Sat Jan 21 2017 - 02:20:10 EST


On 2017-01-20 17:19, Steven Rostedt wrote:
On Fri, 20 Jan 2017 10:43:50 +0200
mitko@xxxxxxxxxx wrote:

[1.] One line summary of the problem:

rcu_sched detected stalls on CPUs, and the server does not respond for a few minutes.

Is this reproducible? Or was this a one time ordeal?

It usually happens once per day, and I can't reproduce it on demand. If the server's load average is more than
0.70, it happens twice per day.



[2.] Full description of the problem/report:

The load on my server (postgres database) isn't high, less than 0.50, and when
the error "rcu_sched detected stalls on CPUs" occurs,
the server freezes and nothing works for 3-5 minutes.
No network, no video signal, no keyboard, no mouse. Nothing works.
After those few minutes everything continues normally.
This usually happens once per day. When I searched on Google, I found a lot
of people complaining about this error, but no solution.
Does anyone know how to help me resolve it? I spoke with a few friends, and
they are trying to convince me the problem is in the CPU.
I do not believe that a CPU would suddenly stop working correctly after
3 years of service, but I might be wrong.

[3.] Keywords (i.e., modules, networking, kernel):

kernel

[4.] Kernel information
[4.1.] Kernel version (from /proc/version):

Linux version 4.4.38 (root@hive64) (gcc version 5.4.0 (GCC) ) #2 SMP Sun Dec 11 16:11:02 CST 2016


Have you tried a newer version of the kernel?

No, not for the moment, because I can't find a newer one in the Slackware repos.
I last compiled a kernel almost 7 years ago and do not feel confident doing it now; I guess a lot of things have changed.



-- Steve

Regards,
Mitko