Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
From: Pavel Machek
Date: Thu May 19 2016 - 03:08:15 EST
On Thu 2016-05-19 08:27:37, Ming Lei wrote:
> On Thu, May 19, 2016 at 4:24 AM, Scot Doyle <lkml14@xxxxxxxxxxxxx> wrote:
> > On Wed, 18 May 2016, Ming Lei wrote:
> >> On Wed, May 18, 2016 at 4:49 AM, Pavel Machek <pavel@xxxxxx> wrote:
> >> > On Tue 2016-05-17 11:41:04, David Daney wrote:
> >> >> From: David Daney <david.daney@xxxxxxxxxx>
> >> >>
> >> >> We are getting somewhat random soft lockups with this signature:
> >> >>
> >> >> [ 86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
> >> >> [ 86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
> >> >> [ 87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
> >> >> [ 87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
> >> >> [ 87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
> >> >> [ 87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
> >> >> [ 87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
> >> >> [ 87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
> >> >>
> >> >> This is caused by the vt visual_init() function calling into
> >> >> fbcon_init() with a vc_cur_blink_ms value of zero. This is a
> >> >> transient condition, as it is later set to a non-zero value. But, if
> >> >> the timer happens to expire while the blink rate is zero, it goes into
> >> >> an endless loop, and we get soft lockup.
> >> >>
> >> >> The fix is to initialize vc_cur_blink_ms before calling the con_init()
> >> >> function.
> >> >>
> >> >> Signed-off-by: David Daney <david.daney@xxxxxxxxxx>
> >> >> Cc: stable@xxxxxxxxxxxxxxx
> >> >
> >> > Acked-by: Pavel Machek <pavel@xxxxxx>
> >>
> >> Tested-by: Ming Lei <ming.lei@xxxxxxxxxxxxx>
> >>
> >> Thanks David and Pavel for making it work!
> >>
> >> >
> >> > (And it is amazing how many problems configurable blink speed caused).
> >> >
> >> > Thanks!
> >> > Pavel
> >> >
> >
> >
> > Dann, Ming and David, thank you so much for all of your effort.
> >
> > There were three other reports in the past year, each leading to their own
> > patch, of boot lockups occurring when the cursor flash timer was set using
> > an ops->cur_blink_jiffies value of 0. I plan to propose a patch within
> > the next day that will prevent this for all code paths.
>
> Given this issue makes the system unusable, I suggest merging David's
> one-line patch first, then you can think and try to figure out a 'perfect' solution
> for addressing all the reports of this kind from the last year.
Actually, I'd merge
[PATCH] fbcon: use default if cursor blink interval is not valid
first. That one is obviously safe. Nice big overkill, but safe. Then
a nicer solution can be attempted...
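
To illustrate why a fallback is the safe option, here is a rough
user-space model of the failure, not the actual fbcon code. It assumes
the handler re-arms itself at now + interval and that the softirq keeps
running timers that are already due, so a zero interval never lets the
pass finish. All names (model_timer, valid_interval, run_due) and the
fallback value are hypothetical, and jiffies wraparound is ignored.

```c
#include <stdbool.h>

/* Assumed fallback value; the real default lives in the vt/fbcon code. */
#define FALLBACK_BLINK_JIFFIES 50UL

/* Hypothetical stand-in for the cursor timer. */
struct model_timer {
	unsigned long expires;
	unsigned long interval;	/* plays the role of ops->cur_blink_jiffies */
};

/* Use the fallback when the interval is not valid (zero). */
static unsigned long valid_interval(unsigned long interval)
{
	return interval ? interval : FALLBACK_BLINK_JIFFIES;
}

/* Model of the handler: the blink work is omitted, only re-arming is kept. */
static void model_handler(struct model_timer *t, unsigned long now,
			  bool guarded)
{
	unsigned long step = guarded ? valid_interval(t->interval)
				     : t->interval;

	t->expires = now + step;
}

/*
 * Model of one softirq pass at time 'now': run the timer while it is due.
 * Returns the number of handler calls, capped at 'limit' so that the model
 * terminates. Hitting the cap corresponds to the soft lockup.
 */
static unsigned int run_due(struct model_timer *t, unsigned long now,
			    bool guarded, unsigned int limit)
{
	unsigned int calls = 0;

	while (t->expires <= now && calls < limit) {
		model_handler(t, now, guarded);
		calls++;
	}
	return calls;
}
```

With interval == 0 the unguarded variant runs until the cap, while the
guarded one fires once and moves the expiry into the future. The check
is redundant once every caller sets a sane value first, but it covers
all code paths at once.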
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html