Re: [PATCH 1/2] panic: ratelimit panic messages

From: Andrew Morton
Date: Wed Jan 05 2011 - 21:18:41 EST

On Wed, 5 Jan 2011 21:05:12 -0500 Don Zickus <dzickus@xxxxxxxxxx> wrote:

> > We keep on hacking away at this and things never seem to get much
> > better. It's still the case that a large number of our oops reports
> > are damaged because the important parts of the oops trace scrolled off
> > the screen.
> >
> > I therefore propose
> >
> > oops_lines_delay=N,M
> >
> > which will cause the kernel to pause for M milliseconds after emitting
> > N lines of oops output. Bonus marks for handling linewrap!
> >
> > Start the line counter at oops_begin() or thereabouts and then do the
> > delay after N lines have been emitted. I guess that counter should
> > _not_ be invalidated in oops_end(): if the oops generates 12 lines and
> > then another 100 lines of random printk crap are printed, we still want
> > to put a pause after the 13th line of that random crap, so we can view
> > the oops.
> >
> > The oops_lines_delay implementation should count lines from all CPUs
> > and should block all CPUs during the delay.
> >
> > I think this would solve the problem which you're seeing, as well as
> > the much larger my-oops-scrolled-off problem?
>
> Ok. Forgive me for being thick. I seem to be lost in the lower layer of
> the oops code for some reason. I understand your idea and am willing to
> take a crack at implementing it,

ooh, goody. I'd finally have an answer to all those useless cellphone
jpegs of scrolled-off oops traces.

> I just can't figure out what function to
> stick it in. I grep'd for oops_begin() and it seemed to be an x86-only
> thing. Is there a more generic place to put this stuff?

Sorry, braino. oops_enter() and oops_exit(). All architectures
_should_ be calling oops_enter() right at the start of the oops
handling and oops_exit() right at the end. They're suitable sites for
adding new stuff.
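
For illustration only, here is a rough userspace sketch of the counting logic behind oops_lines_delay=N,M. It is not kernel code and not a proposed patch. Every name in it (struct oops_delay, oops_delay_parse(), oops_delay_enter(), oops_delay_line(), busy_wait_ms()) is made up for the example. It assumes the counter is armed from something like oops_enter(), is deliberately left running after oops_exit(), and restarts after each pause so that the pause repeats every N lines. The repeat is a guess, since the proposal doesn't say. The busy-wait stands in for what mdelay() would do in the kernel. Cross-CPU counting, blocking the other CPUs, and linewrap are not modelled.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical state for an oops_lines_delay=N,M option. */
struct oops_delay {
	unsigned int lines_per_pause;	/* N: 0 disables the feature */
	unsigned int pause_ms;		/* M: pause length in milliseconds */
	unsigned int line_count;	/* lines seen since arming or last pause */
	unsigned int pauses;		/* pauses taken so far, for inspection */
	int armed;			/* set on oops entry, never cleared on exit */
};

/* Parse "N,M". Returns 0 on success, -1 on malformed input. */
static int oops_delay_parse(struct oops_delay *d, const char *arg)
{
	unsigned int n, m;
	char trailing;

	/* Exactly two numbers must match; a third conversion means junk. */
	if (sscanf(arg, "%u,%u%c", &n, &m, &trailing) != 2)
		return -1;
	d->lines_per_pause = n;
	d->pause_ms = m;
	return 0;
}

/* Userspace stand-in for a busy-wait delay such as mdelay(). */
static void busy_wait_ms(unsigned int ms)
{
	clock_t start = clock();
	clock_t ticks = (clock_t)((double)ms * CLOCKS_PER_SEC / 1000.0);

	while (clock() - start < ticks)
		;
}

/* Would be called from oops_enter(). A nested oops does not restart the count. */
static void oops_delay_enter(struct oops_delay *d)
{
	if (!d->armed) {
		d->armed = 1;
		d->line_count = 0;
	}
}

/*
 * Would be called once per emitted line. There is intentionally no
 * oops_exit() hook: lines printed after the oops still count, so a
 * 12-line oops followed by unrelated output pauses on the Nth line overall.
 * Returns 1 if this line triggered a pause, 0 otherwise.
 */
static int oops_delay_line(struct oops_delay *d)
{
	if (!d->armed || d->lines_per_pause == 0)
		return 0;
	if (++d->line_count < d->lines_per_pause)
		return 0;
	d->line_count = 0;	/* assumption: pause again every N lines */
	d->pauses++;
	busy_wait_ms(d->pause_ms);
	return 1;
}
```

With oops_lines_delay=25,1000, a 12-line oops followed by 13 lines of other printk output would pause for one second on that 13th line, which is the behaviour described above.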

