Re: [PATCH] x86_64: Limit the number of processor bootup messages

From: Mike Travis
Date: Mon Nov 02 2009 - 15:32:19 EST




Ingo Molnar wrote:
* Mike Travis <travis@xxxxxxx> wrote:


Andi Kleen wrote:
Mike Travis wrote:
This set of patches limits the number of repetitious messages which contain
no additional information. Much of this information is obtainable from
/proc and /sys. Most of the messages are also sent to the kernel log
buffer as KERN_DEBUG messages, so the buffer can be used to examine more
closely any details specific to a processor.
What would be good is to put the information from the booting CPUs
into some buffer and print it visibly if there's a timeout detected on the BP.
What do you think of this idea.... Add a "mark kernel log buffer" function, and then if any message of KERN_NOTICE or above happens, it sends the marked info from the kernel log buffer to the console before the current message. Set the marker to '0' to clear.
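
A rough user-space sketch of the marker idea described above, for illustration only. Nothing here is an existing kernel interface: log_set_mark(), log_clear_mark(), log_msg() and the fixed-size buffer are hypothetical names and simplifications, and the sketch uses -1 rather than 0 for "no mark" so that index 0 stays usable. The numeric levels mirror the kernel's convention that a lower number is more severe.

```c
#include <assert.h>
#include <stdio.h>

#define LOG_MAX    64   /* hypothetical capacity, in messages */
#define MSG_LEN    128
#define LVL_ERR    3    /* lower number = more severe, as in the kernel */
#define LVL_NOTICE 5
#define LVL_DEBUG  7

struct log_entry { int level; char text[MSG_LEN]; };

static struct log_entry log_buf[LOG_MAX];
static int log_count;       /* entries stored so far */
static int log_mark = -1;   /* index of first marked entry, -1 = no mark */
static int console_lines;   /* lines actually sent to the "console" */

static void console_write(const char *s) { puts(s); console_lines++; }

/* Hypothetical: remember the current position in the log buffer. */
static void log_set_mark(void)   { log_mark = log_count; }
static void log_clear_mark(void) { log_mark = -1; }

static void log_msg(int level, const char *text)
{
    int end = log_count;    /* entries before this one */

    assert(text != NULL);
    if (log_count < LOG_MAX) {   /* simplification: drop when full */
        log_buf[log_count].level = level;
        snprintf(log_buf[log_count].text, MSG_LEN, "%s", text);
        log_count++;
    }
    if (level > LVL_NOTICE)
        return;             /* quiet message: stays in the buffer only */

    /* Important message: replay everything since the mark first. */
    if (log_mark >= 0) {
        for (int i = log_mark; i < end; i++)
            console_write(log_buf[i].text);
        log_clear_mark();
    }
    console_write(text);
}

/* Usage: two quiet messages, then an error that flushes them. */
static int demo(void)
{
    log_set_mark();
    log_msg(LVL_DEBUG, "CPU 1: calibrating delay loop");
    log_msg(LVL_DEBUG, "CPU 1: initializing APIC");
    log_msg(LVL_ERR,   "CPU 1: not responding");
    log_msg(LVL_DEBUG, "CPU 2: calibrating delay loop");
    return console_lines;
}
```

In demo(), the two debug lines reach the console only because the error arrives after the mark; the final debug line stays in the buffer.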

That's _way_ too complex really, for little benefit. (If there's a boot hang people will re-try anyway (and this time with a serial console attached or so), and they can add various boot options to increase verbosity - depending on which phase the bootup hung in.)

I'm ok with this, though generally speaking large server systems have
serial consoles attached, and save the output into admin logs. One
problem with just setting the loglevel high enough to output debug
messages is that you get literally hundreds of thousands of lines of
meaningless information. We waited over 8 hours for a system with 2k CPUs
to boot in debug mode, and it never made it all the way up.

My intention for the above was to attempt to print debug information
that pertains to the failure, and not everything else.


So please go with the simple solution I suggested days ago: print stuff on the boot CPU but after that only a single line per AP CPU.

Ingo

So you think printing 4096 lines provides meaningful additional
information? I would think we should at least compress it so that a
line is printed only when each new processor socket boots, and not for
each of the 16 threads it has.
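
As a minimal sketch of the per-socket compression suggested above (not taken from any patch in this thread): struct cpu_info, announce_cpus() and the message format are hypothetical, and the code assumes CPUs are brought up grouped by socket, so that a change in socket id marks the first thread of a new socket.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-CPU description; a real kernel would use topology data. */
struct cpu_info {
    int cpu;      /* logical CPU number */
    int socket;   /* physical package the CPU belongs to */
};

/*
 * Print one line when the first thread of a new socket boots and stay
 * silent for the remaining threads. Returns the number of lines printed.
 */
static int announce_cpus(const struct cpu_info *cpus, int n)
{
    int lines = 0;
    int last_socket = -1;   /* no socket announced yet */

    for (int i = 0; i < n; i++) {
        if (cpus[i].socket != last_socket) {
            printf("Booting socket %d (first CPU %d)\n",
                   cpus[i].socket, cpus[i].cpu);
            last_socket = cpus[i].socket;
            lines++;
        }
    }
    return lines;
}

/* Usage: 64 threads at 16 threads per socket. */
static int demo_sockets(void)
{
    struct cpu_info cpus[64];

    for (int i = 0; i < 64; i++) {
        cpus[i].cpu = i;
        cpus[i].socket = i / 16;
    }
    return announce_cpus(cpus, 64);
}
```

With 16 threads per socket the line count drops by a factor of 16, so the 64 threads in demo_sockets() produce 4 lines.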

I should have timing information soon for 512 cores/1024 threads and
printing a single line for each of those will significantly increase
the time it takes to boot.

Thanks,
Mike