Re: [GIT PULL v2] printk: Make it usable on nohz cpus

From: Frederic Weisbecker
Date: Sat Dec 08 2012 - 17:50:56 EST


2012/12/8 Ingo Molnar <mingo@xxxxxxxxxx>:
>
> * Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>
>> Ingo,
>>
>> Please pull the printk support in dynticks mode patches that can
>> be found at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git tags/printk-dynticks-for-mingo-v2
>>
>> for you to fetch changes up to 74876a98a87a115254b3a66a14b27320b7f0acaa:
>>
>> printk: Wake up klogd using irq_work (2012-11-18 01:01:49 +0100)
>>
>> It is based on v3.7-rc4.
>>
>> Changes since previous pull request include support for irq
>> work flush on CPU offlining and acks from Steve. The rest
>> hasn't changed except some comment fix.
>
> Since this changes kernel/printk.c it needs Linus's ack.
>
> I looked through the older submissions but found no good summary
> of these changes: it would be nice if you could write up a good
> high level description of these changes - why has printk based
> kernel message logging become problematic on nohz, what are the
> symptoms to users, and what are the solution alternatives you
> found and please justify the irq_work extension variant you
> picked.
>
> I more or less accept the fact that fixes are needed here, but
> the linecount appears a bit high. It's possibly unavoidable, but
> would be nice to have a discussion of it, as printk is something
> we really, really want to keep as simple as possible.

Sorry, I focused too much on the details, and the pull request does
indeed lack a high-level explanation.

printk() is problematic for the full dynticks implementation that we
are working on because it depends on the tick to stay periodic in
order to wake up the potential readers sleeping on the syslog()
syscall.

printk() can be called from almost anywhere, so when a new message is
written to the buffer while there are pending readers, printk() wakes
them up asynchronously instead of doing it in place, in order to avoid
some gory locking scenarios with the scheduler locks.

This asynchronous wake up is handled by a hook on the timer tick:
printk_tick(). It is called on every timer interrupt. So if we stop
the tick outside idle, and printk() is called while the CPU runs in
full dynticks mode and there are pending readers, those readers may
not be woken for a while. Hence the user may miss some important
message.
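
To make the failure mode concrete, here is a minimal userspace model of
the tick-driven wakeup. It is only a sketch, not the kernel code: apart
from printk_tick(), which it imitates, every name in it (struct
cpu_model, model_printk(), model_advance_time(), ...) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU; these are not kernel APIs. */
struct cpu_model {
    bool tick_running;    /* false once the CPU runs in full dynticks mode */
    bool wakeup_pending;  /* set by the printk model, consumed by the tick hook */
    int  readers_woken;   /* how many times the syslog() readers were woken */
};

/* Models printk(): store the message and only request a wakeup. */
static void model_printk(struct cpu_model *cpu)
{
    cpu->wakeup_pending = true;
}

/* Models the printk_tick() hook, called from the timer interrupt. */
static void model_printk_tick(struct cpu_model *cpu)
{
    if (cpu->wakeup_pending) {
        cpu->wakeup_pending = false;
        cpu->readers_woken++;
    }
}

/* Let `periods` tick periods elapse; the hook runs only if the tick does. */
static void model_advance_time(struct cpu_model *cpu, int periods)
{
    assert(periods >= 0);
    for (int i = 0; i < periods; i++) {
        if (cpu->tick_running)
            model_printk_tick(cpu);
    }
}
```

In this model, with tick_running set, a model_printk() followed by one
elapsed period wakes the readers. With it cleared, wakeup_pending stays
set no matter how many periods elapse.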

So if we want to enter full dynticks mode safely, I fear we don't
have much choice. We need to find an event-driven rather than a
periodic way to perform the asynchronous wake up. The irq work
subsystem is a very good fit for that because:

* it can typically raise self-IPIs (some archs haven't implemented
  that yet, but it will be a requirement for full dynticks mode)
* it's light, simple and standalone. This is what we want for an API
  used by printk()
* it's lockless, which is a requirement for printk at this stage.
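
As an illustration of the lockless point, here is a rough userspace
sketch of how a work item can be claimed with a single atomic exchange,
so that queueing it never takes a lock. It uses C11 atomics and is not
the kernel's irq_work implementation; struct work_model, work_queue(),
work_run() and the counters are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a deferred work item; not the kernel's struct. */
struct work_model {
    atomic_bool pending;                 /* true while queued and not yet run */
    void (*func)(struct work_model *);   /* callback, e.g. wake up the readers */
};

static atomic_int self_ipis_raised;      /* stands in for raising a self-IPI */
static int readers_woken;                /* bumped by the example callback */

/* Example callback: stands in for waking the syslog() readers. */
static void wake_readers(struct work_model *w)
{
    (void)w;
    readers_woken++;
}

/* Claim the work without a lock; only the first caller raises the IPI. */
static bool work_queue(struct work_model *w)
{
    if (atomic_exchange(&w->pending, true))
        return false;                    /* already pending, nothing to do */
    atomic_fetch_add(&self_ipis_raised, 1);
    return true;
}

/* Runs from the (modeled) interrupt: release the claim, then call back. */
static void work_run(struct work_model *w)
{
    assert(w->func);
    atomic_store(&w->pending, false);    /* allow requeueing from now on */
    w->func(w);
}
```

Calling work_queue() several times before work_run() counts a single
modeled self-IPI, which is the property that makes this kind of
interface safe to call from a path like printk().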

Also, the irq work subsystem has its own hook on the timer tick in
case the arch can't perform self-IPIs. So if the tick is not stopped
when printk() is called, we can even avoid the IPI. This may be
desirable in situations with lots of printk() calls in a short period
of time: it can avoid an IPI storm.
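
The trade-off can be sketched with a small userspace model: the wakeup
is always queued, but a self-IPI is only requested when no tick is
around to pick it up. This is an assumption-laden illustration rather
than the code in the pull request, and all names (struct lazy_cpu,
lazy_queue_wakeup(), ...) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; field and function names are illustrative. */
struct lazy_cpu {
    bool tick_stopped;   /* true in full dynticks mode */
    bool work_pending;   /* a reader wakeup is queued but has not run yet */
    int  ipis_raised;    /* self-IPIs requested so far */
    int  wakeups_run;    /* times the wakeup callback ran */
};

/* Queue the wakeup; raise a self-IPI only if no tick will pick it up. */
static void lazy_queue_wakeup(struct lazy_cpu *cpu)
{
    if (cpu->work_pending)
        return;                  /* already queued, nothing more to do */
    cpu->work_pending = true;
    if (cpu->tick_stopped)
        cpu->ipis_raised++;      /* stands in for raising a self-IPI */
}

/* Runs from the self-IPI handler or from the tick hook, whichever fires. */
static void lazy_run_pending(struct lazy_cpu *cpu)
{
    if (cpu->work_pending) {
        cpu->work_pending = false;
        cpu->wakeups_run++;
    }
}

/* Models a burst of `count` printk() calls with no interrupt in between. */
static void lazy_printk_burst(struct lazy_cpu *cpu, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++)
        lazy_queue_wakeup(cpu);
}
```

In this model a CPU with a running tick never requests an IPI, however
many messages are printed, and the next tick performs the single wakeup.
A CPU with the tick stopped requests one IPI per queued wakeup.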

This is also an immediate benefit for mainline, because we can now
remove the ad-hoc printk_tick() timer hook. One less function call and
per-cpu check in the timer interrupt is a win.

What do you guys think?