Re: [RFC 2/3] preempt_tracer: Disable IRQ while starting/stopping due to a preempt_counter change

From: Daniel Bristot de Oliveira
Date: Wed May 29 2019 - 09:55:03 EST


On 29/05/2019 12:20, Peter Zijlstra wrote:
> On Wed, May 29, 2019 at 11:40:34AM +0200, Daniel Bristot de Oliveira wrote:
>> On 29/05/2019 10:33, Peter Zijlstra wrote:
>>> On Tue, May 28, 2019 at 05:16:23PM +0200, Daniel Bristot de Oliveira wrote:
>>>> The preempt_disable/enable tracepoint only traces in the disable <-> enable
>>>> case, which is correct. But think about this case:
>>>>
>>>> ---------------------------- %< ------------------------------
>>>> THREAD IRQ
>>>> | |
>>>> preempt_disable() {
>>>> __preempt_count_add(1)
>>>> -------> smp_apic_timer_interrupt() {
>>>> preempt_disable()
>>>> do not trace (preempt count >= 1)
>>>> ....
>>>> preempt_enable()
>>>> do not trace (preempt count >= 1)
>>>> }
>>>> trace_preempt_disable();
>>>> }
>>>> ---------------------------- >% ------------------------------
>>>>
>>>> The tracepoint will be skipped.
>>>
>>> .... for the IRQ. But IRQs are not preemptible anyway, so what the
>>> problem?
>>
>>
>> right, they are.
>>
>> exposing my problem in a more specific way:
>>
>> To show in a model that an event always takes place with preemption disabled,
>> but not necessarily with IRQs disabled, it is worth having the preemption
>> disable events separated from IRQ disable ones.
>>
>> The main reason is that, although IRQs disabled postpone the execution of the
>> scheduler, it is more pessimistic, as it also delays IRQs. So the more precise
>> the model is, the less pessimistic the analysis will be.
>
> I'm not sure I follow, IRQs disabled fully implies !preemptible. I don't
> see how the model would be more pessimistic than reality if it were to
> use this knowledge.

Maybe I did not express myself well... and the example was not good either.

"IRQs disabled fully implies !preemptible" is a "too big" step. In modeling (or
mathematical reasoning?), a good practice is to break the properties into small
pieces, and then build more complex reasoning/implications using these "small
properties."

Taking "big steps" makes you prone to "misinterpretations", creating ambiguity.
Then, -RT people tend to be pessimistic, non-RT people optimistic, and so on...
and that is what models try to avoid.

For instance, explaining this using words is contradictory:

> Any !0 preempt_count(), which very much includes (Hard)IRQ and SoftIRQ
> counts, means non-preemptible.

One might argue that the preemption of a thread always takes place with
preempt_count() != 0, because __schedule() is always called with preemption
disabled, so the preemption takes place in a non-preemptible section.

A more elaborate example:

------------------ %< --------------------------
Thread A is running, and goes to sleep waiting for a timer...
schedule() {
  preempt_disable();
  __schedule() {
    smp_apic_timer_interrupt() {
      sched_wakeup (Thread A);
      sched_wakeup (Thread B: highest prio) {
        sched_set_need_resched();
      }
    }
    local_irq_disable()
    context switch to B, A leaves in state=R.
------------------ >% --------------------------

In this case, thread A suffered a "preemption" with "!0 preempt_count()".

The fact is, Linux does not fit neatly into the "well known terminology" of
academic papers, because much of that terminology is based on the assumption
that operations are atomic. But they are not, and Linux has some behaviors that
deserve new terminology/interpretation...

- WAIT! But you (Daniel) want to fake the atomicity between preempt_disable and
its tracepoint!

Yes, I do, but this is a very straightforward step/assumption: the atomicity is
between the real event and the tracepoint that notifies it. It is not between
two different events.
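
As a minimal userspace sketch of this assumption (a toy model, not kernel code;
every name below is hypothetical), the window between the counter change and
its tracepoint can be modeled as a point where an IRQ may land. With IRQs
masked across the window, the IRQ is delayed until the trace has been emitted,
so the trace stream never disagrees with the real counter:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; all names are hypothetical, not kernel APIs. */
struct cpu_model {
	int  preempt_count;    /* models the real preempt counter */
	bool irqs_off;         /* models local IRQ masking */
	bool traced_disabled;  /* what the trace stream currently says */
	int  mismatches;       /* IRQs that saw count != 0 with no trace yet */
};

/* Try to deliver an IRQ; returns false if it is masked (left pending). */
static bool model_irq(struct cpu_model *c)
{
	if (c->irqs_off)
		return false;
	if (c->preempt_count > 0 && !c->traced_disabled)
		c->mismatches++;
	return true;
}

static void model_preempt_disable(struct cpu_model *c, bool mask_irqs)
{
	bool saved = c->irqs_off;
	bool delivered;

	if (mask_irqs)
		c->irqs_off = true;

	c->preempt_count++;              /* the real event */
	delivered = model_irq(c);        /* worst case: IRQ arrives right here */
	if (c->preempt_count == 1)
		c->traced_disabled = true;   /* the tracepoint that notifies it */

	c->irqs_off = saved;
	if (!delivered)
		model_irq(c);                /* pending IRQ runs after the window */
}

static void model_preempt_enable(struct cpu_model *c)
{
	assert(c->preempt_count > 0);
	if (--c->preempt_count == 0)
		c->traced_disabled = false;
}

/* Run one disable/enable pair and report how many mismatches were seen. */
static int model_run(bool mask_irqs)
{
	struct cpu_model c = { 0 };

	model_preempt_disable(&c, mask_irqs);
	model_preempt_enable(&c);
	return c.mismatches;
}
```

Calling model_run(false) reports one mismatch (the IRQ ran between the counter
change and the trace), while model_run(true) reports none. The model only
covers the worst-case arrival point; it says nothing about the cost of masking
IRQs in the real code path.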

That is why it is worth letting the modeling rules clarify the behavior of the
system, without making non-obvious implications in the code part, so we can have
a model that fits the Linux actions/events better and avoids ambiguity.

[ note 1: the tracepoint is only enabled if CONFIG_PREEMPTIRQ_TRACEPOINTS=y
which is not enabled by default ]

[ note 2: I just saw that Steven replied while I was writing this email... I
will read his replies now... sorry for any repetition here ]

-- Daniel