Re: [PATCH v6 2/2] tracing: Add support for preempt and irq enable/disable events

From: Joel Fernandes
Date: Thu Sep 28 2017 - 22:15:39 EST


Hi Steven, Peter,

I'm planning to make the following changes for the next rev, could you
let me know if you're Ok with it?

1. Drop the stop_critical_timings changes - previous patch was
generating the preempt_enable/disable events but they aren't "real"
events. Instead since we already have cpuidle trace events, we can
just rely on those for now to understand how much time was spent in
idle. A future patch could do something smarter.

2. Drop the recursion protection from trace_preempt_enable/disable.
The trace_preempt_enable/disable calls don't nest, so there's no need
to protect it with a per-cpu variable.

3. trace_irq_enable/disable on the other hand are called in this way,
so I'll add some comments about why per-cpu variable is needed.

thanks a lot,

- Joel

On Mon, Sep 25, 2017 at 3:57 PM, Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> On Mon, Sep 25, 2017 at 3:52 AM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>> On Mon, 25 Sep 2017 12:32:23 +0200
>> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>
>>
>>> > You mean you want to trace all calls to preempt and irq off even if
>>> > preempt and irqs are already off?
>>>
>>> Sure, why not? This stuff naturally nests, and who is to say its not a
>>> useful thing to trace all of them?
>>>
>>> By also tracing the nested sections you can, for instance, see how much
>>> you'd really win by getting rid of the outer one. If, for instance, the
>>> outer only accounts for 1% of the time, while the inner ones are
>>> interlinked and span the other 99%, there's more work to do than if that
>>> were not the case.
>>
>> If we do this we need a field to record if the preemption or irqs were
>> toggled by that call. Something that filters could easily be added to
>> only show what this patch set has.
>
> I request that we please not do this for my patchset, there are a
> number of reasons in my mind:
>
> 1. trace_preempt_off in existing code is only called the first time
> preempt is turned off. This is the definition of "preempt off", its
> the first time Preemption is actually turned off, and has nothing much
> to do with going into a deeper preempt count. Whether the count
> increases or not, preempt is already off and that's confirmed by the
> first preempt off event.
>
> This is how I read it in the comments in sched/core.c as well:
> "
> * If the value passed in is equal to the current preempt count
> * then we just disabled preemption."
>
> This is how I based this patchset as well, againt its not my usecase
> and it can be a future patch if its useful to track this.
>
> 2. This stuff is already a bit trace heavy, I prefer not to generate
> event every time the preempt count increases. Ofcourse filters but
> still then we have the filtering overhead for a usecase that I am not
> really targetting with this patchset.
>
> 3. It will complicate the patch more, apart from adding filters as
> Steven suggested, it would also mean we change how
> preempt_latency_start in sched/core.c works.
>
> Do you mind if we please keep this as a 'future' usecase for a future
> patch? Its not my usecase at all for this patchset and not what I was
> intending.
>
> I will reply to Peter's other comments on the other email shortly.
>
> thanks!
>
> - Joel