Re: [PATCH v3 01/15] x86/dumpstack: Optimize save_stack_trace

From: Byungchul Park
Date: Tue Sep 13 2016 - 10:54:25 EST


On Tue, Sep 13, 2016 at 10:18 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> On Tue, Sep 13, 2016 at 06:45:00PM +0900, Byungchul Park wrote:
>> Currently, the x86 implementation of save_stack_trace() walks the entire
>> stack region word by word regardless of trace->max_entries.
>> However, there is no need to keep walking once the caller's requirement
>> is met, that is, once trace->nr_entries >= trace->max_entries.
>>
>> I measured the overhead as the difference in sched_clock() on my QEMU
>> x86 machine. Latency improved by over 70% when
>> trace->max_entries = 5.
>
> This code will (probably) be obsoleted soon with my new unwinder.

Hello,

You are right.

I also think this will probably be obsoleted by your unwinder.
So I didn't modify any details of the patch.
I will take your comment into account if it becomes necessary.

Anyway, crossrelease needs this patch to work smoothly.
That is the only reason I included this patch in the thread.

Thank you,
Byungchul

> Also, my previous comment was ignored:
>
> Instead of adding a new callback, why not just check the ops->address()
> return value? It already returns an error if the array is full.
>
> I think that would be cleaner and would help prevent more callback
> sprawl.
>
> --
> Josh
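
For reference, the suggestion quoted above (stop the walk when the address
callback reports that the array is full, rather than adding a new callback)
could look roughly like the user-space model below. This is only a sketch:
struct trace_model, model_address() and model_walk() are hypothetical names
invented for illustration, and treating a zero word as "not a text address"
is an assumption, not how the kernel walker decides.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a stack trace buffer (illustrative only). */
struct trace_model {
	unsigned int nr_entries;
	unsigned int max_entries;
	unsigned long *entries;
};

/* Models an address() callback: store one entry, fail once the array is full. */
static int model_address(struct trace_model *trace, unsigned long addr)
{
	assert(trace->entries != NULL);

	if (trace->nr_entries >= trace->max_entries)
		return -1;	/* array full: tell the walker to stop */

	trace->entries[trace->nr_entries++] = addr;
	return 0;
}

/*
 * Models the walker: visit the stack word by word, but stop as soon as
 * the callback returns an error. Returns the number of words visited.
 */
static size_t model_walk(const unsigned long *stack, size_t nwords,
			 struct trace_model *trace)
{
	size_t visited = 0;

	for (size_t i = 0; i < nwords; i++) {
		visited++;
		if (stack[i] == 0)	/* assumption: 0 is not a candidate */
			continue;
		if (model_address(trace, stack[i]))
			break;		/* no separate "done" callback needed */
	}
	return visited;
}
```

In this model the walk ends at the first candidate word seen after the array
has filled up, so the remaining words are never visited.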



--
Thanks,
Byungchul