Re: [PATCH V2 3/3] perf: Optimize sched_task() in a context switch
From: Namhyung Kim
Date: Wed Dec 02 2020 - 09:42:06 EST
Hi Peter and Kan,
On Tue, Dec 01, 2020 at 06:29:03PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 30, 2020 at 11:38:42AM -0800, kan.liang@xxxxxxxxxxxxxxx wrote:
> > From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> >
> > Some calls to sched_task() in a context switch can be avoided. For
> > example, large PEBS only requires flushing the buffer in context switch
> > out. The current code still invokes the sched_task() for large PEBS in
> > context switch in.
>
> I still hate this one, how's something like this then?
> Which I still don't really like.. but at least it's simpler.
>
> (completely untested, may contain spurious edits, might ICE the
> compiler and set your pets on fire if it doesn't)
I've tested Kan's v2 patches and they worked well. Will test your
version (with the fix in the other email) too.
>
> And given this is an optimization, can we actually measure it to improve
> matters?
I just checked the perf bench sched pipe results. Without perf record
running, it usually takes less than 7 seconds. Note that this (and the
results below) is the median of 10 runs.
# perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
Total time: 6.875 [sec]
6.875700 usecs/op
145439 ops/sec
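For anyone who wants to reproduce the "median of 10 runs" figure, the following is a minimal sketch, not the script actually used here. It assumes the benchmark prints a "Total time: N [sec]" line as in the output above; the helper names (parse_total_time, run_once, median_total_time) are hypothetical and only for illustration.

```python
import re
import statistics
import subprocess

# Matches the "Total time: 6.875 [sec]" line printed by the benchmark.
TOTAL_TIME_RE = re.compile(r"Total time:\s*([0-9]+(?:\.[0-9]+)?)\s*\[sec\]")


def parse_total_time(output: str) -> float:
    """Extract the total time in seconds from benchmark output text."""
    match = TOTAL_TIME_RE.search(output)
    if match is None:
        raise ValueError("no 'Total time' line found in output")
    return float(match.group(1))


def run_once(cmd: list[str]) -> float:
    """Run one benchmark command and return its total time (hypothetical helper)."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    # Search both streams, since the summary line may land on either one.
    return parse_total_time(proc.stdout + proc.stderr)


def median_total_time(cmd: list[str], runs: int = 10) -> float:
    """Repeat the command and return the median total time across all runs."""
    return statistics.median(run_once(cmd) for _ in range(runs))
```

It could be called as median_total_time(["perf", "bench", "sched", "pipe"]) for the baseline, or with the full perf record command line for the profiled case.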
Then I ran it again under perf record, as shown below. This is the
result with only patches 1 and 2 applied.
# perf record -aB -c 100001 -e cycles:pp perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
Total time: 8.198 [sec]
8.198952 usecs/op
121966 ops/sec
[ perf record: Woken up 10 times to write data ]
[ perf record: Captured and wrote 4.972 MB perf.data ]
With patch 3 applied as well, the total time went down a little, from
8.198 to 7.785 seconds (about 5%).
# perf record -aB -c 100001 -e cycles:pp perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
Total time: 7.785 [sec]
7.785119 usecs/op
128450 ops/sec
[ perf record: Woken up 12 times to write data ]
[ perf record: Captured and wrote 4.622 MB perf.data ]
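To put the three medians side by side, here is a small sketch of the arithmetic. It uses only the total times reported above; the names (overhead, relative_drop) are made up for illustration, and treating "time with perf record minus time without" as the profiling overhead is an assumption of this sketch.

```python
# Median total times in seconds, copied from the results above.
BASELINE_SEC = 6.875        # no perf record running
PATCHES_1_2_SEC = 8.198     # perf record, patches 1 and 2 only
PATCHES_1_2_3_SEC = 7.785   # perf record, patch 3 applied as well


def overhead(total: float, baseline: float = BASELINE_SEC) -> float:
    """Extra wall-clock time attributed to running under perf record."""
    return total - baseline


def relative_drop(before: float, after: float) -> float:
    """Fractional reduction going from 'before' to 'after'."""
    return (before - after) / before


# Reduction in total runtime under perf record.
runtime_drop = relative_drop(PATCHES_1_2_SEC, PATCHES_1_2_3_SEC)

# Reduction in the overhead on top of the unprofiled baseline.
overhead_drop = relative_drop(
    overhead(PATCHES_1_2_SEC), overhead(PATCHES_1_2_3_SEC)
)

print(f"runtime reduced by  {runtime_drop:.1%}")
print(f"overhead reduced by {overhead_drop:.1%}")
```

Measured this way, patch 3 trims the total time by roughly 5% and the overhead relative to the unprofiled run by roughly 31%.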
Thanks,
Namhyung