Re: call for more SD versus CFS comparisons (was: Re: [ck] Mainline plans)

From: Martin Steigerwald
Date: Tue Jun 12 2007 - 12:53:20 EST


On Tuesday, 12 June 2007, Miguel Figueiredo wrote:
> Hi all,
>
> some results based on massive_intr.c by Satoru, can be found at
> http://people.redhat.com/mingo/cfs-scheduler/tools/massive_intr.c

Hi Miguel, Ingo, Con!

I have been without internet access for a week. I have been testing
2.6.21.3 + sws2 2.2.10 with cfs-v15, as well as ck2, on both my T42 and my
Amarok machine, a T23 [1].

Subjectively I can't tell a difference anymore. But then I usually do not
play 3D games or use Beryl / Compiz (except for showing them off).

I have not yet tested with the NEED_RESCHED workaround in CFS disabled, as
Ingo asked me to in a private mail. This workaround fixes the audio
skipping problems on my Amarok machine. I will compile CFS v16 and test
with the workaround disabled.

I had at least one issue with CFS v15, but it is not easy to reproduce. I
usually test Amarok playback while building a kernel package (kernel
compile), opening a lot of KDE applications (mainly Konqueror) and moving
the Amarok window around like mad. Afterwards I close the windows of those
KDE applications again. For Konqueror I can use the menu entry "Close all
instances", since I have 20+ of them open. When I do this, all windows are
closed at once and I occasionally get some audio skips. AFAIR this did not
happen with SD in my tests.

One issue I had with SD was some audio skips *directly* after resume (with
suspend2). I did not see these with CFS v15.

Both are issues that are not easy for me to track down or to reproduce in
tests.

I am willing to run further tests on my Amarok machine. Since I can't
tell a subjective difference anymore, this will likely involve running
some benchmarks. The test would then be whether sound playback stays 100%
OK while the benchmark is running.

To make the results comparable, it would be nice if a defined test
procedure were carried out on several machines by several people. What
would be good test scenarios?

I am interested in desktop loads and would like to see scenarios like
these:

1) Generating high load on the X server.

2) Starting lots of applications and stopping them again quickly. Maybe
it is possible to simulate my wildly clicking around in the start bar,
starting KDE applications and stopping them again (to track down the CFS
issue above). A rough sketch of such a load generator follows after this
list.

3) Starting one or more computationally expensive tasks like kernel
compiles.

4) Putting load onto the VM layer / block layer. I am not sure about this
one. We are testing CPU schedulers, not IO schedulers, but different VM
scenarios are still important IMHO.
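
For point 2, something like the following rough sketch could serve as a
load generator. It is only an idea of mine: the choice of xclock as dummy
application, the child count and the delays are all made up.

/* appload.c - crude application start/stop load generator (sketch).
 * Repeatedly forks a batch of children that each exec some harmless
 * program, lets the batch run briefly, then kills it again - roughly
 * simulating wild clicking in the start bar.
 * Build: gcc -o appload appload.c
 */
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NCHILD      20  /* "applications" per round (made up) */
#define RUN_SECONDS  2  /* how long each batch may live (made up) */

int main(void)
{
    pid_t pids[NCHILD];
    int i;

    for (;;) {
        for (i = 0; i < NCHILD; i++) {
            pids[i] = fork();
            if (pids[i] == 0) {
                /* dummy "application"; a real test could exec
                 * konqueror or another KDE app instead */
                execlp("xclock", "xclock", (char *)NULL);
                _exit(1);               /* exec failed */
            }
        }
        sleep(RUN_SECONDS);
        for (i = 0; i < NCHILD; i++) {
            if (pids[i] > 0) {
                kill(pids[i], SIGTERM);
                waitpid(pids[i], NULL, 0);
            }
        }
    }
}

Running this alongside a kernel compile and Amarok playback should come
reasonably close to my manual clicking.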

These are especially interesting IMHO when combined. I tried to simulate
that with my manual point-and-click testing (kernel compile + starting
applications + moving the Amarok window like mad).

My main concerns during those tests are:

1) Is audio playback 100% okay? Instead of only hearing whether there are
glitches, is there possibly a way to *measure* them? This could be
extended to video playback, though there I would expect glitches at a
certain load... (well, at some point audio playback will eventually stop
too). One measurable proxy is sketched below.
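
One rough proxy I can think of (not a finished benchmark): audio skips
happen when the player misses its periodic buffer refill, so one can
measure how late a periodic task wakes up under load. A minimal sketch
follows; the 10 ms period and the iteration count are assumptions of
mine.

/* wakejitter.c - measure how late a periodic task wakes up (sketch).
 * An audio player must refill its buffer periodically; if the
 * scheduler wakes it too late, you hear a skip. This reports the
 * worst-case lateness over a number of 10 ms periods.
 * Build: gcc -o wakejitter wakejitter.c -lrt
 */
#include <stdio.h>
#include <time.h>

#define PERIOD_NS   10000000L   /* 10 ms, roughly an audio fragment */
#define ITERATIONS  10000

static long ns_diff(struct timespec a, struct timespec b)
{
    return (a.tv_sec - b.tv_sec) * 1000000000L + (a.tv_nsec - b.tv_nsec);
}

int main(void)
{
    struct timespec next, now;
    long late, max_late = 0;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (i = 0; i < ITERATIONS; i++) {
        next.tv_nsec += PERIOD_NS;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec++;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        clock_gettime(CLOCK_MONOTONIC, &now);
        late = ns_diff(now, next);      /* how late did we wake up? */
        if (late > max_late)
            max_late = late;
    }
    printf("worst wakeup lateness: %.3f ms\n", max_late / 1e6);
    return 0;
}

If the worst lateness stays well below the audio buffer length, playback
should survive; AFAIK tools like cyclictest do the same thing much more
thoroughly.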

2) Is the mouse pointer always responsive? (If there is one thing I do
not like, it is a frozen mouse pointer, which I saw a lot with earlier CFS
versions prior to the above workaround.) Maybe this can be measured too?

Is that what the massive_intr testing is about?

3) How is interactivity? How long does it take until an application
reacts to a mouse click... but then, how on earth do you measure that?
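
A full click-to-repaint measurement would need X instrumentation, but a
crude proxy that comes to my mind is how long a blocked process needs to
get the CPU again after an event arrives. A sketch with two processes and
a pipe (iteration count and pacing are made up by me):

/* wakeup.c - crude "reaction time" proxy (sketch).
 * The parent sends a timestamped byte through a pipe; the child is
 * blocked in read() like an application waiting for a mouse click,
 * and reports how long the wakeup took. Run it while the load
 * scenarios above are active.
 * Build: gcc -o wakeup wakeup.c -lrt
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    int pfd[2];
    struct timespec sent, got;
    int i;

    if (pipe(pfd) < 0) {
        perror("pipe");
        return 1;
    }
    if (fork() == 0) {                  /* child: the "application" */
        for (i = 0; i < 100; i++) {
            read(pfd[0], &sent, sizeof(sent));   /* wait for "click" */
            clock_gettime(CLOCK_MONOTONIC, &got);
            printf("reaction: %.3f ms\n",
                   ((got.tv_sec - sent.tv_sec) * 1e9 +
                    (got.tv_nsec - sent.tv_nsec)) / 1e6);
        }
        exit(0);
    }
    for (i = 0; i < 100; i++) {         /* parent: the "mouse" */
        sleep(1);                       /* one event per second */
        clock_gettime(CLOCK_MONOTONIC, &sent);
        write(pfd[1], &sent, sizeof(sent));
    }
    return 0;
}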

I guess there are other important criteria to look at - especially the
behavior on a server, for example: how does the scheduler behave on a web
server with many clients issuing differently sized requests?

Any suggestions?

Now, what would be a testing procedure that is affordable for testers and
as relevant to scheduler testing as possible? My problem with benchmarks
is that I first have to spend lots of time figuring out how they work and
how to produce a reasonable setup. If someone - preferably the Linux
kernel scheduler experts - invented a testing procedure that I could
follow step by step and then post the results of, it would be easier for
me to collect useful information for you kernel developers.

Without more extensive testing, and aside from the above-mentioned
difficult-to-reproduce artifacts, in my subjective perception SD (as in
2.6.21-ck2) and CFS v15 are on par now. So other criteria - code size,
design, maintainership, responsiveness to bug reports, further benchmarks
and real-world usage tests - would have to be used to decide which
scheduler should go into the kernel.

As I have read several times, SD is smaller in code size. OTOH, it seems
to me that Ingo Molnar has more time and capacity for maintainership, but
from what I have seen Con did his best here too and fixed all known
issues with SD.

[1] http://martin-steigerwald.de/amarok-machine/

Regards,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
