Re: [PATCH try 5] CFS: Add hierarchical tree-based penalty.

From: Con Kolivas
Date: Tue Oct 12 2010 - 06:27:27 EST


On Tue, 12 Oct 2010 20:47:35 Ingo Molnar wrote:
> * William Pitcock <nenolod@xxxxxxxxxxxxxxxx> wrote:
> > Hi,
> >
> > ----- "Ingo Molnar" <mingo@xxxxxxx> wrote:
> > > * William Pitcock <nenolod@xxxxxxxxxxxxxxxx> wrote:
> > > > Inspired by the recent change to BFS by Con Kolivas, this patch causes
> > > > vruntime to be penalized based on parent depth from their root task
> > > > group.
> > > >
> > > > I have, for the moment, decided to make it a default feature since the
> > > > design of CFS ensures that broken applications depending on task
> > > > enqueue behaviour behaving traditionally will continue to work.
> > >
> > > Just curious, is this v5 submission a reply to Peter's earlier review of
> > > your v3 patch? If yes then please explicitly outline the changes you did
> > > so that Peter and others do not have to guess about the direction your
> > > work is taking.
> >
> > I just did that in the email I just sent. Simply put, I was talking
> > with Con a few weeks ago about the concept of having a maximum amount
> > of service for all threads belonging to a process. This did not work
> > out so well, so Con proposed penalizing based on fork depth, which
> > still allows us to maintain interactivity with make -j64 running in
> > the background.
> >
> > Actually, I lie: it works great for server scenarios where you have
> > some sysadmin also running azureus. Azureus gets penalized instead,
> > but other apps like audacious get penalized too.
>
> Thanks for the explanation!
>
> Ingo

It's a fun feature I've been playing with that was going to make it into the
next -ck, albeit disabled by default. Here's what the patch changelog was
going to say:

---
Make it possible to have interactivity and responsiveness at very high load
levels by having a hierarchical tree-based penalty. This is achieved by
making deadlines offset by the fork depth from init. This has a similar effect
to 'nice'ing loads that are fork-heavy (such as 'make'), and biases CPU and
latency towards threaded desktop applications.

When a new process is forked, its fork depth is inherited from its parent
across fork() and then is incremented by one. That fork_depth is then used
to cause a relative offset of its deadline. Threads keep the same fork_depth
as their parent process as these tend to belong to threaded desktop apps.
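
The inheritance rule described above can be modelled in a few lines. This is
only a rough sketch, not the patch itself: the Task class, the DEPTH_OFFSET
constant and the additive deadline() formula are all hypothetical, since the
changelog does not say how the offset is actually scaled.

```python
from dataclasses import dataclass

# Hypothetical scaling: how far each level of fork depth pushes a deadline
# back. The real patch's scaling is not given in the changelog.
DEPTH_OFFSET = 1


@dataclass
class Task:
    name: str
    fork_depth: int = 0  # init sits at depth 0

    def fork(self, name):
        """New process: inherit the parent's depth, then increment by one."""
        return Task(name, self.fork_depth + 1)

    def spawn_thread(self, name):
        """New thread: keep the same depth as the owning process."""
        return Task(name, self.fork_depth)


def deadline(task, now, base_slice):
    """Illustrative deadline: base deadline plus a depth-relative offset."""
    return now + base_slice + task.fork_depth * DEPTH_OFFSET


init = Task("init")
shell = init.fork("shell")
browser = shell.fork("browser")
browser_thread = browser.spawn_thread("browser-worker")
cc1 = shell.fork("make").fork("sh").fork("cc1")
```

In this toy model the browser and its worker thread both sit at depth 2,
while the compiler spawned via make sits at depth 4, so it always gets the
later deadline for the same base slice.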

Running the "browser benchmark" at
http://service.futuremark.com/peacekeeper/index.action on a dual core machine
shows the effect this patch has.

The benchmark runs a number of different browser based workloads, and gives
a score in points, where higher is better.

Running the benchmark under various different loads with the feature enabled/
disabled:

Load         Disabled    Enabled
None         2437        2437
make -j2     1642        2293
make -j24    208         2187
make -j42    failed      1626

As can be seen, on the dual core machine, a load of 2 (make -j2) makes the
benchmark run almost precisely 1/3 slower, as would be expected with BFS' fair
CPU distribution of 3 processes between 2 CPUs. Enabling this feature leaves
the benchmark almost unaffected at this load, and only once the load is more
than 20 times higher (make -j42) does it hinder the benchmark to the same
degree.
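
The "1/3 slower" figure can be checked with some quick arithmetic on the
scores from the table. The sketch below assumes (this is not stated in the
changelog) that the benchmark behaves like a single CPU-bound process and that
its score scales linearly with its share of CPU time.

```python
baseline = 2437          # score with no load
cpus = 2
runnable = 3             # the benchmark plus two make jobs

# Fair distribution: 3 runnable processes share 2 CPUs, so each gets 2/3.
fair_share = cpus / runnable
predicted = baseline * fair_share   # about 1624.67

observed_disabled = {"make -j2": 1642, "make -j24": 208}
observed_enabled = {"make -j2": 2293, "make -j24": 2187, "make -j42": 1626}


def retained(score):
    """Fraction of the unloaded score that survives under load."""
    return score / baseline


for load, score in observed_enabled.items():
    print(f"{load}: {retained(score):.1%} of baseline with the feature enabled")
```

Under those assumptions the predicted score at make -j2 with the feature
disabled is within about 1% of the measured 1642, and the enabled score at
make -j42 (1626) lands at roughly the same two-thirds of baseline.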

Other side effects of this patch are that it weakly partitions CPU entitlement
to different users, and provides some protection against fork bombs.

Note that this drastically affects CPU distribution. No assumption as to CPU
distribution should be made based on past behaviour. It can be difficult to
apportion a lot of CPU to a fork-heavy workload with this enabled, and the
effects of 'nice' are compounded.

Unlike other approaches to improving latency under load, such as smaller
timeslices, enabling this feature has no detrimental effect on throughput
under load.

This feature is disabled in this patch by default as it may lead to unexpected
changes in CPU distribution and there may be real world regressions.

There is a sysctl to enable/disable this feature in
/proc/sys/kernel/fork_depth_penalty
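
For scripting, the sysctl could be read and toggled roughly as below. This is
a sketch under assumptions: the file only exists on a kernel carrying this
patch, writing it needs root, and the 0 = disabled / 1 = enabled encoding is a
guess, since the changelog only says the sysctl enables or disables the
feature. The helper names are made up for illustration.

```python
from pathlib import Path

# Path taken from the changelog; present only on a patched kernel.
SYSCTL = Path("/proc/sys/kernel/fork_depth_penalty")


def read_fork_depth_penalty(path=SYSCTL):
    """Return the current value as an int, or None if the sysctl is absent."""
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None


def set_fork_depth_penalty(enabled, path=SYSCTL):
    """Write 1 or 0 to the sysctl (assumed encoding; needs root on /proc)."""
    path.write_text("1\n" if enabled else "0\n")


if __name__ == "__main__":
    value = read_fork_depth_penalty()
    if value is None:
        print("fork_depth_penalty sysctl not found (kernel not patched?)")
    else:
        print(f"fork_depth_penalty = {value}")
```

Both helpers take the path as a parameter so they can be tried against an
ordinary file before being pointed at /proc.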


--
-ck