Re: [PATCH 0/4] Finer granularity and task/cgroup irq time accounting

From: Peter Zijlstra
Date: Tue Aug 24 2010 - 08:48:08 EST


On Tue, 2010-08-24 at 17:08 +0530, Balbir Singh wrote:
> * Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2010-08-24 11:09:13]:

> > The whole attribution mess can only be solved by actually splitting out
> > the entries that do work, like per-cgroup workqueue threads and similar
> > things.
> >
> > System wide entities like IRQs are very hard to attribute correctly like
> > Martin already argued, and I don't think its worth doing.
>
> I see Martin's view point, is the suggestion then that we amortize
> these costs across all tasks?

I'm still not sure what you want them for, but if it's for wanting to
know wth the system is up to, simply account them on their own, and do
not include them in any task stats.

That is, keep the existing hi/si interface and improve upon that, but
also subtract those times from the task execution times.

That way, if a cpu is like 80% hogged by IRQ action, you'll not see a
100% busy task, but only a 20% busy one.
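
To make the arithmetic concrete, here is a minimal user-space sketch,
not kernel code. The struct, its fields and both helpers are
hypothetical names made up for illustration; it only assumes that
hardirq and softirq time are tracked per cpu over some sampling window
and subtracted from what the running task is charged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cpu counters for one sampling window, in nanoseconds. */
struct cpu_sample {
	uint64_t wall_ns;    /* length of the sampling window */
	uint64_t hardirq_ns; /* time spent in hardirq context */
	uint64_t softirq_ns; /* time spent in softirq context */
};

/* Time charged to the task: the window minus all IRQ time. */
static uint64_t task_exec_ns(const struct cpu_sample *s)
{
	uint64_t irq_ns = s->hardirq_ns + s->softirq_ns;

	/* IRQ time is measured inside the window, so it cannot exceed it. */
	assert(irq_ns <= s->wall_ns);
	return s->wall_ns - irq_ns;
}

/* Share of the window in whole percent, rounded down. */
static unsigned int pct(uint64_t part_ns, uint64_t wall_ns)
{
	return wall_ns ? (unsigned int)(part_ns * 100 / wall_ns) : 0;
}
```

With a 1s window, 700ms of hardirq and 100ms of softirq time, the task
is charged 200ms, so pct() reports 20 for the task and 80 for IRQs,
instead of the task showing up as 100% busy.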

At that point you can also feed the IRQ time back into
sched_rt_avg_update() (which strictly speaking isn't rt but !fair), and
the load-balancer will automagically try and move tasks away from that
cpu.
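
Roughly, the idea is that time eaten by !fair activity shrinks the
capacity the load-balancer thinks a cpu has left for fair tasks. The
sketch below is a user-space illustration of that scaling only; it is
not the scheduler's actual code, and CAPACITY_SCALE and
cpu_capacity_for_fair() are hypothetical names chosen for the example.

```c
#include <stdint.h>

/* Assumed fixed-point unit representing a fully available cpu. */
#define CAPACITY_SCALE 1024u

/*
 * Scale a cpu's capacity by the fraction of the averaging period that
 * was not consumed by non-fair work (rt tasks, and here also IRQ time).
 */
static uint32_t cpu_capacity_for_fair(uint64_t period_ns, uint64_t nonfair_ns)
{
	uint64_t avail_ns, cap;

	if (period_ns == 0)
		return CAPACITY_SCALE;

	/* Clamp so that an overshooting average cannot wrap around. */
	avail_ns = nonfair_ns < period_ns ? period_ns - nonfair_ns : 0;
	cap = avail_ns * CAPACITY_SCALE / period_ns;

	/* Never report zero, so callers can divide by the result. */
	return cap ? (uint32_t)cap : 1;
}
```

A cpu that spends 80% of the period in IRQs ends up with a capacity of
about a fifth of CAPACITY_SCALE, so a balancer comparing load against
capacity would prefer to place tasks elsewhere.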

If you really want to account (and possibly control) all the work
belonging to a particular group you'll have to make sure work does
indeed stay within the group -- which is where per-cgroup workqueue
threads and per-cgroup softirq threads etc. come into play.

Lumping all work together and then trying to extract something again is
silly.

And hardirq time really is system time, not cgroup or task time.