Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

From: Paul Jackson
Date: Thu Jun 29 2006 - 18:22:16 EST


Shailabh wrote:
> The idea of collecting stats for a group of cpus rather than all
> (or one) seems attractive. But cpusets doesnt :-)

Ok ... ;).

If you think yet another cpu grouping mechanism is needed, I'm not
the unbiased neutral part to say it's not needed.

However a static grouping does not seem to fit the actual usage
patterns that we see, at least on our (unusally large) Altix
systems.

At least in the usage we see, people run various sized, independent
jobs on a system, using cpusets to define the cpu and memory containers
holding those jobs. Much of what they do is naturally divided along
those job boundaries, so they want the ability to dynamically size
other resource management and tracking facilities along the same
boundaries.

One job might want to trace a data stream with no data loss, even if
it means slowing the job down. Another job might want to collect what
it can with limited collecting resources, and let the bits fall where
they will. A third job might want to increase the data collection
resources sufficiently to collect alot of data while not slowing the
job down. One job might have very high fork/exit rates, and another
very low.

If the collectors are grouped along natural job boundaries, there might
not be any need to combine multiple streams, hence no need for the
timestamps you mention. Cpusets are perhaps the best surrogate for
these boundaries. Cpusets are hierarchical, so it would be convenient
to have a single collector for a large group of jobs.

It may well be that you find cpusets unattactive for this use for
good reason. Or perhaps you find them unattractive here just out of
unfamilarity or misunderstanding. Before introducing yet another
grouping mechanism, we should have an explanation of why the current
mechanism(s) are unsuitable. Hopefully an explanation slightly more
elaborate than "doesn't seem attactive" ;).

(Hmmm ... I hope I don't end up regretting asking the question "why
do cpusets suck for this ...?")

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/