Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

From: Jay Lan
Date: Thu Jun 29 2006 - 17:54:27 EST


Shailabh Nagar wrote:
Paul Jackson wrote:

Shailabh wrote:


I suppose this is because cpusets offer some middle ground between collecting data per-cpu vs. collecting it for all cpus?


Yes - well said. And I have this strange tendency to see all the
world's problems as opportunities for cpuset solutions <grin>.



What happens when someone is using cpusets on such a machine and
changes its membership in response to other needs? All taskstats
users would need to monitor for such changes and adjust their
processing... seems like an unnecessary tying up of two unrelated
concepts.


I would not expect taskstat users to monitor for such changes.
I'd expect them to monitor the stats from whatever is in the
cpuset they named. If a task moves out of that cpuset to another,
then tough -- that task will no longer be monitored by that
particular monitoring request.

Cpusets do provide a convenient middle ground, as you say, which
is really useful for reducing scaling issues such as this one to
a manageable size.

Per-cpu is too fine grained, and per-system too coarse.

An unnecessary tying - yes. But perhaps a useful one.


The idea of collecting stats for a group of cpus rather than all (or one) seems attractive.
But cpusets doesn't :-)

How about if we did something simple like
having a separate listen group (within genetlink) for a reasonably large number of cpus
and have all the messages from those cpus multicast to the listeners of that group alone?

e.g. currently we have only one TASKSTATS_LISTEN_GROUP
we could reserve the following
TASKSTATS_LISTEN_GROUP_0
TASKSTATS_LISTEN_GROUP_1....

where GROUP_0 handles cpus numbered 0-63 (or 0-31), and so on.
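
As a rough illustration of the grouping described above, the cpu-to-group mapping could be a plain integer division. This is only a sketch: CPUS_PER_LISTEN_GROUP, listen_group_for_cpu() and nr_listen_groups() are hypothetical names that do not come from any posted patch, and the group size of 64 is just one of the two sizes floated here.

```c
#include <assert.h>

/* Hypothetical group size: the proposal mentions 64 (or 32) cpus per group. */
#define CPUS_PER_LISTEN_GROUP 64

/* Map a cpu number to the index N of a TASKSTATS_LISTEN_GROUP_N. */
static inline unsigned int listen_group_for_cpu(unsigned int cpu)
{
	return cpu / CPUS_PER_LISTEN_GROUP;
}

/* Number of listen groups needed to cover nr_cpus cpus, rounding up. */
static inline unsigned int nr_listen_groups(unsigned int nr_cpus)
{
	assert(nr_cpus > 0);
	return (nr_cpus + CPUS_PER_LISTEN_GROUP - 1) / CPUS_PER_LISTEN_GROUP;
}
```

With this size, a machine with up to 64 cpus still has a single group, so existing listeners would see no change; a 512-cpu machine would have 8 groups.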

Advantages would be

1. Most users would still need to listen to only the one group, as they do
in the current design, and others could listen to more, scaling up their
userspace listening daemons as appropriate (e.g. one daemon per listening group).

2. Userspace could be saved the bother of handling too many streams of
per-cpu data and reassembling them in the order they were generated.

The moment we talk of splitting up the data stream generated by the kernel I suppose we have to do some
kind of timestamping so reassembly in the same order can be done. I can't see this mattering for the likes of
delay accounting and CSA but for future taskstats users, who knows.

Timestamp of the taskstats messages or timestamp of the exiting task?
I include an exit_time field for the task as part of "Common
Accounting Fields" in my csa_taskstats patch I sent to you. So, we
have both start_time and exit_time.
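
If the exit_time of the task turns out to be enough as the ordering key, a userspace listener that collects records from several listen groups could restore exit order with a simple sort. The sketch below assumes that; struct exit_record and order_by_exit_time() are hypothetical and are not the layout used in the csa_taskstats patch. Only the exit_time field name is taken from the discussion, and its unit is left open.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace record; only exit_time comes from the discussion. */
struct exit_record {
	uint64_t exit_time;	/* when the task exited (unit is an assumption) */
	int group;		/* listen group the message arrived on */
	int pid;
};

/* qsort() comparator: ascending exit_time. */
static int cmp_exit_time(const void *a, const void *b)
{
	const struct exit_record *ra = a;
	const struct exit_record *rb = b;

	if (ra->exit_time < rb->exit_time)
		return -1;
	if (ra->exit_time > rb->exit_time)
		return 1;
	return 0;
}

/* Put records gathered from several listen groups into exit order. */
static inline void order_by_exit_time(struct exit_record *recs, size_t n)
{
	assert(recs != NULL || n == 0);
	qsort(recs, n, sizeof(*recs), cmp_exit_time);
}
```

qsort() is not stable, so two tasks with the same exit_time may come out in either order. If that matters, a timestamp on the message itself, or a per-group sequence number, would still be needed.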

Thanks,
- jay



--Shailabh


