Re: getting processor numbers

From: Paul Jackson
Date: Tue Apr 03 2007 - 22:05:25 EST


Andrew wrote:
> I'd have thought that in general an application should be querying its
> present affinity mask - something like sched_getaffinity()? That fixes the
> CPU hotplug issues too, of course.

The sched_getaffinity call is quick, too, and it nicely reflects any
cpuset constraints, while still working on kernels which don't have
CPUSETs configured.

There are really at least four "number of CPUs" answers here, and we
should be aware of which we are providing. There are, in order of
decreasing size:
1) the size of the kernels cpumask_t (NR_CPUS),
2) the maximum number of CPUs that might ever be hotplugged into a
booted system,
3) the current number of CPUs online in that system, and
4) the number of CPUs that the current task is allowed to use.

I would suggest that (4) is what we should typically return.
Certainly it would seem that the use that Ulrich is concerned with,
by OpenMP, wants (4).

Currently, the sysconf(_SC_NPROCESSORS_CONF) returns (3), by counting
the CPUs in /proc/stat, which is rather bogus on cpuset, or even
sched_setaffinity, constrained systems.

> But we discussed this all a couple years back and it was decided that
> sched_getaffinity() was unsuitable. I remember at the time not really
> understanding why?

Perhaps it was because a robust invocation of sched_getaffinity takes
a page of code to write, as Andi Kleen noticed in his libnuma coding of
"static int number_of_cpus(void)". One has to size the mask passed in,
in case the kernel was compiled with a larger cpumask_t size than you
guessed up front. In other words, to get (4) using sched_getaffinity,
one first needs an upper bound on (1), above, the kernels configured
NR_CPUS.

One can either size it by repeatedly invoking sched_getaffinity with
larger masks until it stops failing EINVAL, or one can examine the
length of the "Cpus_allowed" mask displayed in /proc/self/status
(it takes 9 ascii chars, if you include the commas and trailing
newline, to display each 32 bits of cpumask_t.) There may be other
ways as well; those seem to be the most common.

At least the kernel cpumask_t size can be cached for the life of the
process and any descendents, so the cost of obtaining it should be less
critical.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/