Re: [RFC/PATCH] cpuset: cpuset irq affinities

From: Paul Jackson
Date: Fri Feb 29 2008 - 16:54:24 EST


Peter wrote:
> IRQ are hardware notifiers and
> do all kinds of things depending on the hardware.

So some of these irqs might have very particular node affinity then,
right? If my thingamajig board is attached to node 72, then I might
want its interrupts going to a CPU on node 72, right?

In which case, putting that irq in my boot cpuset that only has nodes
0-3 would be harmful to the performance of my thingamajig board, right?

I suspect that you don't want a small 'boot' cpuset (usually a small
cpuset running legacy Unix stuff) holding the irqs, but rather a big
'system' cpuset, which has all nodes -except- the few dedicated to
hard real time or other isolated (there's that word again) purposes.

That way, most irqs can go to most CPUs, depending on their specific
needs.

Unfortunately, I don't think the cpuset hierarchy and conventions admit
of both a big 'system' cpuset (all but a few isolated nodes) and a small
overlapping 'boot' cpuset.

> We don't support mv on cgroups, right? To easily create
> common parents...

The only mv supported is simple rename, preserving parentage.

And if one could and did a tree reshaping mv near the top of the
hierarchy, it would confuse the heck out of existing uses and users.

> I might just be new-fangled, but I have a /cgroup mount.
> but I guess that's just different mount-point of cgroup, right?

All cgroups mount beneath /cgroup. For backwards compatibility,
one can also "mount -t cpuset cpuset /dev/cpuset", and just get
the cpuset interface, with a couple of legacy hooks to make it behave
just like good old fashioned cpusets, rather than new fangled cgroups.

> I'm fine with whatever, I saw a ',' in the bitmap stuff, not really sure
> how that ended up being a ' ' in the patch I send out... :-)

Yes - that's another commonly supported form. If that's a better
presentation, then you'd probably want to rework your code, to take
in and display the entire vector of irq numbers in one line, using
a comma-separated list of irqs and ranges of irqs.

For details, see bitmap_scnprintf(), bitmap_parse_user(),
bitmap_scnlistprintf() and bitmap_parselist(), in bitmap.c.
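
To make the list form concrete, here is a rough userspace sketch in
Python of what a "comma-separated list of irqs and ranges of irqs"
looks like going out and coming back in. It is only an illustration
of the text format (for example "0-3,5,7-9"), not the kernel code;
the names format_irq_list() and parse_irq_list() are made up for
this example.

```python
def format_irq_list(irqs):
    """Render irq numbers as a comma-separated list of numbers and ranges."""
    nums = sorted(set(irqs))
    parts = []
    i = 0
    while i < len(nums):
        start = end = nums[i]
        # Extend the current range while the next number is consecutive.
        while i + 1 < len(nums) and nums[i + 1] == end + 1:
            i += 1
            end = nums[i]
        parts.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(parts)


def parse_irq_list(text):
    """Parse a list such as "0-3,5,7-9" back into a set of irq numbers."""
    irqs = set()
    for chunk in text.strip().split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate an empty list or a stray comma
        if "-" in chunk:
            lo, hi = (int(part) for part in chunk.split("-", 1))
            if lo > hi:
                raise ValueError(f"bad range: {chunk!r}")
            irqs.update(range(lo, hi + 1))
        else:
            irqs.add(int(chunk))
    return irqs
```

With this form the whole vector goes in or out as one line, so a
single bad entry fails the whole write.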

Given that you don't have a pre-existing bitmap of irqs (that I know
of) and that you might have a distinct error code for each irq that you
try to attach to a different cpuset, I'm guessing you want to stick
with the single irq per write on input, single irq per line on output,
paradigm, similar to what the 'tasks' file uses for task pids.
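
As a rough sketch of what that paradigm looks like from user space,
something like the following Python would do. The file path is
whatever per-cpuset irq file the patch ends up providing; the helper
names attach_irqs() and read_irqs() are invented for this example,
and O_APPEND is used only so the sketch also behaves sensibly on an
ordinary file.

```python
import os


def attach_irqs(path, irqs):
    """Write one irq per write(2); return {irq: 0 on success, else errno}."""
    results = {}
    for irq in irqs:
        try:
            fd = os.open(path, os.O_WRONLY | os.O_APPEND)
            try:
                os.write(fd, f"{irq}\n".encode())
            finally:
                os.close(fd)
            results[irq] = 0
        except OSError as err:
            # Each irq gets its own error code, independent of the others.
            results[irq] = err.errno
    return results


def read_irqs(path):
    """Read the file back, expecting one irq number per line."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]
```

The point of the design is that a failure to move one irq is reported
for that irq alone, which a single-line list cannot do.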

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.940.382.4214