Re: IRQ affinities (was: boot cgroup questions)

From: Peter Zijlstra
Date: Fri May 09 2008 - 07:49:18 EST


On Fri, 2008-05-09 at 06:17 -0500, Paul Jackson wrote:
> Peter wrote:
> > That's a new feature; and its quite common that new features require
> > code changes.
>
> It's common for new features to require code changes to take advantage
> of the new features.
>
> It's less desirable that taking advantage of such new features breaks
> existing, basically unrelated, code.
>
> My gut sense is that, in a misguided effort to find a "simple" answer
> to irq distribution, we (well, y'all) are trying to attach this
> feature to cpusets or cgroups.
>
> Let me ask a different question:
>
> What solutions would you (Max, Peter, Ingo, lurkers, ...) be
> suggesting for this 'IRQ affinity' problem if cpusets and
> cgroups didn't exist in any form whatsoever?
>
> The answer to that question might help me contribute to this discussion
> in another way ... it might help me understand better what we're really
> trying to do here. You guys were proposing mechanisms that don't fit
> my architecture sense of cpusets, but I was having problems figuring out
> what are the essential underlying requirements, independent of choice
> of mechanism.
>
> Perhaps by describing one or two possible alternative, cpuset-free,
> mechanisms that come more or less close to meeting our needs, I will
> glean a better understanding of these elusive requirements, and can
> better contribute to the discussion of design trade offs facing us.
>
> So could you describe some possible cpuset-free solutions? If they are
> flawed in some critical way, that's ok, just point out said flaw(s).
> Either way, this could help illuminate what's needed here.
>
> It might be, once I better understand the requirements, possible
> solutions and their tradeoffs, that I come to agree that cpusets or
> cgroups present the best mechanism, given the tradeoffs and what's
> needed. Or it might be we find a better way to meet our needs.
>
> Actually, if for no other reason than to bring any lurkers up to speed,
> if you (Max or Peter, likely) wanted to describe, from the beginning,
> what this discussion is about, that would be good too. I doubt anyone
> outside of three or four of us even recalls that long discussion of
> February and March, 2008.


I see two use-cases:

- Isolation
- NUMA node devices

With isolation you want to move all of you 'normal' system tasks off to
side of your machine and use the other side for 'special - rt' tasks.

For IRQs this means that you want to move all the 'normal' IRQs along
with the 'normal' tasks, and move the special IRQs into the rt side.

Of course you can do this by setting IRQ affinities one by one, but
being able to group the IRQs seems a sensible thing to me.

One thing here is that we'd like to also provide a default group for new
IRQs, so that when a new device appears its not allowed into the
'special' side of your machine.

This is what Max focussed on, and provides a binary devision of your
machine: special and not special.


Now I was thinking that if we generalize this whole thing it might be
useful for other purposes such as IRQ placement near the nodes that host
the device and/or the application using them.


So what we'd end up with is named affinity groups that contain (unique)
IRQs.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/