Re: [RFC/PATCH] cpuset: cpuset irq affinities
From: Peter Zijlstra
Date: Fri Feb 29 2008 - 16:04:05 EST
On Fri, 2008-02-29 at 12:52 -0800, Max Krasnyanskiy wrote:
> Ingo Molnar wrote:
> > * Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> >
> >> @@ -174,11 +174,20 @@ struct irq_desc {
> >> #ifdef CONFIG_PROC_FS
> >> struct proc_dir_entry *dir;
> >> #endif
> >> +#ifdef CONFIG_CPUSETS
> >> + struct cpuset *cs;
> >> +#endif
> >
> > i like this approach - it makes irqs more resource-alike and attaches
> > them to a specific resource control group.
> >
> > So if /cgroup/boot is changed to have less CPUs then the "default" irqs
> > move along with it.
> >
> > but if an isolated RT domain has specific irqs attached to it (say the
> > IRQ of some high-speed data capture device), then the irqs would move
> > together with that domain.
> >
> > irqs are no longer a bolted-upon concept, but more explicitly managed.
> >
> > [ If you boot-test it and if Paul agrees with the general approach then
> > i could even apply it to sched-devel.git ;-) ]
>
> Believe it or not but I like it too :).
> Now we're talking different approach compared to the cpu_isolated_map since
> with this patch cpu_system_map is no longer needed.
> I've been playing with latest sched-devel tree and while I think we'll endup
> adding a lot more code, doing it with the cpuset is definitely more flexible.
> This way we can provide more fine grain control of what part of the "system"
> services are allowed to run on a cpuset. Rather that "catch all" system flag.
>
> Current sched-devel tree does not provide complete isolation at this point.
> There are still many things here and there that need to be added/fixed.
> Having finer control here helps.
>
> One concern I have is that this API conflicts with /proc/irq/X/smp_affinity.
> ie Setting smp_affinity manually will override affinity set by the cpuset.
> In other words I think
> int irq_set_affinity(unsigned int irq, cpumask_t cpumask)
> now needs to make sure that cpumask does not have cpus that do not belong to
> the cpuset this irq belongs to. Just like sched_setaffinity() does for the tasks.
The patch also needs to handle group destruction too; currently it
leaves cpuset pointers dangling. So it would either have to refuse
removing a group when there are still irqs associated, or move them to
the parent.
But yeah, this was just a quick hack to show the idea, glad you like it.
Will try to flesh it out a bit in the coming week.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/