Re: [PATCH] sparse_irq aka dyn_irq v13

From: Mike Travis
Date: Thu Nov 13 2008 - 19:15:33 EST


David Miller wrote:
> From: Mike Travis <travis@xxxxxxx>
> Date: Thu, 13 Nov 2008 15:11:29 -0800
>
>> David Miller wrote:
>>> From: Paul Mackerras <paulus@xxxxxxxxx>
>>> Date: Fri, 14 Nov 2008 09:19:13 +1100
>>>
>>>> Andrew Morton writes:
>>>>
>>>>> Other architectures want (or have) sparse interrupts. Are those guys
>>>>> paying attention here?
>>>> On powerpc we have a mapping from virtual irq numbers (in the range 0
>>>> to NR_IRQS-1) to physical irq numbers (which can be anything) and back
>>>> again. I think our approach is simpler than what's being proposed
>>>> here, though we don't try to keep the irqdescs node-local as this
>>>> patch seems to (fortunately our big systems aren't so NUMA-ish as to
>>>> make that necessary).
>>> This is exactly what sparc64 does as well, same as powerpc, and
>>> as Paul said it's so much incredibly simpler than the dyn_irq stuff.


>> One problem is that pre-defining a static NR_IRQ count is almost always
>> wrong when the NR_CPUS count is large, and should be adjusted as resources
>> require.
>
> We use a value of 256 and I've been booting linux on 128 cpu sparc64
> systems with lots of PCI-E host controllers (and others have booted it
> on even larger ones). All of which have several NUMA domains.
>
> It's not an issue.

Are you saying that having a fixed count of IRQs is not an issue? With
NR_CPUS=4096, what would you fix it to? (Currently it's NR_CPUS * 32,
but that might not be sufficient.) Would NR_CPUS=16384 make it an issue?
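
To put rough numbers on the question, here is a minimal sketch of the
arithmetic. The NR_CPUS * 32 multiplier is the one mentioned above; the
256-byte descriptor size and both helper names are assumptions made up
for illustration, since the real sizeof(struct irq_desc) depends on the
kernel configuration.

```c
#include <stddef.h>

/* ASSUMPTION: placeholder size, not the real sizeof(struct irq_desc). */
#define ASSUMED_DESC_BYTES 256UL

/* Static IRQ count under the current NR_CPUS * 32 rule. */
static unsigned long nr_irqs_for(unsigned long nr_cpus)
{
	return nr_cpus * 32UL;
}

/* Bytes a statically sized descriptor table would occupy (hypothetical). */
static unsigned long static_table_bytes(unsigned long nr_cpus)
{
	return nr_irqs_for(nr_cpus) * ASSUMED_DESC_BYTES;
}
```

Under those assumptions NR_CPUS=4096 gives 131072 descriptors (32 MiB)
and NR_CPUS=16384 gives 524288 descriptors (128 MiB), all reserved up
front whether or not the interrupts exist.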

>
>> Large UV systems will take a performance hit from off-node accesses
>> when the CPU count (or more likely the NODE count) reaches some
>> threshold. So keeping as much interrupt context close to the
>> interrupting source is a good thing.
>
> Just because the same piece of information is repeated over and
> over again doesn't mean it really matters.

Which information is repeated over and over? I was under the
impression that each and every interrupt writes to the irq_desc
entry for that irq. If that entry sits in one big array on node 0,
every interrupt taken on another node sends that write across the
system bus.

Or am I missing what you're getting at?
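
To make the locality concern concrete, here is a toy user-space model,
not kernel code. Every name in it is hypothetical, and it only counts
descriptor writes that land on a node other than the one that raised
the interrupt. It says nothing about how expensive such a write is.

```c
#include <stddef.h>

/* Toy stand-in for a descriptor; NOT the kernel's struct irq_desc. */
struct toy_irq_desc {
	int home_node;        /* node whose memory holds this descriptor */
	unsigned long count;  /* updated on every interrupt */
};

/* Handle one interrupt raised on src_node; returns 1 if the write is off-node. */
static int toy_handle_irq(struct toy_irq_desc *desc, int src_node)
{
	desc->count++;
	return desc->home_node != src_node;
}

/* Deliver one interrupt per descriptor and count the off-node writes. */
static unsigned long toy_count_remote(struct toy_irq_desc *descs,
				      const int *src_nodes, size_t n)
{
	unsigned long remote = 0;
	size_t i;

	for (i = 0; i < n; i++)
		remote += (unsigned long)toy_handle_irq(&descs[i], src_nodes[i]);
	return remote;
}
```

With four interrupt sources on nodes 0 to 3, a table whose entries all
have home_node 0 takes three off-node writes per round, while entries
placed on their source's node take none.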

Thanks,
Mike


