[PATCH] irq: sparse irq_desc[] support
From: Yinghai Lu
Date: Sat Nov 29 2008 - 02:13:03 EST
Impact: new CONFIG_SPARSE_IRQ feature, which makes irq_desc[] a sparse array
To support kernels with very large NR_CPUS and NR_IRQS settings,
we need to reduce the size of irq_desc[]. On x86, when NR_CPUS is
set to 4096, the irq_desc[] array wastes megabytes of RAM,
which is unacceptable overhead for generic distro kernels.
In v2.6.28 we already introduced a generic API to make access to
the irq_desc[] array more abstract - and to allow a different
data structure to underlie it. This patch finishes that process.
Core kernel changes:
- fix missing sparseirq API changes in various bits of core kernel code
(missing for_each_irq_desc primitives, missing checks for !desc, etc.)
- introduce a new data structure in the IRQ code, irq_desc_ptrs[],
together with its handling in the core IRQ code
- detach the IRQ statistics counters from kernel_stat and
attach them to irq_desc->kstat_irqs[], a dynamically allocated
array of pointers. (this can use percpu_alloc() in the
future, once percpu_alloc() becomes generic enough)
- detach the NR_IRQS array in random.c.
- interrupt remapping: when moving an IRQ on NUMA, reallocate the irq
descriptor so that we get proper NUMA-local memory for the descriptor,
for the irq_cfg entry and for the kstat_irqs array.
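
The core changes above can be sketched in user-space C roughly as
follows. This is only an illustration under stated assumptions, not the
patch's code: the sketch_* functions, the SKETCH_* constants and the
reduced struct layout are hypothetical names, calloc() stands in for the
kernel's NUMA-aware allocators, and locking is left out. It shows a
table of descriptor pointers that starts out empty, descriptors that are
allocated on first use together with their statistics counters, and the
!desc check that callers need once slots can be empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_IRQS 4096   /* hypothetical upper bound on IRQ numbers */
#define SKETCH_NR_CPUS 4      /* hypothetical CPU count */

/* Hypothetical, heavily reduced descriptor (not the kernel's layout). */
struct sketch_irq_desc {
    unsigned int irq;
    unsigned int *kstat_irqs;  /* per-CPU counters, allocated with the descriptor */
};

/* Sparse table: one pointer per possible IRQ, NULL until the IRQ is used. */
static struct sketch_irq_desc *sketch_desc_ptrs[SKETCH_NR_IRQS];

/* Lookup only: returns NULL for an IRQ that has no descriptor yet. */
static struct sketch_irq_desc *sketch_irq_to_desc(unsigned int irq)
{
    return irq < SKETCH_NR_IRQS ? sketch_desc_ptrs[irq] : NULL;
}

/* Lookup that allocates on first use; NULL on bad irq or allocation failure. */
static struct sketch_irq_desc *sketch_irq_to_desc_alloc(unsigned int irq)
{
    struct sketch_irq_desc *desc;

    if (irq >= SKETCH_NR_IRQS)
        return NULL;
    desc = sketch_desc_ptrs[irq];
    if (desc)
        return desc;

    desc = calloc(1, sizeof(*desc));
    if (!desc)
        return NULL;
    desc->kstat_irqs = calloc(SKETCH_NR_CPUS, sizeof(*desc->kstat_irqs));
    if (!desc->kstat_irqs) {
        free(desc);
        return NULL;
    }
    desc->irq = irq;
    sketch_desc_ptrs[irq] = desc;
    return desc;
}

/* Sum the counters of all allocated descriptors, skipping empty slots. */
static unsigned long sketch_total_irqs(void)
{
    unsigned long sum = 0;
    unsigned int irq, cpu;

    for (irq = 0; irq < SKETCH_NR_IRQS; irq++) {
        struct sketch_irq_desc *desc = sketch_irq_to_desc(irq);

        if (!desc)  /* an unused IRQ has no descriptor at all */
            continue;
        for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
            sum += desc->kstat_irqs[cpu];
    }
    return sum;
}
```

In this sketch the fixed cost is one pointer per possible IRQ; the
descriptor and its counters are only paid for on IRQs that are actually
used.
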
Architectures can enable this by setting the CONFIG_SPARSE_IRQ
config switch. The x86 architecture is extended/fixed to deal
with such an irq_desc[] model:
- io_apic irq_cfg[NR_IRQS] array is re-attached to desc->chip_data
- MSI virtual IRQ numbering is sanitized to go from the max upper
end of the physical IRQ range up towards NR_IRQS - instead of
coming down from the end of NR_IRQS.
- the max NR_IRQS calculations are re-tuned
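
The MSI numbering change can be illustrated with a small allocator
sketch. The names used here (sketch_nr_irqs_gsi, sketch_create_irq(),
sketch_destroy_irq(), SKETCH_NR_IRQS) and the in-use table are
assumptions made for illustration; the allocator in the patch itself is
not shown in this description. The only point is the search direction:
start just above the physical IRQ range and move up towards the limit,
rather than starting at the top and moving down.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_IRQS 64   /* hypothetical upper limit on IRQ numbers */

/* Hypothetical: first IRQ number above the physical IRQ range. */
static unsigned int sketch_nr_irqs_gsi = 24;

/* Hypothetical bookkeeping: which IRQ numbers are currently handed out. */
static bool sketch_irq_in_use[SKETCH_NR_IRQS];

/* Hand out the lowest free virtual IRQ at or above the physical range.
 * Returns the IRQ number, or -1 if the range is exhausted. */
static int sketch_create_irq(void)
{
    unsigned int irq;

    for (irq = sketch_nr_irqs_gsi; irq < SKETCH_NR_IRQS; irq++) {
        if (!sketch_irq_in_use[irq]) {
            sketch_irq_in_use[irq] = true;
            return (int)irq;
        }
    }
    return -1;
}

/* Release a virtual IRQ so that its number can be reused. */
static void sketch_destroy_irq(int irq)
{
    if (irq >= (int)sketch_nr_irqs_gsi && irq < SKETCH_NR_IRQS)
        sketch_irq_in_use[irq] = false;
}
```

Numbers handed out this way stay clustered just above the physical
range instead of being scattered near the top of the number space.
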
Architectures that do not specify CONFIG_SPARSE_IRQ do not need
to change anything - this is a transparent feature that is not
supposed to break any existing code.
Signed-off-by: Yinghai Lu <yinghai@xxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>