Re: Fwd: [PATCH] x86_64: typo in __assign_irq_vector when update pos for vector and offset

From: Eric W. Biederman
Date: Mon Oct 16 2006 - 14:08:01 EST


"Yinghai Lu" <yinghai.lu@xxxxxxx> writes:

> Please check the patch.

It looks good. I think I was having a bad day when I introduced those thinkos.
I'm a bit distracted at the moment so I don't trust myself to forward
the patch on. Hopefully later today.

> Also I have a question about TARGET_CPUS in io_apic.c.
>
> for a 16 sockets system with 32 non coherent ht chain. and if every
> chain have 8 irq for devices, the genapic will use physflat. and it
> should use you new added "different cpu can have same vector for
> different irq". --- in the i8259.c
>
> but the setup_IOAPIC_irqs and arch_setup_ht_irq and arch_setup_msi_irq
> is still using TARGET_CPUS ( it is cpumask_of_cpu(0) for physflat), so
> the assign_irq_vector will not get vector for them, becase cpu 0 does
> not have that much vector to be alllocated. and later
> setup_affinity_xxx_irq can not be used because before irq is not there
> and show on /proc/interrupts.

Yep, that is a bug. I hadn't realized real systems that exhausted the
number of vectors were quite so close. That totally explains your
interest. That is one I/O heavy development system you have YH.

Unless there is a reason not to, TARGET_CPUS should be cpu_online map.
I don't know why it is restricted to a single cpu. I'm curious how
that would affect the user space irq balancer.

> So I want to
> 1. for arch_setup_ht_irq and arch_setup_msi_irq, we can use the dev it
> takes to get bus and use bus->sysdata to get bus->node mapping that is
> created in fillin_cpumask_to_bus, to get real target_cpus instead of
> cpu0.

That sounds like a very reasonable approach.

> 2. for ioapics, may need to add another array,
> ioapic_node[MAX_IOAPICS], and use ioapic address to get the numa node
> for it. So later can use it to get real targets cpus when need to use
> TARGET_CPUS.

Yes. Numa affinity for the ioapics seems to make sense. I don't
think we can count on looking at the address though? Are your
ioapics pci devices? If so that would be the easiest way to
deal with this problem.

> Please comments.

So far that is the job of the user space irqbalancer, but I don't see
why the kernel can't setup some reasonable defaults, if we have all of
the information needed.

For 2.6.19 we should be able to get my typos fixed, and probably
the default mask increased so that we are given a choice of something
other than cpu 0.

Beyond that it is going to take some additional working and thinking
and so it probably makes sense to have the code sit in the -mm
or Andi's tree for a while, and let it mature for 2.6.20.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/