Re: [openib-general] [PATCH 07/16] ehca: interrupt handling routines
From: Heiko J Schick
Date: Fri May 05 2006 - 09:06:28 EST
Hello Roland,
Roland Dreier wrote:
It seems that you are deferring completion event dispatch into threads
spread across all the CPUs. This seems like a very strange thing to
me -- you are adding latency and possibly causing cacheline pingpong.
It may help throughput in some cases to spread the work across
multiple CPUs but it seems strange to me to do this in the low-level
driver. My intuition would be that it would be better to do this in
the higher levels, and leave open the possibility for protocols that
want the lowest possible latency to be called directly from the
interrupt handler.
We've implemented this "spread CQ callbacks across multiple CPUs"
functionality to get better throughput on a SMP system, as you have
seen.
Originaly, we had the same idea as you mentioned, that it would be better
to do this in the higher levels. The point is that we can't see so far
any simple posibility how this can done in the OpenIB stack, the TCP/IP
network layer or somewhere in the Linux kernel.
For example:
For IPoIB we get the best throughput when we do the CQ callbacks on
different CPUs and not to stay on the same CPU.
In other papers and slides (see [1]) you can see similar approaches.
I think such one implementation or functionality could require more
or less a non-trivial changes. This could be also releated to other
I/O traffic.
[1]: Speeding up Networking, Van Jacobson and Bob Felderman,
http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf
Regards,
Heiko
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/