Re: [PATCH] sysfs: add per pci device msi[x] irq listing (v3)

From: Bjorn Helgaas
Date: Thu Sep 29 2011 - 00:41:14 EST


On Wed, Sep 28, 2011 at 6:42 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
>
> On Wed, Sep 28, 2011 at 04:18:55PM -0600, Bjorn Helgaas wrote:
> > On Thu, Sep 22, 2011 at 8:32 AM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, Sep 22, 2011 at 07:54:28AM -0600, Matthew Wilcox wrote:
> > > > On Mon, Sep 19, 2011 at 11:47:15AM -0400, Neil Horman wrote:
> > > > > So a while back, I wanted to provide a way for irqbalance (and other apps) to
> > > > > definitively map irqs to devices, which, for msi[x] irqs is currently not really
> > > > > > possible in user space.  My first attempt did not go so well:
> > > > > https://lkml.org/lkml/2011/4/21/308
> > > > >
> > > > > > It was plagued by the same issues that prior attempts were, namely that it
> > > > > violated the one-file-one-value sysfs rule.  I wandered off but have recently
> > > > > come back to this.  I've got a new implementation here that exports a new
> > > > > > subdirectory for every pci device, called msi_irqs.  This subdirectory contains
> > > > > a variable number of numbered subdirectories, in which the number represents an
> > > > > msi irq.  Each numbered subdirectory contains attributes for that irq, which
> > > > > > currently is only the mode it is operating in (msi vs. msix).  I think this fits
> > > > > within the constraints sysfs requires, and will allow irqbalance to properly map
> > > > > msi irqs to devices without having to rely on rickety, best guess methods like
> > > > > interface name matching.
> > > >
> > > > This approach feels like building bigger rockets instead of a space
> > > > elevator :-)
> > > >
> > > In which case your comments make me think that you're trying to build the
> > > Death Star instead of buying more tie fighters :)
> > > https://docs.google.com/viewer?url=http://www.dau.mil/pubscats/ATL%20Docs/Sep-Oct11/Ward.pdf
> > >
> > > > What we need is to allow device drivers to ask for per-CPU interrupts,
> > > > and implement them in terms of MSI-X.  I've made a couple of stabs at
> > > > implementing this, but haven't got anything working yet.  It would solve
> > > Yes, IIRC you were trying to do this the first time I proposed this:
> > > https://lkml.org/lkml/2011/4/21/315
> > >
> > > > a number of problems:
> > > >
> > > That's great, but I don't see how this precludes what I'm trying to do here.  All
> > > this patch does is expose a definitive relationship between msi irqs and the pci
> > > devices that allocate them.  The kernel internal model used to allocate msi
> > > interrupts can change, the kobject creation and removal just has to change with
> > > it (presumably to create and destroy the msi irq kobjects when the individual
> > > irqs are allocated/freed, rather than in a batch).  I don't see why we should
> > > block enhancements to the existing msi implementation until you get the new
> > > model sorted, especially when this feature works equally well regardless of
> > > the model we use internally.
> >
> > Matthew, I don't understand this issue well enough to know whether
> > Neil's patch would get in the way of your planned enhancements, or
> > whether it would be baggage we won't want to maintain forever.  As far
> > as I can tell, the patch exposes an (IRQ -> device) mapping, which
> > would still be meaningful even with per-CPU interrupts.  Can you
> > educate me?
> >
> That's my view on the subject, on which I think I commented.  Matthew's
> enhancements are perfectly reasonable, but they're orthogonal to these changes.
> Regardless of the way they're allocated (Matthew's changes), there's still an
> association between the irq and the device (my changes).
>
> > Neil, why do you propose doing this just for MSI IRQs?  I would think
> > it'd be useful information for *all* IRQs, regardless of type, and
> > that exposing the mapping for all IRQs would make it easier for tools.
> >
> Because legacy (non-msi) irqs are already ostensibly exposed via
> /proc/bus/pci/devices/.../irq.  So non-msi irqs are already covered.

But that's a different mechanism, in a different directory hierarchy.
It seems like it could be easier for user-space if all types of IRQs
were exposed uniformly in sysfs, even if we had the leftover /proc/
stuff that only covers non-MSI IRQs. I guess one could argue that we
shouldn't have non-MSI IRQs in both places, since we can never remove
the /proc stuff anyway.
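
For illustration only, here is a rough sketch of how a user-space tool
such as irqbalance might consume the layout Neil describes (a msi_irqs
directory per PCI device, holding one numbered subdirectory per irq).
The root path /sys/bus/pci/devices, the attribute file name "mode", and
the function name map_msi_irqs are assumptions made for this example;
they are not taken from the patch itself.

```python
import os


def map_msi_irqs(devices_root="/sys/bus/pci/devices"):
    """Build {irq_number: (pci_device_name, mode)} from msi_irqs directories.

    Assumed layout (hypothetical, based on the description in this thread):
        <devices_root>/<device>/msi_irqs/<irq>/mode   -> "msi" or "msix"
    """
    mapping = {}
    if not os.path.isdir(devices_root):
        return mapping

    for dev in sorted(os.listdir(devices_root)):
        msi_dir = os.path.join(devices_root, dev, "msi_irqs")
        if not os.path.isdir(msi_dir):
            continue  # this device exposes no msi_irqs directory

        for entry in os.listdir(msi_dir):
            if not entry.isdigit():
                continue  # only numbered entries represent irqs

            # "mode" is an assumed attribute name; fall back to None
            # if it cannot be read.
            mode_path = os.path.join(msi_dir, entry, "mode")
            try:
                with open(mode_path) as f:
                    mode = f.read().strip()
            except OSError:
                mode = None

            mapping[int(entry)] = (dev, mode)

    return mapping
```

A tool would call map_msi_irqs() once and look up each irq number it
finds in /proc/interrupts.  Note that this only covers MSI/MSI-X irqs;
non-MSI irqs would still have to come from the separate mechanism, which
is the non-uniformity I am pointing at above.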

Bjorn