Re: PCI device driver broken between 4.2 and 4.3

From: Bjorn Helgaas
Date: Tue Feb 02 2016 - 11:13:13 EST


On Tue, Feb 02, 2016 at 08:04:39AM +0300, Oleg Moroz wrote:
> it looks much better with pci=routeirq
>
> [ 100.896723] *Before pci_enable_device IRQ 20*
> [ 100.896735] *After pci_enable_device IRQ 20*
> [ 100.896745] *Before pci_enable_device IRQ 21*
> [ 100.896752] *After pci_enable_device IRQ 21*

If pci=routeirq makes a difference, it usually means your driver is
looking at dev->irq before it calls pci_enable_device(). I looked at
what I think is your driver
(https://github.com/qmor/elcus-1553-driver-linux/blob/master/driver/tmk1553.c),
but I didn't *see* that problem. It does use a highly unconventional
strategy of calling pci_get_device() to locate devices, instead of
using pci_register_driver() like normal drivers do.

You should not have to use pci=routeirq, and I've even considered
removing the option.
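
To make the ordering issue concrete, here is a small user-space model
(not kernel code). The struct and function names are hypothetical
stand-ins for struct pci_dev and pci_enable_device(), and the model
assumes that IRQ routing happens at enable time, so the irq field is
only final after the enable step:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few struct pci_dev fields that matter here. */
struct model_pci_dev {
    unsigned int irq;        /* what a driver would read as dev->irq */
    unsigned int routed_irq; /* value assigned when routing happens (assumed) */
    bool enabled;
};

/* Models the enable step: routing happens here, so irq becomes final. */
int model_enable_device(struct model_pci_dev *dev)
{
    dev->irq = dev->routed_irq;
    dev->enabled = true;
    return 0;
}

/* Problematic ordering: caches irq before the device is enabled. */
unsigned int probe_irq_too_early(struct model_pci_dev *dev)
{
    unsigned int irq = dev->irq; /* may still be the boot-time value */

    if (model_enable_device(dev))
        return 0;
    return irq; /* stale if routing changed dev->irq */
}

/* Conventional ordering: enable first, then read irq. */
unsigned int probe_irq_after_enable(struct model_pci_dev *dev)
{
    if (model_enable_device(dev))
        return 0;
    assert(dev->enabled);
    return dev->irq;
}
```

Using 5 and 20 from the logs in this thread purely as sample values
(irq = 5, routed_irq = 20), probe_irq_too_early() returns 5 while
probe_irq_after_enable() returns 20.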

> On Monday 01 of February 2016 15:08:23 Bjorn Helgaas wrote:
> > [+cc Yinghai]
> >
> > On Mon, Feb 01, 2016 at 08:18:35AM +0300, Oleg Moroz wrote:
> > > Okay. I started with driver-level printks. The results are:
> > >
> > > On 4.2:
> > >
> > > [414006.575989] Before pci_enable_device IRQ 20
> > > [414006.575991] After pci_enable_device IRQ 20
> > > [414006.575997] Before pci_enable_device IRQ 21
> > > [414006.575999] After pci_enable_device IRQ 21
> > >
> > > On 4.3:
> > >
> > > [ 114.862289] Before pci_enable_device IRQ 5
> > > [ 114.862303] After pci_enable_device IRQ 5
> > > [ 114.862316] Before pci_enable_device IRQ 5
> > > [ 114.862326] After pci_enable_device IRQ 5
> > >
> > > I have two cards, which is why pci_enable_device() is called twice.
> >
> > Did you try booting with pci=routeirq as Yinghai suggested? That's
> > not a fix, but if it does make things work, it may give us an idea for
> > how to fix it correctly.
> >
> > > On Friday 29 of January 2016 10:31:59 Bjorn Helgaas wrote:
> > > > > On Thu, Jan 28, 2016 at 10:28:14PM +0300, Moroz Oleg wrote:
> > > > > > What do I need to print out first?
> > > >
> > > > Jiang, can you chime in here?
> > > >
> > > > > 991de2e59090 is related to IRQs, so I'd start by printing dev->irq in
> > > > > your driver before and after you call pci_enable_device(). Add some
> > > > > printks in pcibios_alloc_irq() and pcibios_enable_device() just to
> > > > > confirm that we got there and when, e.g., add lines like this:
> > > > >
> > > > >   dev_info(&dev->dev, "%s\n", __func__);
> > > >
> > > > Bjorn
> > > >
> > > > > 27 ÑÐÐ. 2016 Ð. 16:22 ÐÐÐÑÐÐÐÐÑÐÐÑ Bjorn Helgaas <helgaas@xxxxxxxxxx>
> > >
> > > ÐÐÐÐÑÐÐ:
> > > > > > > On Wed, Jan 27, 2016 at 12:38:06PM +0300, Moroz Oleg wrote:
> > > > > > > > Also, my driver has no
> > > > > > >
> > > > > > > pcibios_enable_device()
> > > > > > > pcibios_alloc_irq()
> > > > > > >
> > > > > > > calls.
> > > > > >
> > > > > > > Those are internal interfaces used by the PCI core. Drivers
> > > > > > > shouldn't call them directly. Drivers normally call
> > > > > > > pci_enable_device(), and those internal interfaces are used in
> > > > > > > that path.
> > > > > >
> > > > > > > 26.01.2016 22:05, ÐÐÐÐ ÐÐÑÐÐ ÐÐÑÐÑ:
> > > > > > > >I confirmed it works in
> > > > > > > >
> > > > > > > >890e4847587f
> > > > > > > >
> > > > > > > >and do not works in
> > > > > > > >
> > > > > > > >991de2e59090
> > > > > > > >
> > > > > > > > >26.01.2016 18:32, Bjorn Helgaas wrote:
> > > > > > > >>[+cc Jiang]
> > > > > > > >>
> > > > > > > >>On Mon, Jan 25, 2016 at 03:52:51PM -0600, Bjorn Helgaas wrote:
> > > > > > > > >>>Hi Oleg,
> > > > > > > > >>>
> > > > > > > > >>>On Sun, Jan 24, 2016 at 04:50:08PM +0300, Oleg Moroz wrote:
> > > > > > > >>>>Okay. I've sent logs (dmesg and lspci) from both 4.2 and 4.3
> > > > > > > >>>>to bugzilla
> > > > > > > >>>
> > > > > > > > >>>I don't see anything wrong in either log. Both v4.2 and v4.3
> > > > > > > > >>>enumerate the device the same way, and the driver seems to
> > > > > > > > >>>claim it the same way:
> > > > > > > >>> pci 0000:0d:00.0: [10b5:9030] type 00 class 0x078000
> > > > > > > >>> pci 0000:0d:00.0: reg 0x14: [io 0x2100-0x217f]
> > > > > > > >>> pci 0000:0d:00.0: reg 0x18: [io 0x2380-0x239f]
> > > > > > > >>> pci 0000:0d:00.0: PME# supported from D0 D3hot
> > > > > > > >>> pci 0000:0d:01.0: [10b5:9030] type 00 class 0x078000
> > > > > > > >>> pci 0000:0d:01.0: reg 0x14: [io 0x2180-0x21ff]
> > > > > > > >>> pci 0000:0d:01.0: reg 0x18: [io 0x23a0-0x23bf]
> > > > > > > >>> pci 0000:0d:01.0: PME# supported from D0 D3hot
> > > > > > > >>> pci 0000:0d:02.0: [10b5:9030] type 00 class 0x078000
> > > > > > > >>> pci 0000:0d:02.0: reg 0x14: [io 0x2200-0x227f]
> > > > > > > >>> pci 0000:0d:02.0: reg 0x18: [io 0x2280-0x22ff]
> > > > > > > >>> pci 0000:0d:02.0: reg 0x1c: [io 0x2300-0x237f]
> > > > > > > >>> pci 0000:0d:02.0: PME# supported from D0 D3hot
> > > > > > > >>>
> > > > > > > > >>> sja1000_plx_pci 0000:0d:02.0: Detected "Eclus CAN-200-PCI" card at slot #2
> > > > > > > > >>> sja1000_plx_pci 0000:0d:02.0: Channel #1 at 0x0000000000012280, irq 22 registered as can0
> > > > > > > > >>> sja1000_plx_pci 0000:0d:02.0: Channel #2 at 0x0000000000012300, irq 22 registered as can1
> > > > > > > > >>> sja1000_plx_pci 0000:0d:02.0 can0: setting BTR0=0x03 BTR1=0x37
> > > > > > > >>>
> > > > > > > > >>>One option is always to bisect between v4.2 and v4.3 to see
> > > > > > > > >>>which commit made it stop working. See
> > > > > > > > >>>https://git-scm.com/docs/git-bisect
> > > > > > > >>
> > > > > > > > >>Jiang, Oleg bisected this to 991de2e59090 ("PCI, x86: Implement
> > > > > > > > >>pcibios_alloc_irq() and pcibios_free_irq()").
> > > > > > > > >>
> > > > > > > > >>Oleg, please double-check and confirm that 890e4847587f works
> > > > > > > > >>and 991de2e59090 fails.
> > > > > > > > >>
> > > > > > > > >>Then please add some printks in the pcibios_enable_device() and
> > > > > > > > >>pcibios_alloc_irq() paths and in your driver to see exactly what
> > > > > > > > >>changed between 890e4847587f and 991de2e59090.
> > > > > > > >>
> > > > > > > >>Bjorn
> > > > > > > >>
> > > > > > > > >>>>23.01.2016 17:54, Bjorn Helgaas wrote:
> > > > > > > > >>>>>[+cc linux-kernel]
> > > > > > > > >>>>>
> > > > > > > > >>>>>Hi Oleg,
> > > > > > > > >>>>>
> > > > > > > > >>>>>On Sat, Jan 23, 2016 at 1:08 AM, Oleg Moroz
> > > > > > > > >>>>><oleg.moroz@xxxxxxxxxxxxx> wrote:
> > > > > > > > >>>>>>Hello. I have a device driver for a MIL-1553b card
> > > > > > > > >>>>>>called TA1-PCI, which can be found at
> > > > > > > > >>>>>>https://github.com/qmor/elcus-1553-driver-linux
> > > > > > > > >>>>>>The card uses a PLX_PCI9030 PCI controller.
> > > > > > > > >>>>>>Today I found that this driver compiles and installs,
> > > > > > > > >>>>>>but does not work as it should.
> > > > > > > > >>>>>>It looks like it does not receive any interrupts from
> > > > > > > > >>>>>>PCI. I tested it again with kernel 4.2 and it works
> > > > > > > > >>>>>>okay. What changes were made in the PCI subsystem
> > > > > > > > >>>>>>between 4.2 and 4.3 that could have affected this
> > > > > > > > >>>>>>driver?
> > > > > > > >>>>>
> > > > > > > > >>>>>Thank you very much for this problem report. There were many
> > > > > > > > >>>>>PCI changes between v4.2 and v4.3, and without more
> > > > > > > > >>>>>information, I can't guess what might be causing this problem.
> > > > > > > > >>>>>
> > > > > > > > >>>>>I opened a bug report at
> > > > > > > > >>>>>https://bugzilla.kernel.org/show_bug.cgi?id=111211
> > > > > > > > >>>>>
> > > > > > > > >>>>>Please attach complete dmesg logs for both v4.2 and v4.3 to
> > > > > > > > >>>>>that bug report. Also, please attach the complete "lspci -vv"
> > > > > > > > >>>>>output (as root).
> > > > > > > >>>>>
> > > > > > > >>>>>Thanks!
> > > > > > > >>>>>
> > > > > > > >>>>>Bjorn
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> Best regards,
> Oleg Moroz
> Deputy Head of the Software Development Department