Re: [PATCH] ioat: fix tasklet tear down

From: Thomas Gleixner
Date: Thu Feb 20 2014 - 05:30:56 EST


On Wed, 19 Feb 2014, Dan Williams wrote:

> Since commit 77873803363c "net_dma: mark broken" we no longer pin dma
> engines active for the network-receive-offload use case. As a result
> the ->free_chan_resources() that occurs after the driver self-test no
> longer has a NET_DMA induced ->alloc_chan_resources() to back it up. A
> late firing irq can lead to ksoftirqd spinning indefinitely due to the
> tasklet_disable() performed by ->free_chan_resources(). Only
> ->alloc_chan_resources() can clear this condition in affected kernels.
>
> This problem has been present since commit 3e037454bcfa "I/OAT: Add
> support for MSI and MSI-X" in 2.6.24, but is now exposed. Given the
> NET_DMA use case is deprecated we can revisit moving the driver to use
> threaded irqs. For now, just tear down the irq and tasklet properly by:

Right, moving to threaded irqs would get rid of the whole tasklet
mess.

> 1/ Disable the irq from triggering the tasklet
>
> 2/ Disable the irq from re-arming
>
> 3/ Flush inflight interrupts
>
> 4/ Flush the timer
>
> 5/ Flush inflight tasklets
>
> References:
> https://lkml.org/lkml/2014/1/27/282
> https://lkml.org/lkml/2014/2/19/672
>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Reported-by: Mike Galbraith <bitbucket@xxxxxxxxx>
> Reported-by: Stanislav Fomichev <stfomichev@xxxxxxxxxxxxxx>
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
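
For anyone who wants to see the failure mode in isolation, here is a toy
userspace model of it. This is a sketch under loud assumptions: it is not
kernel code, every model_* name is made up for illustration, and the
softirq core is reduced to one function that puts a disabled-but-scheduled
tasklet back on the list instead of running it. It only illustrates why
tasklet_disable() alone can spin and why the five-step ordering from the
changelog does not.

```c
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
struct model_tasklet {
	int  disable_count;	/* > 0 models tasklet_disable() in effect */
	bool scheduled;		/* pending on the softirq list */
	int  runs;		/* how often the handler actually ran */
};

struct model_chan {
	struct model_tasklet task;
	bool run;		/* irq handler schedules the tasklet only if set */
	bool irq_armed;		/* device may re-arm its interrupt */
	bool timer_armed;	/* channel timer is pending */
};

/*
 * One softirq pass. A scheduled but disabled tasklet is not run, it
 * stays scheduled. Returns true if work is still pending afterwards.
 */
static bool model_softirq_pass(struct model_tasklet *t)
{
	if (!t->scheduled)
		return false;
	if (t->disable_count > 0)
		return true;	/* put back on the list, softirq re-raised */
	t->scheduled = false;
	t->runs++;
	return false;
}

/* Interrupt handler: only kicks the tasklet while the channel runs. */
static void model_irq(struct model_chan *c)
{
	if (c->run)
		c->task.scheduled = true;
}

/* Number of passes that left work pending, capped at limit. */
static int model_spin_count(struct model_tasklet *t, int limit)
{
	int passes = 0;

	while (passes < limit && model_softirq_pass(t))
		passes++;
	return passes;
}

/* Old behaviour: disable the tasklet and hope nothing fires later. */
static void model_teardown_old(struct model_chan *c)
{
	c->task.disable_count++;
}

/* Ordering from the changelog, steps 1/ to 5/. */
static void model_teardown_new(struct model_chan *c)
{
	c->run = false;		/* 1/ irq no longer triggers the tasklet */
	c->irq_armed = false;	/* 2/ irq can no longer re-arm */
	/* 3/ flush inflight irqs: nothing to wait for, model is serial */
	c->timer_armed = false;	/* 4/ flush the timer */
	while (model_softirq_pass(&c->task)) {
		/* 5/ flush inflight tasklets until none is pending */
	}
}
```

With the old teardown a late irq leaves the tasklet scheduled and
disabled, so model_spin_count() hits whatever limit you give it. With the
ordered teardown an inflight tasklet runs once and a late irq no longer
schedules anything.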

As Mike pointed out, tsi721_free_chan_resources() has the same issue.

I did a quick scan of all tasklet_disable() sites. The same broken
teardown, or similar wreckage, is present in:

drivers/atm/he.c
drivers/dma/at_hdmac.c
drivers/dma/pch_dma.c
drivers/input/keyboard/omap-keypad.c
drivers/isdn/gigaset/interface.c
drivers/media/pci/mantis/mantis_dvb.c
drivers/mmc/host/s3cmci.c
drivers/net/ethernet/jme.c
drivers/net/ethernet/silan/sc92031.c
drivers/net/usb/r8152.c
drivers/net/wireless/mwl8k.c
drivers/ntb/ntb_hw.c
drivers/rapidio/devices/tsi721_dma.c
drivers/s390/crypto/ap_bus.c
drivers/spi/spi-pl022.c
drivers/staging/cxt1e1/linux.c
drivers/staging/ozwpan/ozhcd.c
drivers/usb/gadget/fsl_qe_udc.c

That's 18 of 30 usage sites. Impressive....

We need to poke the relevant maintainers to get this solved.

Thanks,

tglx