[PATCH 3.13.y-ckt 016/136] EDAC: Robustify workqueues destruction

From: Kamal Mostafa
Date: Tue Feb 02 2016 - 14:04:15 EST


3.13.11-ckt34 -stable review patch. If anyone has any objections, please let me know.

---8<------------------------------------------------------------

From: Borislav Petkov <bp@xxxxxxx>

commit fcd5c4dd8201595d4c598c9cca5e54760277d687 upstream.

EDAC workqueue destruction is really fragile. We cancel the delayed
work, but if it is still running and requeues itself, we go ahead and
destroy the workqueue anyway, and the queued work then explodes when
the workqueue core attempts to run it.

Make the destruction more robust by switching op_state to offline so
that requeuing stops. Cancel any pending work *synchronously* too.

EDAC i7core: Driver loaded.
general protection fault: 0000 [#1] SMP
CPU 12
Modules linked in:
Supported: Yes
Pid: 0, comm: kworker/0:1 Tainted: G IE 3.0.101-0-default #1 HP ProLiant DL380 G7
RIP: 0010:[<ffffffff8107dcd7>] [<ffffffff8107dcd7>] __queue_work+0x17/0x3f0
< ... regs ...>
Process kworker/0:1 (pid: 0, threadinfo ffff88019def6000, task ffff88019def4600)
Stack:
...
Call Trace:
call_timer_fn
run_timer_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
smp_apic_timer_interrupt
apic_timer_interrupt
intel_idle
cpuidle_idle_call
cpu_idle
Code: ...
RIP __queue_work
RSP <...>

Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Signed-off-by: Kamal Mostafa <kamal@xxxxxxxxxxxxx>
---
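Reviewer note (not part of the commit): the ordering described in the
changelog can be illustrated with a deliberately simplified,
single-threaded userspace model. This is only a sketch under stated
assumptions: struct poller, poll_fn(), teardown_old(), teardown_new()
and stale_work_after_teardown() are hypothetical names invented for
illustration, not kernel APIs, and the real race involves timers and
concurrency that this model flattens into a fixed call order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
enum op_state { OP_RUNNING_POLL, OP_OFFLINE };

struct poller {
	enum op_state op_state;
	bool pending;	/* models a delayed work item sitting on the queue */
};

/* Polling handler: re-arms itself unless the device has gone offline. */
static void poll_fn(struct poller *p)
{
	p->pending = false;		/* the item has been dequeued to run */
	if (p->op_state != OP_OFFLINE)
		p->pending = true;	/* models the handler requeueing itself */
}

/* Old ordering: cancel, but a handler already running requeues afterwards. */
static void teardown_old(struct poller *p)
{
	p->pending = false;	/* models a cancel that does not wait */
	poll_fn(p);		/* in-flight handler finishes after the cancel */
}

/* New ordering: go offline first, then let the handler finish and cancel. */
static void teardown_new(struct poller *p)
{
	p->op_state = OP_OFFLINE;	/* requeueing stops from here on */
	poll_fn(p);			/* in-flight handler finishes */
	assert(!p->pending);		/* it must not have re-armed itself */
	p->pending = false;		/* models the synchronous cancel */
}

/* Returns true if work is still queued once teardown has returned. */
bool stale_work_after_teardown(bool use_new_ordering)
{
	struct poller p = { .op_state = OP_RUNNING_POLL, .pending = true };

	if (use_new_ordering)
		teardown_new(&p);
	else
		teardown_old(&p);
	return p.pending;
}
```

In this model, stale_work_after_teardown(false) leaves an item queued
after teardown, which corresponds to the work that later blows up in
__queue_work, while stale_work_after_teardown(true) leaves nothing
behind.
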
 drivers/edac/edac_device.c | 11 ++++-------
 drivers/edac/edac_mc.c     | 14 +++-----------
 drivers/edac/edac_pci.c    |  9 ++++-----
 3 files changed, 11 insertions(+), 23 deletions(-)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 592af5f..5358737 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -435,16 +435,13 @@ void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
  */
 void edac_device_workq_teardown(struct edac_device_ctl_info *edac_dev)
 {
-	int status;
-
 	if (!edac_dev->edac_check)
 		return;

-	status = cancel_delayed_work(&edac_dev->work);
-	if (status == 0) {
-		/* workq instance might be running, wait for it */
-		flush_workqueue(edac_workqueue);
-	}
+	edac_dev->op_state = OP_OFFLINE;
+
+	cancel_delayed_work_sync(&edac_dev->work);
+	flush_workqueue(edac_workqueue);
 }

 /*
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 33edd67..19dc0bc 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -584,18 +584,10 @@ static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec,
  */
 static void edac_mc_workq_teardown(struct mem_ctl_info *mci)
 {
-	int status;
-
-	if (mci->op_state != OP_RUNNING_POLL)
-		return;
-
-	status = cancel_delayed_work(&mci->work);
-	if (status == 0) {
-		edac_dbg(0, "not canceled, flush the queue\n");
+	mci->op_state = OP_OFFLINE;

-		/* workq instance might be running, wait for it */
-		flush_workqueue(edac_workqueue);
-	}
+	cancel_delayed_work_sync(&mci->work);
+	flush_workqueue(edac_workqueue);
 }

 /*
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index 2cf44b4d..b4b3860 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -274,13 +274,12 @@ static void edac_pci_workq_setup(struct edac_pci_ctl_info *pci,
  */
 static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 {
-	int status;
-
 	edac_dbg(0, "\n");

-	status = cancel_delayed_work(&pci->work);
-	if (status == 0)
-		flush_workqueue(edac_workqueue);
+	pci->op_state = OP_OFFLINE;
+
+	cancel_delayed_work_sync(&pci->work);
+	flush_workqueue(edac_workqueue);
 }

 /*
--
1.9.1