Re: Workqueue change with commit id,1bd04bf6f breaks mpt3sas scsi driver

From: Thomas Gleixner
Date: Fri Mar 25 2016 - 04:33:43 EST


On Thu, 24 Mar 2016, Tejun Heo wrote:
> On Wed, Mar 23, 2016 at 10:28:16PM -0400, nick wrote:
> > It seems that commit 1bd04bf6f breaks the mpt3sas driver, according to this bug report:
> > https://bugzilla.kernel.org/show_bug.cgi?id=114611. The driver itself looks fine: from
> > inspection, all the driver functions are wrappers around queue_delayed_work, and
> > according to the reporter's debugging this commit breaks it. However, they are not sure
> > whether it's the driver or the timer subsystem. I am assuming it's the timer subsystem, as
> > the driver is just using wrapper functions around core workqueue functions with irqs
> > disabled and a spin lock held.
>
> Hmmm... 1bd04bf6f68d ("timer: Remove FIFO "guarantee"") doesn't look
> like an easy change to undo and it sounds like the driver was already
> (subtly) broken even before the commit given that ordered workqueues
> are not affine to any CPU and timer expirations across different CPUs
> aren't strictly ordered. Unfortunately, the only way forward seems to
> be implementing ordering from the driver's side.

The changelog explains in detail that the FIFO "guarantee" has not existed for
a long time. Certainly that commit removed the last remnants of that so-called
guarantee, but anything relying on FIFO ordering of the timer wheel was
broken before that commit. Just because it "worked" before that commit does
not mean it was correct.
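
To illustrate what "implementing ordering from the driver's side" could look
like, here is a minimal userspace sketch, not kernel code and not taken from
mpt3sas. All names (fw_event, event_fifo, handle_next, run_demo) are
hypothetical. The assumption is that the driver records submission order in
its own FIFO and that each callback handles the oldest queued event, no matter
which timer happened to expire first.

```c
#include <stdio.h>
#include <stddef.h>

#define MAX_EVENTS 8

/* Hypothetical driver event; illustrative only, not an mpt3sas structure. */
struct fw_event {
	int id;
};

/* Driver-owned FIFO: submission order is recorded here, not in the timer. */
struct event_fifo {
	struct fw_event *slots[MAX_EVENTS];
	size_t head, tail;
};

/* Append an event; returns -1 if the FIFO is full. */
static int fifo_push(struct event_fifo *f, struct fw_event *ev)
{
	if (f->tail - f->head == MAX_EVENTS)
		return -1;
	f->slots[f->tail++ % MAX_EVENTS] = ev;
	return 0;
}

/* Remove and return the oldest event, or NULL if the FIFO is empty. */
static struct fw_event *fifo_pop(struct event_fifo *f)
{
	if (f->head == f->tail)
		return NULL;
	return f->slots[f->head++ % MAX_EVENTS];
}

/*
 * Stand-in for a work callback. It ignores which timer fired and always
 * handles the oldest queued event. Returns the event id, or -1 if empty.
 */
static int handle_next(struct event_fifo *f)
{
	struct fw_event *ev = fifo_pop(f);

	return ev ? ev->id : -1;
}

/*
 * Queue events 1, 2, 3, then simulate their timers expiring in the order
 * 2, 0, 1 (by submission index). out[] receives the handled event ids.
 */
static void run_demo(int out[3])
{
	struct event_fifo fifo = { .head = 0, .tail = 0 };
	struct fw_event evs[3] = { { 1 }, { 2 }, { 3 } };
	int fire_order[3] = { 2, 0, 1 };
	int i;

	for (i = 0; i < 3; i++)
		fifo_push(&fifo, &evs[i]);

	for (i = 0; i < 3; i++) {
		out[i] = handle_next(&fifo);
		printf("timer %d fired, handled event %d\n",
		       fire_order[i], out[i]);
	}
}
```

In a real driver the FIFO would need a lock, and the callbacks would have to
be serialized, for example by an ordered workqueue. The sketch only shows
where the ordering information lives.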

Thanks,

tglx