Re: [PATCH] qla2xxx: rewrite code to avoid hitting gcc bug 70646

From: James Bottomley
Date: Fri Apr 15 2016 - 17:25:17 EST


On Fri, 2016-04-15 at 23:09 +0200, Denys Vlasenko wrote:
> On 04/15/2016 09:05 PM, James Bottomley wrote:
> > On Fri, 2016-04-15 at 20:56 +0200, Denys Vlasenko wrote:
> > > On 04/15/2016 04:40 PM, James Bottomley wrote:
> > > > On Fri, 2016-04-15 at 12:36 +0200, Denys Vlasenko wrote:
> > > > > More info here:
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
> > > >
> > > > This bug is under investigation, so I'd rather not alter code
> > > > for a gcc bug until we know if we can supply options to fix it
> > > > rather than changing code.
> > >
> > >
> > > Background. The bug has existed in gcc for 2 years, but it is
> > > rather hard to trigger, so nobody noticed.
> >
> > We know this ... linux-scsi is on the cc for the other thread on
> > this.
> >
> > > Unfortunately for the kernel, these two commits landed in Linus'
> > > tree on March 16 and 17:
> > >
> > >
> > > On 04/13/2016 05:36 AM, Josh Poimboeuf wrote:
> > > > It occurs with the combination of the following two recent
> > > > commits:
> > > >
> > > > - bc27fb68aaad ("include/uapi/linux/byteorder, swab: force
> > > >   inlining of some byteswap operations")
> > > > - ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn
> > > >   access")
> > >
> > >
> > > and now *many* users of qla2x00 and new-ish gcc are going to
> > > very much notice it, as their kernels will start crashing
> > > reliably.
> > >
> > > The commits can be reverted, sure, but they per se do not contain
> > > anything unusual. Together with a not very typical construct in
> > > qla2x00_get_host_fabric_name, one which boils down to
> > > "swab64p(constant_array_of_8_bytes)", they just happen to nudge
> > > gcc in exactly the right way to finally trigger the bug.
> > >
> > > So I came up with another idea for how to forestall the imminent
> > > deluge of qla2x00 oops reports - this patch.
> >
> > There are actually a raft of checkers that run the upstream code
> > which aren't seeing any problem; likely because the code is harder
> > to trigger than you think. So, let's wait until the resolution of
> > the other thread before we panic, especially since we're only at
> > -rc3.
>
> I'm not panicking, James.
>
> By sending a workaround, I just want to make sure that *other people*
> won't be forced to fix up a problem which surfaced because of *my*
> patch.

Look, if gcc really proves to be intractable, I think what should
happen is revert the triggering patch, which is

commit e3bde9568d992c5f985e6e30731a5f9f9bef7b13
Author: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Date: Thu Mar 17 14:22:47 2016 -0700

    include/linux/unaligned: force inlining of byteswap operations

But, as I've said a couple of times now, there are no bug reports from
the testers about qla2xxx (yet) so we can afford to wait a bit and see
if there's some other resolution that doesn't involve changing kernel
code to work around a local gcc bug.

James