Re: [PATCH 3.10 090/180] xfs: xfs_iflush_cluster fails to abort on error
From: Dave Chinner
Date: Mon Aug 22 2016 - 06:55:47 EST
On Mon, Aug 22, 2016 at 07:18:26AM +0200, Willy Tarreau wrote:
> Hi Dave,
>
> On Mon, Aug 22, 2016 at 02:21:08PM +1000, Dave Chinner wrote:
> > > - if (error || !bp) {
> > > + if (error == -EAGAIN) {
> >
> > Wrong. Errors changed sign in XFS in 3.17.
>
> Ah my bad, sorry for this.
>
> > /rant
> >
> > So, after just having to point this out (again!) for a different
> > stable kernel patchset review, and this specific problem causing
> > user-reported stable kernel regression and filesystem corruption
> > *months ago*. That resulted in discussion and new stable commits to
> > fix the problem. So now I'm left to wonder about the process of
> > stable kernels.
>
> Yep I remember this discussion now, I'm sorry.
>
> > AFAICT, stable kernel maintainers are not watching what happens with
> > other stable kernels, nor are they talking to other stable kernel
> > maintainers. I should not have to tell every single stable kernel
> > maintainer that a specific patch needs to be changed after it's
> > already been reported broken, triaged and fixed in other stable
> > kernels. You've all got a record that the patch needs to be included
> > in a stable kernel, but nobody seems to notice when it comes to
> > fixing problems with a stable patch even when that all happens on
> > stable@xxxxxxxxxxxxxxxx
> >
> > Seriously, guys, pick up your act a bit and start talking between
> > yourselves and tracking regressions and fixes so the burden of
> > catching known reported and fixed problems with backports doesn't
> > rely on the upstream developers noticing the problem when hundreds
> > of patches for random stable kernels go past on lkml every week...
>
> We definitely do exchange quite a bit and I pick patches from 3.14 for
> 3.10, but sometimes I can simply pick the original one for various
> reasons (eg: if I had queued its upstream ID earlier). That's also why
> the review process helps. I'm sincerely sorry that I failed on this one
> and that you had to deal with it again, I'm going to fix it now.

Ok, I didn't notice that the fix from 3.14 was further down the
queue. I put a procmail filter in to catch this patch on lkml,
so I didn't see it in the context of the entire series (way too much
traffic on lkml to keep up with it). So I probably pulled the
trigger a little early.

I agree that it would be best to combine the two patches so there
isn't a bisection point that could result in corruptions...
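
For anyone following along, the sign problem in the quoted hunk can be
sketched roughly as below. This is a standalone illustration only, not
the actual xfs_iflush_cluster() code, and the helper names are made up
for the example. The assumption it encodes is the one stated above:
before the 3.17 conversion, XFS-internal functions returned positive
errno values, so a check backported to 3.10 has to compare against
EAGAIN rather than -EAGAIN.

```c
#include <errno.h>

/*
 * Hypothetical helpers, for illustration only.
 *
 * Pre-3.17 XFS convention: internal errors are positive errnos.
 * A check written for a 3.10 backport must therefore look like this.
 */
static int is_eagain_pre_317(int error)
{
	return error == EAGAIN;
}

/*
 * 3.17+ XFS convention: errors are negative errnos, as in the rest of
 * the kernel. This is the form the upstream patch uses; copied
 * unchanged into 3.10 it never matches a positive EAGAIN.
 */
static int is_eagain_post_317(int error)
{
	return error == -EAGAIN;
}
```

With a positive EAGAIN coming back from a 3.10-era function, the second
form silently falls through, which is how the unmodified backport ends
up missing the case it was meant to handle.
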
Cheers,
Dave.
--
Dave Chinner
dchinner@xxxxxxxxxx