Re: [PATCH] gfs2: Revert "Fix loop in gfs2_rbm_find"

From: Linus Torvalds
Date: Thu Jan 31 2019 - 13:41:20 EST


On Wed, Jan 30, 2019 at 12:30 PM Andreas Gruenbacher
<agruenba@xxxxxxxxxx> wrote:
>
> This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.
>
> It turns out that the fix can lead to a ~20 percent performance regression
> in initial writes to the page cache according to iozone. Let's revert this
> for now to have more time for a proper fix.

Ugh. I think part of the problem is that the

n += (rbm->bii - initial_bii);

doesn't make any sense when we just set rbm->bii to 0 a couple of
lines earlier. So it's basically a really odd way to write

n -= initial_bii;

which in turn really doesn't make any sense _either_.

So I'l all for reverting the commit because it caused a performance
regression, but the end result really is all kinds of odd.

Is the bug as simple as "we incremented the iterator counter twice"?
Because in the -E2BIG case, we first increment it by the difference in
bii, but then we *also* increment it in res_covered_end_of_rgrp()
(which we do *not* do for the "ret > 0" case, which goes to
next_iter).

So if somebody can run the performance test again, how does it look if
*instead* of the revert, we do what looks at least _slightly_ more
sensible, and move the "increment iteration count" up to the
next-bitmap case instead?

At that point, it will actually match the bii updates (because
next_bitmap also updates rbm->bii). Hmm?

Something like the whitespace-damaged thinig below. Completely
untested. But *if* this works, it would make more sense than the
revert..

Hmm?

Linus

--- snip snip ---
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 831d7cb5a49c..5b1006d5344f 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1796,10 +1796,10 @@ static int gfs2_rbm_find(struct gfs2_rbm
*rbm, u8 state, u32 *minext,
rbm->bii++;
if (rbm->bii == rbm->rgd->rd_length)
rbm->bii = 0;
+ n++;
res_covered_end_of_rgrp:
if ((rbm->bii == 0) && nowrap)
break;
- n++;
next_iter:
if (n >= iters)
break;