Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

From: George Spelvin
Date: Thu Feb 12 2015 - 03:15:53 EST


> Rasmus, your version has ANDing by mask, and resetting the mask at each iteration
> of main loop. I think we can avoid it. What do you think on next?

Yes, that's basically what I proposed (modulo checking for zero size and
my buggy LAST_WORD_MASK).

But two unconditional instructions in the loop are awfully minor; it's
loads and conditional branches that cost.

The reset of the mask can be done in parallel with other operations; it's
only the AND that actually takes a cycle.
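
To make the point concrete, here is a minimal userspace sketch of the loop
shape under discussion. It is not the patch's code: the name
sketch_find_next_bit is hypothetical, BITS_PER_LONG is redefined locally,
and __builtin_ctzl (a GCC/Clang builtin) stands in for the kernel's __ffs().

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical name; a userspace sketch, not the kernel's find_next_bit(). */
static size_t sketch_find_next_bit(const unsigned long *addr, size_t size,
				   size_t start)
{
	unsigned long mask, word;
	size_t i;

	if (start >= size)
		return size;

	i = start / BITS_PER_LONG;
	/* Hide the bits below 'start'; this matters for the first word only. */
	mask = ~0UL << (start % BITS_PER_LONG);

	for (;;) {
		word = addr[i] & mask;	/* the AND: depends on the load */
		mask = ~0UL;		/* the reset: depends on nothing */
		if (word)
			break;
		if (++i * BITS_PER_LONG >= size)
			return size;
	}

	/* Assumption: GCC/Clang builtin; word is known to be nonzero here. */
	start = i * BITS_PER_LONG + (size_t)__builtin_ctzl(word);
	/* Set bits past 'size' in the last word do not count. */
	return start < size ? start : size;
}
```

The load and the two conditional branches are there whether or not the
mask is applied inside the loop; the AND and the reset are the only
instructions that hoisting the first word out of the loop would save.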

I can definitely see the argument that, for code that's not used often
enough to stay resident in the L1 cache, any speedup has to win by at
least one L2 cache access to be worth taking another cache line.

For Ivy Bridge, those numbers are 32 KB (L1 cache size) and 12 cycles
(L2 access latency).