Re: [PATCH] document optimizing macro for translating PROT_ to VM_bits

From: Maciej Zenczykowski
Date: Mon Sep 29 2003 - 12:38:04 EST


> > > +/* Optimisation macro, used to be defined as: */
> > > +/* ((bit1 == bit2) ? (x & bit1) : (x & bit1) ? bit2 : 0) */
> > > +/* but this version is faster */
> > > +/* "check if bit1 is on in 'x'. If it is, return bit2" */
> > > #define _calc_vm_trans(x,bit1,bit2) \
> > > ((bit1) <= (bit2) ? ((x) & (bit1)) * ((bit2) / (bit1)) \
> > > : ((x) & (bit1)) / ((bit1) / (bit2)))
> >
> > I agree with the intent of that comment, but the code in it is
> > unnecessarily complex. See if you like this, and if you do feel free
> > to submit it as a patch:
> >
> > /* Optimisation macro. It is equivalent to:
> > (x & bit1) ? bit2 : 0
> > but this version is faster. ("bit1" and "bit2" must be single bits). */
>
> Is this supposed to return the bitmask bit2, or (x & bit2)? If the former,
> then your code is right. If the latter, (x & bit1) ? (x & bit2) : 0
>
> I'm totally failing to see why the original did the bit1 == bit2 compare,
> so maybe mhyself and Jamie are both missing some subtlety?

I'd guess since the majority of the time bit1 and bit2 are compile time
constants and the comparison can be resolved during compilation into more
effective code in the bit1==bit2 case, thus effectively decreasing not
increasing the amount of generated object code.

That probably also explains why the convoluted long code with division
above is faster - the division and multiplication most likely gets
resolved during compile (since we're dealing with compile time constants)
into a single shift and results in an execution of bitop 'and' followed by
'shift left/right', which due to lacking and conditional branches
(necessary for ?:) is usually significantly faster due to a lack of
branch mispredictions.

MaZe.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/