Re: [ltt-dev] [RFC git tree] Userspace RCU (urcu) for Linux (repost)

From: Mathieu Desnoyers
Date: Mon Feb 09 2009 - 14:15:45 EST


* Mathieu Desnoyers (compudj@xxxxxxxxxxxxxxxxxx) wrote:
> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > On Mon, Feb 09, 2009 at 10:37:42AM -0800, Paul E. McKenney wrote:
> > > On Mon, Feb 09, 2009 at 01:13:41PM -0500, Mathieu Desnoyers wrote:
> > > > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> >
> > [ . . . ]
> >
> > > > You know what ? Changing RCU_GP_CTR_BIT to 16 uses a
> > > > testw %ax, %ax instead of a testb %al, %al. The trick here is that
> > > > RCU_GP_CTR_BIT must be a multiple of 8 so we can use a full 8-bits,
> > > > 16-bits or 32-bits bitmask for the lower order bits.
> > > >
> > > > On 64-bits, using a RCU_GP_CTR_BIT of 32 is also ok. It uses a testl.
> > > >
> > > > To provide 32-bits compatibility and allow the deepest nesting possible, I
> > > > think it makes sense to use
> > > >
> > > > /* Use the amount of bits equal to half of the architecture long size */
> > > > #define RCU_GP_CTR_BIT (sizeof(long) << 2)
> > >
> > > You lost me on this one:
> > >
> > > sizeof(long) << 2 = 0x10
> > >
> > > I could believe the following (run on a 32-bit machine):
> > >
> > > 1 << (sizeof(long) * 8 - 1) = 0x80000000
> > >
> > > Or, if you were wanting to use a bit halfway up the word, perhaps this:
> > >
> > > 1 << (sizeof(long) * 4 - 1) = 0x8000
> > >
> > > Or am I confused?
> >
> > Well, I am at least partly confused. You were wanting a low-order bit,
> > so you want to lose the "- 1" above. Here are some of the possibilities:
> >
> > sizeof(long)                  = 0x4
> > sizeof(long) << 2             = 0x10
> > 1 << (sizeof(long) * 8 - 1)   = 0x80000000
> > 1 << (sizeof(long) * 4)       = 0x10000
> > 1 << (sizeof(long) * 4 - 1)   = 0x8000
> > 1 << (sizeof(long) * 2)       = 0x100
> > 1 << (sizeof(long) * 2 - 1)   = 0x80
> >
> > My guess is that 1 << (sizeof(long) * 4) and 1 << (sizeof(long) * 2)
> > are of the most interest.
> >
>
> Exactly. I'll change it to :
>
> #define RCU_GP_CTR_BIT (1 << (sizeof(long) << 2))
>
> I somehow thought this define was used as a bit number rather than the
> bit mask.
>
> Thanks,
>
> Mathieu
>
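
For illustration, the bit-number versus bit-mask mix-up discussed above can
be sketched as follows. The names GP_CTR_BIT_NUMBER, GP_CTR_BIT_MASK,
GP_CTR_NEST_MASK, nest_count_is_zero() and check_layout() are made up for
this example and are not taken from the urcu tree. The 1UL constant is also
an assumption of the sketch: it makes the shift happen in unsigned long,
since a plain int 1 shifted by 32 would be undefined on 64-bit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */

/* A bit *number*: 16 when long is 4 bytes, 32 when long is 8 bytes. */
#define GP_CTR_BIT_NUMBER ((size_t)(sizeof(long) << 2))

/* A bit *mask*: 0x10000 when long is 4 bytes, 0x100000000 when 8 bytes. */
#define GP_CTR_BIT_MASK   (1UL << GP_CTR_BIT_NUMBER)

/* Everything below the mask bit: the low-order half of the word. */
#define GP_CTR_NEST_MASK  (GP_CTR_BIT_MASK - 1)

/* Nonzero when the low-order half of v is all zeroes. */
static int nest_count_is_zero(unsigned long v)
{
	return (v & GP_CTR_NEST_MASK) == 0;
}

/* Sanity checks on the layout described in the thread. */
static void check_layout(void)
{
	/* The mask bit sits just above the low-order half. */
	assert((GP_CTR_BIT_MASK & GP_CTR_NEST_MASK) == 0);
	/* The bit number is a multiple of 8, so the low half is whole bytes. */
	assert(GP_CTR_BIT_NUMBER % 8 == 0);
}
```

Because the low-order half is a whole number of bytes, the compiler is free
to test it with a narrower instruction instead of masking a full word.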

It's pushed in the git tree. I also removed an increment in the fast
path by initializing urcu_gp_ctr to RCU_GP_COUNT.

It brings the benchmark to:

Time per read: 6.87183 to 7.25318 cycles

So we seem to save about half a cycle to a cycle with this.
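
A rough, single-threaded sketch of why initializing urcu_gp_ctr to
RCU_GP_COUNT saves an increment is given below. It is only an illustration,
not the code in the git tree: the value 1 for RCU_GP_COUNT, the 1UL in
RCU_GP_CTR_BIT, the RCU_GP_CTR_NEST_MASK name, the active_readers variable
and the sketch_*() functions are all assumptions, and the memory barriers
and per-thread storage a real implementation needs are left out.

```c
#include <assert.h>

/* Assumed values and names, for illustration only. */
#define RCU_GP_COUNT         (1UL << 0)
#define RCU_GP_CTR_BIT       (1UL << (sizeof(long) << 2))
#define RCU_GP_CTR_NEST_MASK (RCU_GP_CTR_BIT - 1)

/* The global counter starts with a nesting count of one already in it. */
static unsigned long urcu_gp_ctr = RCU_GP_COUNT;

/* Would be per-thread in real code; a plain static keeps the sketch short. */
static unsigned long active_readers;

static void sketch_read_lock(void)
{
	unsigned long tmp = active_readers;

	if (!(tmp & RCU_GP_CTR_NEST_MASK))
		/* Outermost lock: copying the global value both snapshots the
		 * parity bit and sets the nesting count to one, no add needed. */
		active_readers = urcu_gp_ctr;
	else
		/* Nested lock: only bump the nesting count. */
		active_readers = tmp + RCU_GP_COUNT;
}

static void sketch_read_unlock(void)
{
	/* Unlock without a matching lock would underflow the count. */
	assert(active_readers & RCU_GP_CTR_NEST_MASK);
	active_readers -= RCU_GP_COUNT;
}

/* Writer side of the sketch: flip the parity bit of the global counter. */
static void sketch_flip_parity(void)
{
	urcu_gp_ctr ^= RCU_GP_CTR_BIT;
}
```

With the global counter starting at zero, the outermost path would have to
store urcu_gp_ctr + RCU_GP_COUNT instead, which is the extra add on the
fast path.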

Mathieu


>
>
> > Thanx, Paul
> >
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68