Re: RFC: remove __read_mostly
From: Andi Kleen
Date: Mon Dec 17 2007 - 07:20:41 EST

On Mon, Dec 17, 2007 at 03:07:43AM -0800, Andrew Morton wrote:
> On Mon, 17 Dec 2007 11:53:36 +0100 Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:
>
> > On Mon, 17 Dec 2007 02:33:39 -0800
> > Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Fri, 14 Dec 2007 01:33:45 +0100 Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> > >
> > > > Kyle McMartin <kyle@xxxxxxxxxxx> writes:
> > > >
> > > > > I'd bet, in the __read_mostly case at least, that there's no
> > > > > improvement in almost all cases.
> > > >
> > > > I bet you're wrong. Cache line behaviour is critical, much more
> > > > than pipeline behaviour (which is what unlikely() affects). That is
> > > > because if you eat a cache miss it gets really expensive, while e.g.
> > > > a mispredicted jump is relatively cheap in comparison. We're talking
> > > > one or more orders of magnitude.
> > >
> > > So... once we've moved all read-mostly variables into __read_mostly, what
> > > is left behind in bss?
> > >
> > > All the write-often variables. All optimally packed together to nicely
> > > maximise cacheline sharing.
> >
> > This is why it's important to group related variables together, so that
> > they share the same cacheline.
>
> Not feasible. The linker is (I believe) free to place them anywhere it
> likes unless we go and aggregate them in a struct.

It won't normally, though (they are laid out linearly within each object
file), and if you really want to make sure, you can always use a special
section.

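To make the two options concrete, here is a minimal sketch, not kernel code. It assumes GCC or Clang on an ELF target and a 64-byte cache line; the names rx_stats, __hot_tx, .data.hot_tx, account_rx and account_tx are made up for illustration. The struct variant guarantees member order and, with the alignment attribute, that both counters sit in one line. The section variant only keeps the variables next to each other in the output and promises nothing about alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real size is CPU-specific. */
#define CACHELINE_BYTES 64

/*
 * Option 1: aggregate related write-often counters in a struct.
 * C keeps members in declaration order, and the aligned attribute
 * (a GCC/Clang extension) starts the object on a line boundary.
 */
struct rx_stats {
	unsigned long packets;
	unsigned long bytes;
} __attribute__((aligned(CACHELINE_BYTES)));

_Static_assert(sizeof(struct rx_stats) <= CACHELINE_BYTES,
	       "rx_stats must fit in one cache line");

struct rx_stats rx;

/*
 * Option 2: put related variables in a dedicated section.
 * ".data.hot_tx" is a hypothetical name. This groups the variables
 * in the object file but does not align them to a cache line.
 */
#define __hot_tx __attribute__((section(".data.hot_tx")))

unsigned long tx_packets __hot_tx = 0;
unsigned long tx_bytes __hot_tx = 0;

/* Returns 1 if both addresses fall into the same cache line. */
int same_cacheline(const void *a, const void *b)
{
	assert(a != NULL && b != NULL);
	return (uintptr_t)a / CACHELINE_BYTES ==
	       (uintptr_t)b / CACHELINE_BYTES;
}

/* Both counters are updated together, so they should share a line. */
void account_rx(unsigned long len)
{
	rx.packets++;
	rx.bytes += len;
}

void account_tx(unsigned long len)
{
	tx_packets++;
	tx_bytes += len;
}
```

same_cacheline(&rx.packets, &rx.bytes) holds by construction here. For the section variant the same check may or may not hold, depending on where the section ends up, so an explicit alignment would still be needed to be sure.
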
> The insidious thing about this is that it is highly dependent upon
> compiler/linker version and upon kernel config. So performance differences

I'm not aware of the order of global variables changing that much.
The linker in particular seems to keep it rather stable.

> end up with a better result: all those read-mostly, read-rarely variables (and
> there are a lot of those) could be very usefully deployed by packing them
> in between the write-often variables.
>
> It's crying out for a performance-guided solution.

My problem with profile feedback is that it will make it impossible
to ever recreate kernel binaries after the fact. So decoding of random
oopses would become much harder.

-Andi