Re: [tip:core/rcu] Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"

From: Paul E. McKenney
Date: Wed May 25 2011 - 16:48:59 EST


On Wed, May 25, 2011 at 09:24:06AM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > On Tue, May 24, 2011 at 05:13:06PM -0700, Yinghai Lu wrote:
> > > On 05/24/2011 05:05 PM, Paul E. McKenney wrote:
> > > > On Tue, May 24, 2011 at 02:23:45PM -0700, Yinghai Lu wrote:
> > > >> On 05/23/2011 06:35 PM, Paul E. McKenney wrote:
> > > >>> On Mon, May 23, 2011 at 06:26:23PM -0700, Yinghai Lu wrote:
> > > >>>> On 05/23/2011 06:18 PM, Paul E. McKenney wrote:
> > > >>>>
> > > >>>>> OK, so it looks like I need to get this out of the way in order to track
> > > >>>>> down the delays. Or does reverting PeterZ's patch get you a stable
> > > >>>>> system, but with the longish delays in memory_dev_init()? If the latter,
> > > >>>>> it might be more productive to handle the two problems separately.
> > > >>>>>
> > > >>>>> For whatever it is worth, I do see about 5% increase in grace-period
> > > >>>>> duration when switching to kthreads. This is acceptable -- your
> > > >>>>> 30x increase clearly is completely unacceptable and must be fixed.
> > > >>>>> Other than that, the main thing that affects grace period duration is
> > > >>>>> the setting of CONFIG_HZ -- the smaller the HZ value, the longer the
> > > >>>>> grace-period duration.
> > > >>>>
> > > >>>> for my 1024g system when memory hotadd is enabled in kernel config:
> > > >>>> 1. current linus tree + tip tree: memory_dev_init will take about 100s.
> > > >>>> 2. current linus tree + tip tree + your tree - Peterz patch:
> > > >>>> a. on fedora 14 gcc: will cost about 4s: like old times
> > > >>>> b. on opensuse 11.3 gcc: will cost about 10s.
> > > >>>
> > > >>> So some patch in my tree that is not yet in tip makes things better?
> > > >>>
> > > >>> If so, could you please see which one? Maybe that would give me a hint
> > > >>> that could make things better on opensuse 11.3 as well.
> > > >>
> > > >> today's tip:
> > > >>
> > > >> [ 31.795597] cpu_dev_init done
> > > >> [ 40.930202] memory_dev_init done
> > > >
> > > > One other question... What is memory_dev_init() doing to wait for so
> > > > many RCU grace periods? (Yes, I do need to fix the slowdowns in any
> > > > case, but I am curious.)
> > >
> > > looks like it register some in sysfs
> >
> > Use of synchronize_rcu() for unregistering would make sense, but
> > I don't understand why it is needed when registering.
>
> I guess writing a patch to remove it would be welcome by the sysfs folks - or
> some subtle reason would be pointed out (which reason could thus be added to
> the code in a comment).
>
> Understanding the nondeterminism of grace periods would be extremely nice
> though, there *are* workloads that use rcu syncs rather frequently, and we have
> probably regressed them.

Agreed. If I can help people speed up sysfs creation, that would be good,
but avoiding or fixing RCU grace-period performance regressions is also a
good thing.

Thanx, Paul