Re: [PATCH 1b/7] dlm: core locking

From: Daniel Phillips
Date: Sat Apr 30 2005 - 06:14:17 EST


On Saturday 30 April 2005 06:32, Lars Marowsky-Bree wrote:
> > > If it is, what prevents GFS from getting a DLM lock granted and writing
> > > to the shared storage before the node that previously had it is fenced?
> >
> > In my opinion, using the dlm to protect the shared storage resource
> > constitutes tackling the problem far too high up on the food chain.
>
> Au contraire. Something pretty high up the food chain needs to decide
> resource ownership / access permissions.

You confused "high up the food chain" with "being intelligent". Nothing
prevents your oldest cluster node from making complex decisions, or calling
into play other services in order to make those decision. It could, for
example, invoke some server on some other node to make the actual decision.

The important thing is to always have an easy _starting point_ for making
decisions. I do not think that dlm-based algorithms are suitable in that
regard, because of the unavoidable code complexity just to kick off the
process. And obviously, there already is some reliable starting point or
cman would not work. So let's just expose that and have a better cluster
stack.

> > > PS if an application is writing to local storage, what does it need
> > > a DLM for?
> >
> > Good instinct. In fact, as I've said before, you don't necessarily
> > need a dlm in a cluster application at all. What you need is _global
> > synchronization_, however that is accomplished. For example, I have
> > found it simpler and more efficient to use network messaging for the
> > cluster applications I've tackled so far. This suggests to me that
> > the dlm is going to end up pretty much as a service needed only by a
> > cfs, and not much else. The corollary of that is, we should
> > concentrate on making the dlm work well for the cfs, and not get too
> > wrapped up in trying to make it solve every global synchronization
> > problem in the world.
>
> This isn't quite true. It _is_ true that essentially every locking /
> coordination mechanism can be mapped to every other one: locks can be
> implemented via barriers, barriers via locks, locks via messages, global
> ordering via barriers, voting via locks, locks via voting, and then
> there's all the different possible kinds of locks... And internally,
> they might well share certain code paths. (Ie, a DLM electing a
> transition master for lockspace recovery internally via voting, because
> obviously _it_ can't use itself to pull itself out of the mess ;-)

But note that it _can_ use the oldest cluster member as a recovery master, or
to designate a recovery master. It can, and should - there is no excuse for
making this any more complex than it needs to be.

> However, for most problems, there's one approach which makes the problem
> you're dealing with much easier to express, and potentially more
> efficient.

True. My point is that the sweet spot for cluster synchronization is, in my
experience, usually _not_ the dlm.

> A good cluster suite should offer support for barriers, locking, voting
> and (group) messaging with extended virtual synchrony, so that the
> problems can be addressed correctly. Compare how the kernel offers more
> than one form for synchronization on SMP systems: semaphores,
> spinlocks, RCU, atomic operations...

All agreed, except that voting and group messaging should be provided as
separate services in the style of plugins, on which the base cluster
infrastructure does not depend.

OK, once we have found a way to plug in voting and whiz-bang messaging, I
think we have discovered the mythical "vcs", the virtual cluster switch. It
doesn't need to be in-kernel though.

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/