Re: [RFC v1 0/5] fs/locks: Use plain percpu spinlocks instead of lglock to protect file_lock

From: Jeff Layton
Date: Mon Mar 02 2015 - 19:29:18 EST


On Mon, 2 Mar 2015 13:58:17 +0100
Daniel Wagner <daniel.wagner@xxxxxxxxxxxx> wrote:

> On 02/27/2015 04:30 PM, Jeff Layton wrote:
> > On Fri, 27 Feb 2015 16:01:30 +0100
> > Daniel Wagner <daniel.wagner@xxxxxxxxxxxx> wrote:
> >> On 02/24/2015 10:06 PM, Jeff Layton wrote:
> >>> On Tue, 24 Feb 2015 16:58:26 +0100
> >>> Daniel Wagner <daniel.wagner@xxxxxxxxxxxx> wrote:
> >>>> On 02/20/2015 05:05 PM, Andi Kleen wrote:
> >>>>> Daniel Wagner <daniel.wagner@xxxxxxxxxxxx> writes:
> >>>>>>
> >>>>>> I am looking at how to get rid of lglock. The reason being that -rt is
> >>>>>> not too happy with that lock, especially since it uses arch_spinlock_t and
> >>>>>
> >>>>> AFAIK it could just use a normal spinlock. Have you tried that?
> >>>>
> >>>> I have tried it. At least fs/locks.c didn't blow up. The benchmark
> >>>> results (lockperf) indicated that using normal spinlocks is even
> >>>> slightly faster. Simply converting felt like cheating, though it might
> >>>> be necessary for the other user (kernel/stop_machine.c). Currently it
> >>>> looks like there is some additional benefit to getting rid of lglock
> >>>> in fs/locks.c.
> >>>>
> >>>
> >>> What would that benefit be?
> >>>
> >>> lglocks are basically percpu spinlocks. Fixing some underlying
> >>> infrastructure that provides that seems like it might be a better
> >>> approach than declaring them "manually" and avoiding them altogether.
> >>>
> >>> Note that you can still do basically what you're proposing here with
> >>> lglocks as well. Avoid using lg_global_* and just lock each one in
> >>> turn.
> >>
> >> Yes, that was what I was referring to as the benefit. My main point is
> >> that with only lg_local_* calls remaining, we could just as well use
> >> normal spinlocks. No need to get fancy.
> >>
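(To make that concrete, here is a rough, untested sketch of the mapping
being discussed -- the names below are made up for illustration, not
necessarily what the patches use:)

#include <linux/percpu.h>
#include <linux/spinlock.h>

/* was: DEFINE_STATIC_LGLOCK(file_lock_lglock); */
static DEFINE_PER_CPU(spinlock_t, file_lock_lock);

/*
 * lg_local_lock(&file_lock_lglock) is essentially
 * spin_lock(this_cpu_ptr(&file_lock_lock)), provided the caller stays
 * on one CPU so it unlocks the same lock it took.
 *
 * lg_global_lock(&file_lock_lglock) becomes "lock each one in turn".
 * (Lockdep may not love this nesting on large machines, which is
 * presumably part of why lglock reaches for arch_spinlock_t.)
 */
static void file_lock_lock_all_cpus(void)
{
        int cpu;

        for_each_possible_cpu(cpu)
                spin_lock(per_cpu_ptr(&file_lock_lock, cpu));
}

static void file_lock_unlock_all_cpus(void)
{
        int cpu;

        for_each_possible_cpu(cpu)
                spin_unlock(per_cpu_ptr(&file_lock_lock, cpu));
}
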
> >
> > Sure, but the lg_lock wrappers are a nice abstraction for this. I don't
> > think that we gain much by eliminating them. Changing the lglock code
> > to use normal spinlocks would also have the benefit of fixing up the
> > other user of that code.
>
> Obviously, you only need lglock if you take all the locks at once. As
> pointed out, accessing /proc/locks is not something that happens very
> often. My hope was to get a bigger box in time to measure how expensive
> such an operation could get when many cores are involved. On my small
> system, there is no real gain or loss from this change.
>
> >>> That said, now that I've thought about this, I'm not sure that's really
> >>> something we want to do when accessing /proc/locks. If you lock each
> >>> one in turn, then you aren't freezing the state of the file_lock_list
> >>> percpu lists. Won't that mean that you aren't necessarily getting a
> >>> consistent view of the locks on those lists when you cat /proc/locks?
> >>
> >> Maybe I am overlooking something here, but I don't see a consistency
> >> problem. We list a blocker and all its waiters in one go, since only
> >> the blocker is added to file_lock_list and the waiters are added to
> >> the blocker's fl_block list.
> >>
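(For reference, the show side already prints a blocker and then walks its
fl_block list -- roughly this, quoting fs/locks.c from memory, so details
may be slightly off:)

static int locks_show(struct seq_file *f, void *v)
{
        struct locks_iterator *iter = f->private;
        struct file_lock *fl, *bfl;

        /* v is an entry on one of the percpu file_lock_list hlists */
        fl = hlist_entry(v, struct file_lock, fl_link);

        lock_get_status(f, fl, iter->li_pos, "");

        /* the waiters hang off the blocker's fl_block list */
        list_for_each_entry(bfl, &fl->fl_block, fl_block)
                lock_get_status(f, bfl, iter->li_pos, " ->");

        return 0;
}
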
> >
> > Other locking activity could be happening at the same time. For
> > instance, between when you drop one CPU's spinlock and pick up another,
> > the lock that you just printed could be acquired by another thread on
> > another CPU and then you go print it again. Now you're showing
> > conflicting locks in /proc/locks output.
>
> Hmm, are you sure about that? I read the code as saying that when a lock
> is added to the percpu file_lock_list it stays on that CPU. The locks are
> not moved from one CPU's list to another during their existence.
>

Yes, I'm sure. When a file lock is acquired, we assign fl_link_cpu to
the current CPU and add the lock to that CPU's global list. When the
lock is released, any blocked lock that was waiting on it can acquire
it at that point, and that doesn't necessarily happen on the same CPU
where the lock was originally held.

So, it's entirely possible that between when you drop the spinlock on
one CPU and pick it up on another, the lock could have been released
and then reacquired on a different CPU.
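
To illustrate, the insert/delete helpers in fs/locks.c look roughly like
this (quoting from memory, details may be slightly off):

static void locks_insert_global_locks(struct file_lock *fl)
{
        lg_local_lock(&file_lock_lglock);
        /* remember which CPU's list this lock went onto */
        fl->fl_link_cpu = smp_processor_id();
        hlist_add_head(&fl->fl_link, this_cpu_ptr(&file_lock_list));
        lg_local_unlock(&file_lock_lglock);
}

static void locks_delete_global_locks(struct file_lock *fl)
{
        if (hlist_unhashed(&fl->fl_link))
                return;
        /* ...and delete it from that same CPU's list later */
        lg_local_lock_cpu(&file_lock_lglock, fl->fl_link_cpu);
        hlist_del_init(&fl->fl_link);
        lg_local_unlock_cpu(&file_lock_lglock, fl->fl_link_cpu);
}

So a given entry never migrates between the percpu lists, but when a
waiter is granted the lock, the new entry is inserted on whichever CPU
the waiter happens to be running on.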

> > Is that a real problem? I've no idea -- we don't have a lot of guidance
> > for what sort of atomicity /proc/locks needs, but that seems wrong to
> > me.
>
> During the timeframe in which all the locks are being taken, locks can
> still be created or destroyed on those CPUs whose spinlock has not been
> taken yet. I don't know if I would even use the word 'atomicity' here,
> since any short-lived process is likely to be missed.
>

But they can't be added to or removed from the list. The fact that all
of the percpu locks are held prevents that.
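
That's what the current seq_file hooks rely on -- roughly the following,
again quoting fs/locks.c from memory:

static void *locks_start(struct seq_file *f, loff_t *pos)
{
        struct locks_iterator *iter = f->private;

        iter->li_pos = *pos + 1;
        /* freeze every CPU's file_lock_list... */
        lg_global_lock(&file_lock_lglock);
        /* ...and the blocked-waiter state */
        spin_lock(&blocked_lock_lock);
        return seq_hlist_start_percpu(&file_lock_list, &iter->li_cpu, *pos);
}

static void locks_stop(struct seq_file *f, void *v)
{
        spin_unlock(&blocked_lock_lock);
        lg_global_unlock(&file_lock_lglock);
}

Nothing can be added to or removed from any of the percpu lists until
locks_stop() runs.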

> Even a busy loop reading /proc/locks is likely to miss the flock02
> processes, for example.
>
> My point is that you do not really gain anything from taking all the
> locks before iterating over the percpu file_lock_list versus
> locking/unlocking while iterating.
>

Yes, you do. You gain consistency. The info presented will represent
the state of the file locks on the system at a particular point in
time. If you take the locks one at a time, that's not necessarily the
case.

> > I also just don't see much benefit in optimizing /proc/locks access.
>
> FWIW, we could avoid taking a non-scaling lock.
>

In this case (as in most others) I think correctness trumps
performance. /proc/locks would be pretty worthless if we couldn't count
on it presenting consistent information about the state of file locks.

> > That's only done very rarely under most workloads. Locking all of
> > the spinlocks when you want to read from it sounds 100% fine to me
> > and that may help prevent these sorts of consistency problems.
>
> If they exist :)
>
> > It also has the benefit of keeping the /proc/locks seqfile code
> > simpler.
>
> The resulting code is almost the same. The locking part is hidden in
> seq_hlist_start_percpu_locked() and friends.
>
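(For anyone following along, the existing helper in fs/seq_file.c looks
roughly like this, from memory; I'm assuming the proposed *_locked
variant wraps the per-CPU walk in the corresponding CPU's spinlock:)

struct hlist_node *seq_hlist_start_percpu(struct hlist_head __percpu *head,
                                          int *cpu, loff_t pos)
{
        struct hlist_node *node;

        /* walk each CPU's hlist in turn until position 'pos' is reached */
        for_each_possible_cpu(*cpu) {
                hlist_for_each(node, per_cpu_ptr(head, *cpu)) {
                        if (pos-- == 0)
                                return node;
                }
        }
        return NULL;
}
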
> >>> I think having a consistent view there might trump any benefit to
> >>> performance. Reading /proc/locks is a *very* rare activity in the
> >>> big scheme of things.
> >>
> >> I agree, but I hope I got the consistency argument right and that
> >> there shouldn't be a problem.
> >>
> >>> I do however like the idea of moving more to be protected by the
> >>> lglocks, and minimizing usage of the blocked_lock_lock.
> >>
> >> Good to hear. I am trying to write a new test case (a variation of
> >> the dining philosophers problem) which benchmarks blocked_lock_lock
> >> after the refactoring.
> >>
> >
> > Sounds good. I may go ahead and pick up the first couple of patches
> > and queue them for v4.1 since they seem like reasonable cleanups.
> > I'll let you know once I've done that.
>
> Great.
>
> cheers,
> daniel


--
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>