Re: [PATCH v2 1/6] locks: fix unlock when fcntl_setlk races with a close
From: J. Bruce Fields
Date: Fri Jan 08 2016 - 11:22:13 EST
On Fri, Jan 08, 2016 at 11:21:01AM -0500, J. Bruce Fields wrote:
> On Fri, Jan 08, 2016 at 11:11:54AM -0500, Jeff Layton wrote:
> > On Fri, 8 Jan 2016 10:55:33 -0500
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> >
> > > On Fri, Jan 08, 2016 at 08:50:09AM -0500, Jeff Layton wrote:
> > > > Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
> > > > fires in locks_free_lock_context when the flc_posix list isn't empty.
> > > >
> > > > The problem turns out to be that we're basically rebuilding the
> > > > file_lock from scratch in fcntl_setlk when we discover that the setlk
> > > > has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
> > > > then we may end up with fl_start and fl_end values that differ from
> > > > when the lock was initially set, if the file position or length of the
> > > > file has changed in the interim.
> > > >
> > > > Fix this by just reusing the same lock request structure, and simply
> > > > overriding the fl_type value with F_UNLCK as appropriate. That ensures
> > > > that we really are unlocking the lock that was initially set.
> > >
> > > You could also just do a whole-file unlock, couldn't you? That would
> > > seem less confusing to me. But maybe I'm missing something.
> > >
> > > --b.
> > >
> >
> > I considered that too...but I was thinking that might make things even
> > worse. Consider:
> >
> > Thread1                         Thread2
> > ----------------------------------------------------------------------------
> > fd1 = open(...);
> > fd2 = dup(fd1);
> >                                 fcntl(fd2, F_SETLK);
> >                                 (Here we call fcntl, and lock is set, but
> >                                  task gets scheduled out before fcheck)
> > close(fd2)
> > fcntl(fd1, F_SETLK...);
> >                                 Task scheduled back in, does fcheck for fd2
> >                                 and finds that it's gone. Removes the lock
> >                                 that Thread1 just set.
> >
> > If we just unlock the range that was set then Thread1 won't be affected
> > if his lock doesn't overlap Thread2's.
> >
> > Is that better or worse? :)
> >
> > TBH, I guess all of this is somewhat academic. If you're playing with
> > traditional POSIX locks and threads like this, then you really are
> > playing with fire.
> >
> > We should try to fix that if we can though...
>
> Yeah. I almost think an OK interim solution would be just to document
> the race in the appropriate man page and tell people that if they really
> want to use posix locks in an application with lots of threads sharing
> file descriptors then they should consider OFD locks.
(Especially if this race has always existed.)
--b.
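
For reference, a rough userspace sketch of the OFD alternative suggested
above. The helper names (ofd_setlk_whole_file, ofd_close_demo) are made up
for illustration, and it assumes Linux 3.15 or later with a libc that
exposes F_OFD_SETLK. The point it tries to show: an OFD lock belongs to
the open file description, so closing a dup'd descriptor does not drop
it, which is not the case for a traditional POSIX lock.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37	/* asm-generic value, in case libc headers lack it */
#endif

/* Hypothetical helper: set or clear a whole-file OFD lock on fd. */
static int ofd_setlk_whole_file(int fd, short type)
{
	struct flock fl;

	assert(type == F_RDLCK || type == F_WRLCK || type == F_UNLCK);

	memset(&fl, 0, sizeof(fl));	/* l_pid must be 0 for OFD commands */
	fl.l_type = type;
	fl.l_whence = SEEK_SET;		/* absolute range, no file-position dependence */
	fl.l_start = 0;
	fl.l_len = 0;			/* 0 means "through end of file" */

	return fcntl(fd, F_OFD_SETLK, &fl);
}

/*
 * Hypothetical demo: lock via a dup'd fd, close that fd, then check that
 * a separate open() of the same file still cannot take the lock.
 * Returns 0 if the lock survived the close, -1 otherwise.
 */
static int ofd_close_demo(void)
{
	char path[] = "/tmp/ofd-demo-XXXXXX";
	int fd1, fd2 = -1, other = -1, ret = -1;

	fd1 = mkstemp(path);
	if (fd1 < 0)
		return -1;

	fd2 = dup(fd1);			/* same open file description as fd1 */
	other = open(path, O_RDWR);	/* a different open file description */
	if (fd2 < 0 || other < 0)
		goto out;

	if (ofd_setlk_whole_file(fd2, F_WRLCK) != 0)
		goto out;

	close(fd2);			/* fd1 still references the description */
	fd2 = -1;

	/* The lock should still be held, so this attempt should conflict. */
	if (ofd_setlk_whole_file(other, F_WRLCK) == -1 &&
	    (errno == EAGAIN || errno == EACCES))
		ret = 0;
out:
	if (fd2 >= 0)
		close(fd2);
	if (other >= 0)
		close(other);
	close(fd1);
	unlink(path);
	return ret;
}
```

With traditional F_SETLK in place of F_OFD_SETLK, the close(fd2) would
release every lock the process holds on the file, which is the behaviour
that makes the close race in this thread possible in the first place.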