Re: [PATCH 2/2] perf/tools: put new buildid locks to use

From: Milos Vyletel
Date: Wed Jun 03 2015 - 07:27:27 EST


On Wed, Jun 03, 2015 at 01:21:41PM +0200, Ingo Molnar wrote:

>
> * Milos Vyletel <milos@xxxxxxxxxx> wrote:
>
> > On Thu, May 14, 2015 at 07:38:08PM +0200, Ingo Molnar wrote:
> >
> > >
> > > * Milos Vyletel <milos@xxxxxxxxxx> wrote:
> > >
> > > > On Thu, May 14, 2015 at 01:38:21PM +0200, Jiri Olsa wrote:
> > > >
> > > > > On Thu, May 14, 2015 at 12:40:59PM +0200, Ingo Molnar wrote:
> > > > > >
> > > > > > * Milos Vyletel <milos@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > > Use new read/write locks when accesing buildid directory on places where
> > > > > > > we may race if multiple instances are run simultaneously.
> > > > > >
> > > > > > Dunno, this will create locking interaction between multiple instances
> > > > > > of perf - hanging each other, etc.
> > > > > >
> > > > > > And it seems unnecessary: the buildid hierarchy is already spread out.
> > > > > > What kind of races might there be?
> > > > >
> > > > > there was just recently one fixed by commit:
> > > > > 0635b0f71424 perf tools: Fix race in build_id_cache__add_s()
> > > > >
> > > > > havent checked the final patch yet, but the idea is to
> > > > > protect us from similar bugs
> > > >
> > > > right. on top of race with EEXIST couple more are possible (EMLINK,
> > > > ENOSPC, EDQUOT, ENOMEM... the only way to prevent them all is to
> > > > lock this kind of operations and make sure we run one at a time.
> > >
> > > Yeah, so the race pointed out in 0635b0f71424 can be (and should be)
> > > fixed without locking:
> > >
> > > - first create the file under a process-private name under
> > > ~/.debug/tmp/ if the target does not exist yet
> > >
> > > - then fully fill it in with content
> > >
> > > - then link(2) it to the public target name, which VFS operation is
> > > atomic and may fail safely: at which point it got already created
> > > by someone else.
> > >
> > > - finally unlink() the private instance name and the target will now
> > > be the only instance left: either created by us, or by some other
> > > perf instance in the rare racy case.
> > >
> > > Since all of ~/.debug is on the same filesystem this should work fine.
> > >
> > > Beyond avoiding locking this approach has another advantage: it's
> > > transaction safe, so a crashed/interrupted perf instance won't corrupt
> > > the debug database, it will only put fully constructed files into the
> > > public build-id namespace. It at most leaves a stale private file
> > > around in ~/.debug/tmp/.
> > >
> >
> > Ingo,
> >
> > I finally found some time to make this change. While going over the code I've
> > noticed one thing that would make concurrent creation even easier to solve.
> > Instead of copying the file to temp file what about simply opening file with
> > O_CREAT|O_EXCL? creat itself
> >
> > "creat() is equivalent to open() with flags equal to O_CREAT|O_WRONLY|O_TRUNC."
> >
> > addition of O_EXCL would
> >
> > "Ensure that this call creates the file: if this flag is specified in
> > conjunction with O_CREAT, and pathname already exists, then open() will fail."
> >
> > This we would prevent truncation of already linked file in case link() races as
> > in 0635b0f71424. What do you think?
>
> But it would not prevent the problem of creating a not yet fully constructed file
> - which some other tool invocation could attempt to parse in an incomplete
> fashion.
>
> Using create+link+unlink avoids that race, the files in the publicly visible
> namespace will always be fully constructed by the time they are made visible
> (atomically).
>

Got it. Will use the approach proposed by you.

Milos
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/