Re: [PATCH v1] perf record: Fix event fd races
From: Ian Rogers
Date: Mon Oct 24 2022 - 01:33:54 EST
On Sun, Oct 23, 2022 at 7:56 PM Leo Yan <leo.yan@xxxxxxxxxx> wrote:
>
> Hi Ian,
>
> On Sun, Oct 23, 2022 at 06:10:24PM -0700, Ian Rogers wrote:
> > The write call may set errno which is problematic if occurring in a
> > function also setting errno. Save and restore errno around the write
> > call.
> >
> > done_fd may be used after close, clear it as part of the close and
> > check its validity in the signal handler.
> >
> > Suggested-by: Greg Thelen <gthelen@xxxxxxxxxx>
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > tools/perf/builtin-record.c | 41 ++++++++++++++++++++++---------------
> > 1 file changed, 25 insertions(+), 16 deletions(-)
> >
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 52d254b1530c..e128b855ddde 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -649,7 +649,7 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
> > static volatile int signr = -1;
> > static volatile int child_finished;
> > #ifdef HAVE_EVENTFD_SUPPORT
> > -static int done_fd = -1;
> > +static volatile int done_fd = -1;
>
> Here is a bit suspecious for adding volatile qualifier. See the
> document: process/volatile-considered-harmful.rst.
>
> I know the document is mainly for kernel programming, but seems to me
> it's also valid for C programming in userspace.
>
> I not sure what's the purpose for adding volatile for done_fd, if we
> really have concern for reading any stale value for done_fd, should we
> use WRITE_ONCE/READ_ONCE?
We could just switch to C11 and stdatomic. The volatile is consistent
with the code above and more consistent with the expectation of
writing to a variable that is read in a signal handler.
Thanks,
Ian
> The rest changes look good to me.
>
> Thanks,
> Leo
>
> > #endif
> >
> > static void sig_handler(int sig)
> > @@ -661,19 +661,24 @@ static void sig_handler(int sig)
> >
> > done = 1;
> > #ifdef HAVE_EVENTFD_SUPPORT
> > -{
> > - u64 tmp = 1;
> > - /*
> > - * It is possible for this signal handler to run after done is checked
> > - * in the main loop, but before the perf counter fds are polled. If this
> > - * happens, the poll() will continue to wait even though done is set,
> > - * and will only break out if either another signal is received, or the
> > - * counters are ready for read. To ensure the poll() doesn't sleep when
> > - * done is set, use an eventfd (done_fd) to wake up the poll().
> > - */
> > - if (write(done_fd, &tmp, sizeof(tmp)) < 0)
> > - pr_err("failed to signal wakeup fd, error: %m\n");
> > -}
> > + if (done_fd >= 0) {
> > + u64 tmp = 1;
> > + int orig_errno = errno;
> > +
> > + /*
> > + * It is possible for this signal handler to run after done is
> > + * checked in the main loop, but before the perf counter fds are
> > + * polled. If this happens, the poll() will continue to wait
> > + * even though done is set, and will only break out if either
> > + * another signal is received, or the counters are ready for
> > + * read. To ensure the poll() doesn't sleep when done is set,
> > + * use an eventfd (done_fd) to wake up the poll().
> > + */
> > + if (write(done_fd, &tmp, sizeof(tmp)) < 0)
> > + pr_err("failed to signal wakeup fd, error: %m\n");
> > +
> > + errno = orig_errno;
> > + }
> > #endif // HAVE_EVENTFD_SUPPORT
> > }
> >
> > @@ -2834,8 +2839,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> >
> > out_delete_session:
> > #ifdef HAVE_EVENTFD_SUPPORT
> > - if (done_fd >= 0)
> > - close(done_fd);
> > + if (done_fd >= 0) {
> > + fd = done_fd;
> > + done_fd = -1;
> > +
> > + close(fd);
> > + }
> > #endif
> > zstd_fini(&session->zstd_data);
> > perf_session__delete(session);
> > --
> > 2.38.0.135.g90850a2211-goog
> >