Re: [PATCH 04/10] perf record: Filter out POLLHUP'ed file descriptors

From: Arnaldo Carvalho de Melo
Date: Mon Sep 08 2014 - 10:33:43 EST


On Mon, Sep 08, 2014 at 04:04:54PM +0200, Jiri Olsa wrote:
> On Mon, Sep 08, 2014 at 10:46:16AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Sat, Sep 06, 2014 at 10:39:15PM +0200, Jiri Olsa wrote:
> > > On Fri, Sep 05, 2014 at 11:07:56AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Fri, Sep 05, 2014 at 11:42:59AM +0300, Adrian Hunter wrote:
> > > > > On 09/04/2014 06:19 PM, Arnaldo Carvalho de Melo wrote:
> > > > > > On Thu, Sep 04, 2014 at 03:32:08PM +0300, Adrian Hunter wrote:
> > > > > No, I meant something different. For example, 'perf record' opens a
> > > > > per-cpu event for 2 processes and gets 4 file descriptors:

> > > > >         task1  task2
> > > > >   cpu0  fd0    fd1
> > > > >   cpu1  fd2    fd3

> > > > > Now, perf record will mmap fd0 and fd2 and set-output fd1->fd0
> > > > > and fd3->fd2.

> > > > > pollfds includes only fd0 and fd2.

> > > > > But if task2 exits, the POLLHUP will appear on fd1 and fd3.

> > > > So? We are not interested in fd1 and fd3, since all our reading is done
> > > > on fd0 and fd2 mmaps, no?

> > > hm, what if task1 (fd0, fd2) exits first.. perf record will exit,
> > > but it still has to read task2..?

> > OK, what happens in that case, i.e. when the fds that were set up as the
> > ones to be polled get nuked: does the set-output just get undone? Or
> > does the mmap stand, receiving the events from the remaining fds, with
> > the polling notifications sent to, in this case, fd1 and fd3?

> the mmaps stay for fd1 and fd3.. and they get poll notifications as well,
> we just do not check/poll them now

So what you're saying is that we should have been polling all the fds
all the time?

Because after all we end up trying to consume everything in all the ring
buffers when just one of them gets a POLLIN anyway...
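
To make the polling behaviour concrete, here is a minimal sketch, not perf
code: it uses Python and ordinary pipes as stand-ins for perf event fds (an
assumption for illustration only), and the names filter_hup and demo are
hypothetical. It polls once, reports the fds that have data pending, and drops
from the poll set any fd that reports POLLHUP with nothing left to read.

```python
import os
import select


def filter_hup(poller, fds, timeout_ms=0):
    """Poll once. Return (readable, remaining).

    readable:  fds that reported POLLIN (data pending).
    remaining: fds still worth polling; an fd that reported POLLHUP or
               POLLERR without POLLIN is unregistered and dropped.
    """
    readable = []
    remaining = set(fds)
    for fd, revents in poller.poll(timeout_ms):
        if revents & select.POLLIN:
            # Data is pending: keep the fd even if the peer already hung up.
            readable.append(fd)
        elif revents & (select.POLLHUP | select.POLLERR):
            poller.unregister(fd)
            remaining.discard(fd)
    return readable, remaining


def demo():
    """Pipes stand in for event fds: one stays live, one loses its writer."""
    live_r, live_w = os.pipe()
    dead_r, dead_w = os.pipe()
    poller = select.poll()
    for fd in (live_r, dead_r):
        poller.register(fd, select.POLLIN)
    os.write(live_w, b"x")   # live fd has data to read
    os.close(dead_w)         # dead fd's writer is gone: POLLHUP on dead_r
    readable, remaining = filter_hup(poller, {live_r, dead_r})
    result = (readable == [live_r], remaining == {live_r})
    for fd in (live_r, live_w, dead_r):
        os.close(fd)
    return result
```

In this sketch an fd that has both pending data and a hangup is kept until it
has been drained, so nothing is lost before it is filtered out.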

> > I'll look at the kernel code for that...
> >
> > > > I.e. when we ask the kernel to point fd B to fd A's mmap (what you
> > > > called set-output) and fd B inserts an event into fd A's mmap ring
> > > > buffer, we get fd A's poll returning POLLIN, no?
> >
> > > > Have to check... Otherwise we would have to poll all fds all the time,
> > > > not just the mmapped ones, right?
> >
> > > > > I think Jiri's patchset changed pollfds to include all fds for that reason.
> >
> > > hm, I did not think of that.. ;-) I needed finer-grained feedback
> > > for future features like cpu hotplug
> >
> > So this is good for something you weren't trying to fix (and document),
> > and good for something that may be nice in the future? Grumpf, we
> > already have way too much stuff that will eventually be used but is not
> > used right now :-\

> IMO it's clearer to poll on all event FDs.. and now with the
> case Adrian described it seems necessary anyway

I would have to check why we were polling just the fd where the mmap is
done. I don't recall being the one who did it; probably whoever did it
thought that since the ring buffer is there, it was enough (and possibly
scaled better, dunno) to do the polling on just that one.
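
The layout Adrian described can be modelled in a few lines. This is a toy
model in Python, not perf code; the function names are hypothetical, and it
encodes only the assumption stated in this thread, namely that a task's exit
raises POLLHUP on that task's own fds. It shows which hangups go unseen when
only the mmapped fds are polled.

```python
def build_layout(ncpus, ntasks):
    """Number fds row by row (one row per cpu), as in the table in this thread."""
    fds = {}
    for cpu in range(ncpus):
        for task in range(ntasks):
            fds[(cpu, task)] = cpu * ntasks + task
    # The first task's fd on each cpu owns the mmap...
    mmapped = [fds[(cpu, 0)] for cpu in range(ncpus)]
    # ...and every other fd on that cpu is redirected to it (set-output).
    redirect = {
        fds[(cpu, task)]: fds[(cpu, 0)]
        for cpu in range(ncpus)
        for task in range(1, ntasks)
    }
    return fds, mmapped, redirect


def unseen_hangups(fds, polled, exited_task):
    """Return the fds that would hang up for exited_task but are not polled."""
    hup = sorted(fd for (cpu, task), fd in fds.items() if task == exited_task)
    return [fd for fd in hup if fd not in polled]
```

With two cpus and two tasks, build_layout(2, 2) gives mmapped fds 0 and 2, with
fd1 redirected to fd0 and fd3 to fd2. Polling only fd0 and fd2 leaves the
hangups on fd1 and fd3 unseen when task2 (index 1) exits; polling all four fds
leaves none unseen.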

- Arnaldo