Re: [PATCH v6] fs: Improve eventpoll logging to stop indicting timerfd

From: Isaac Manjarres
Date: Tue Jul 09 2024 - 17:05:02 EST


On Thu, Jul 04, 2024 at 04:03:59PM +0200, Christian Brauner wrote:
> On Wed, Jul 03, 2024 at 02:43:14PM GMT, Isaac J. Manjarres wrote:
> > From: Manish Varma <varmam@xxxxxxxxxx>
> >
> > We'll often see aborted suspend operations that look like:
> >
> > PM: suspend entry 2024-07-03 15:55:15.372419634 UTC
> > PM: PM: Pending Wakeup Sources: [timerfd]
> > Abort: Pending Wakeup Sources: [timerfd]
> > PM: suspend exit 2024-07-03 15:55:15.445281857 UTC
> >
> > From this, it seems a timerfd caused the abort, but that can be
> > confusing, as timerfds don't create wakeup sources. However,
> > eventpoll can, and when it does, it names them after the underlying
> > file descriptor. Unfortunately, all the file descriptors are called
> > "[timerfd]", and a system may have many timerfds, so this isn't very
> > useful to debug what's going on to cause the suspend to abort.
> >
> > To improve this, change the way eventpoll wakeup sources are named:
> >
> > 1) The top-level per-process eventpoll wakeup source is now named
> > "epollN:P" (instead of just "eventpoll"), where N is a unique ID token,
> > and P is the PID of the creating process.
> >
> > 2) Individual eventpoll item wakeup sources are now named
> > "epollitemN:P.F", where N is a unique ID token, P is the PID of the creating
> > process, and F is the name of the underlying file descriptor.
>
> Fyi, that PID is meaningless or even actively misleading in the face of
> pid namespaces. And since such wakeups seem to be registered in sysfs
> globally they are visible to all containers. That means a container will
> now see some timerfd wakeup source with a PID that might just accidentally
> correspond to a process inside the container. Which in turn also means

Thanks for your feedback on this, Christian. With regard to this
scenario: would it be useful to use a namespace ID, along with the PID,
to uniquely identify the process? If not, do you have a suggestion for
this?
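
To make the question concrete, here is a rough userspace sketch (not code
from the patch) of the "epollitemN:P.F" format described above, next to one
possible variant that adds a namespace ID. The helper names, the position of
the namespace field, and the use of the PID namespace inode number as the
"namespace ID" are all assumptions for illustration only.

```c
#include <stdio.h>

/*
 * Hypothetical helper: builds a name in the "epollitemN:P.F" form from the
 * patch description (N = unique ID token, P = PID, F = file name).
 */
static int format_epitem_name(char *buf, size_t len, unsigned int id,
			      int pid, const char *fname)
{
	return snprintf(buf, len, "epollitem%u:%d.%s", id, pid, fname);
}

/*
 * Hypothetical variant: also embeds a namespace ID ahead of the PID.
 * Using the PID namespace inode number here is an assumption, not
 * something the patch does.
 */
static int format_epitem_name_ns(char *buf, size_t len, unsigned int id,
				 unsigned int pidns_inum, int pid,
				 const char *fname)
{
	return snprintf(buf, len, "epollitem%u:%u.%d.%s",
			id, pidns_inum, pid, fname);
}
```

With id 3, PID 1234 and a file named "[timerfd]", the first helper yields
"epollitem3:1234.[timerfd]"; the second, given a namespace ID of 4026531836,
yields "epollitem3:4026531836.1234.[timerfd]".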

I understand that the proposed naming scheme can still produce
collisions; however, it is still an improvement over the existing
naming scheme in terms of being able to attribute wakeups to a
particular application.

> you're leaking the info about the creating process into the container.
> IOW, if PID 1 ends up registering some wakeup source the container gets
> to know about it.

Is there a general security concern about this? If not, can you please
elaborate on why this is a problem?

Thanks,
Isaac