Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

From: David Howells
Date: Wed Aug 05 2020 - 16:15:15 EST

Miklos Szeredi <miklos@xxxxxxxxxx> wrote:

> Shouldn't we just make sure that the likelihood of overruns is low

That's not necessarily easy. To avoid overruns you need a bigger buffer. The
buffer is preallocated from unswappable kernel space. Yes, you can increase
the size of the buffer, but that counts against your pipe buffer limit.

Further, it's a *general* notifications queue, not one dedicated to a specific
purpose, which means it might get connected to multiple sources, and doing
something like tearing down a container might generate enough notifications to
overrun the queue.
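
To illustrate the trade-off, here is a minimal userspace sketch of a
fixed-size ring that drops a notification and records the loss when it is
full. It is not the kernel's queue implementation; struct note_ring,
ring_post(), ring_read() and RING_SLOTS are hypothetical names made up for
this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 4	/* hypothetical and deliberately tiny */

/* Hypothetical fixed-size notification ring, allocated up front. */
struct note_ring {
	uint32_t slot[RING_SLOTS];
	size_t head;	/* total number of notes written */
	size_t tail;	/* total number of notes read */
	bool overrun;	/* set when a post had to be dropped */
};

/* Post one note; if the ring is full, drop it and record the loss. */
bool ring_post(struct note_ring *r, uint32_t note)
{
	if (r->head - r->tail == RING_SLOTS) {
		r->overrun = true;
		return false;
	}
	r->slot[r->head++ % RING_SLOTS] = note;
	return true;
}

/*
 * Read one note.  *lost reports and clears any pending overrun, so the
 * reader learns that something was dropped but not what it was.
 */
bool ring_read(struct note_ring *r, uint32_t *note, bool *lost)
{
	*lost = r->overrun;
	r->overrun = false;
	if (r->head == r->tail)
		return false;
	*note = r->slot[r->tail++ % RING_SLOTS];
	return true;
}
```

With several sources posting into one such ring, a burst from any of them can
fill it, and all the reader learns is that something, somewhere, was missed.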

> and if it happens, just reinitialize everything from scratch (shouldn't be
> *that* expensive).

If you then spend time reinitialising everything, you're increasing the
likelihood of racing with further events. Further, there are multiple expenses:
firstly, you have to tear down and discard all the data that you've spent time
setting up; secondly, it takes time doing all this; thirdly, it takes CPU
cycles away from applications.

The reason I put the event counters in there and made it so that fsinfo()
could read all the mounts in a subtree and their event counters in one go is
to make it faster for the user to find out what changed in the event that a
notification is lost.
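
As a rough sketch of how userspace might use such counters after a lost
notification: keep the previous (mount ID, event counter) list, read a fresh
one, and compare the two to find only the mounts that need re-examining. The
struct and function below are hypothetical and are not the fsinfo() ABI; both
lists are assumed to be sorted by ascending mount ID.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace record: one entry per mount in the subtree. */
struct mount_snap {
	uint64_t mnt_id;	/* mount identifier */
	uint64_t events;	/* event counter as last read */
};

/*
 * Compare two snapshots, both sorted by ascending mnt_id.  Store up to
 * max IDs of mounts that appeared, went away, or whose counter differs
 * into changed[]; return the total number of such mounts.
 */
size_t diff_snapshots(const struct mount_snap *prev, size_t n_prev,
		      const struct mount_snap *cur, size_t n_cur,
		      uint64_t *changed, size_t max)
{
	size_t i = 0, j = 0, n = 0;

	while (i < n_prev || j < n_cur) {
		uint64_t id;

		if (j == n_cur ||
		    (i < n_prev && prev[i].mnt_id < cur[j].mnt_id)) {
			id = prev[i++].mnt_id;	/* mount went away */
		} else if (i == n_prev || cur[j].mnt_id < prev[i].mnt_id) {
			id = cur[j++].mnt_id;	/* mount appeared */
		} else {
			int same = prev[i].events == cur[j].events;

			id = prev[i].mnt_id;
			i++;
			j++;
			if (same)
				continue;	/* no events on this mount */
		}
		if (n < max)
			changed[n] = id;
		n++;
	}
	return n;
}
```

Only the mounts reported in changed[] then need their cached state rebuilt;
everything else that was already set up can be kept.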

I have a patch (not included here as it occasionally induces oopses) that
attempts to make this doable under the RCU read lock so that it doesn't
prevent mounts from taking place during the scan.