Re: fanotify - overall design before I start sending patches

From: Tvrtko Ursulin
Date: Thu Aug 06 2009 - 07:00:02 EST


On Thursday 06 August 2009 11:29:08 Peter Zijlstra wrote:
> On Thu, 2009-08-06 at 11:20 +0100, Douglas Leeder wrote:
> > Pavel Machek wrote:
> > > On Wed 2009-08-05 17:46:16, Tvrtko Ursulin wrote:
> > >> On Wednesday 05 August 2009 03:05:34 Pavel Machek wrote:
> > >>
> > >> Just to make sure you haven't missed this - it is not that they have
> > >> to complete the whole operation before the timeout period (since you
> > >> mention realtime/mlock I suspect this is what you think?), but
> > >> _during_ the operation they have to show that they are active by
> > >> sending something like keep alive messages.
> > >>
> > >> Or you are worried about failing to meet even that on a loaded system?
> > >> There has to be something like this otherwise hung userspace client
> > >> would kill the whole system.
> > >
> > > Of course, I'm worried about failing to meet this on loaded
> > > system. And the fact that I _have_ to worry about that means that
> > > interface is ugly/broken.
> >
> > You mean that in 5 seconds, you won't have any point when you can tell
> > the kernel, "I'm still working"?
>
> I have to agree with Pavel here, either you demand the monitor process
> is RT/mlock and can respond in time, in which case the interface doesn't
> need a 5 second timeout, or you cannot and you have a hole somewhere.
>
> Now having the kernel depend on any user task to guarantee process is of
> course utterly insane too.
>
> Sounds like a bad place to be, and I'd rather not have it.
>
> If you really need the intermediate you might as well use a FUSE
> filesystem, but I suspect there's plenty of problems there as well.

So you mount FUSE on top of everything if you want system-wide
monitoring, and then you _again_ depend on _userspace_, no? By this logic
everything has to be in the kernel. But even if it were, and the CPUs are so
overloaded that a userspace thread does not get to run at all for X seconds,
are kernel threads scheduled any differently, e.g. with a priority other than
nice levels?

Also, it is not as if the kernel will hang when the timeout expires.
Rather, some application would get an error from open(2). Note that this
happens only by system configuration, where the admin has made a _deliberate_
decision to install software which can cause this behaviour.

You can have an RT/mlocked client, but what if it misbehaves (let's say it
busy loops)? That is also something the timeout mechanism guards against.

I really think that if we want this functionality, there is no way around
the fact that any userspace can fail. The kernel should handle that, of
course, and Eric's design does so by kicking out clients that repeatedly
misbehave.
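
To make the behaviour described above concrete, here is a rough userspace
model of it in Python. It is only a sketch and not the fanotify code: the
names (Client, PendingOpen, keep_alive, poll) and the MAX_STRIKES threshold
are made up for illustration, and the 5 second window is simply the figure
quoted earlier in this thread. Time is passed in explicitly so the logic can
be followed without a real clock.

```python
from dataclasses import dataclass

TIMEOUT_S = 5.0    # keep-alive window; the figure quoted in this thread
MAX_STRIKES = 3    # hypothetical: expiries tolerated before eviction


@dataclass
class Client:
    """Hypothetical userspace listener that vets open() requests."""
    name: str
    strikes: int = 0
    evicted: bool = False


@dataclass
class PendingOpen:
    """An open() that is blocked while a client decides on it."""
    client: Client
    deadline: float
    expired: bool = False

    def keep_alive(self, now: float) -> None:
        # "I'm still working": extend the deadline by one more window.
        # A keep-alive that arrives after the deadline is ignored.
        if not self.expired and now <= self.deadline:
            self.deadline = now + TIMEOUT_S

    def poll(self, now: float) -> str:
        # "pending" while the client is live, "error" once it went silent.
        if self.expired:
            return "error"
        if now <= self.deadline:
            return "pending"
        # Deadline missed: the opener gets an error instead of hanging,
        # and the client collects one strike for this request.
        self.expired = True
        self.client.strikes += 1
        if self.client.strikes >= MAX_STRIKES:
            self.client.evicted = True
        return "error"
```

The point of the model is that a slow scan survives as long as keep_alive()
is called within each window, a silent client costs the opener one failed
open rather than a hung system, and only repeated expiries lead to eviction.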

If the timeout is made configurable, I think this is the best that can be done
here. I don't think the problem is as big as you are presenting it.

Tvrtko