Re: [RFC][PATCH 2/2] fsnotify: handle permission events without holding fsnotify_mark_srcu[0]

From: Amir Goldstein
Date: Mon Nov 14 2016 - 10:11:20 EST


On Mon, Nov 14, 2016 at 3:20 PM, Jan Kara <jack@xxxxxxx> wrote:
> On Mon 14-11-16 13:48:27, Amir Goldstein wrote:
>> Handling fanotify events does not entail dereferencing fsnotify_mark
>> beyond the point of fanotify_should_send_event().
>>
>> For the case of permission events, which may block indefinitely,
>> return -EAGAIN and then fsnotify() will call handle_event() again
>> without a reference to the mark.
>>
>> Without a reference to the mark, there is no need to call
>> handle_event() under fsnotify_mark_srcu[0] read side lock,
>> so we drop fsnotify_mark_srcu[0] while handling the event
>> and grab it back before continuing to the next mark.
>>
>> After this change, a blocking permission event will no longer
>> block closing of any file descriptors of 0 priority groups,
>> i.e: inotify and fanotify groups of class FAN_CLASS_NOTIF.
>>
>> Reported-by: Miklos Szeredi <miklos@xxxxxxxxxx>
>> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
>
> Well, this has a similar problem as my attempt to fix the issue. The
> current mark can get removed from the mark list while waiting for userspace
> response. ->next pointer is still valid at that moment but ->next->pprev
> already points to mark preceding us (that's how rcu lists work). When
> ->next mark then gets removed from the list and destroyed (it may be
> protected by the second srcu), our ->next pointer points to freed memory.

Oh! I missed the fact that the SRCU does not protect mark->obj_list.
Can resurrecting mark->free_list buy us anything?
Perhaps keep the marks on obj_list without FLAG_ATTACHED
and then remove marks from both free_list and obj_list post srcu_synchronize()?
I think that removing mark->obj_list was optimization of good faith
and not because it really hurts the system's memory usage that much?

>
> Furthermore I don't like the scheme of ->handle_event returning -EAGAIN and
> then dropping the srcu lock - I'd rather have some helpers provided by the
> generic fsnotify code to drop srcu lock. That needs some propagation of
> information inside the ->handle_event and then the helper but that's IMO
> cleaner. Anyway, that is just a technical detail.
>

Yeh, that's just the minimal LOC implementation. I figured the implementation
may be rejected for more profound flaws. Although personally,
I do like the so called O_NONBLOCKING semantics here.
I debated with myself if I should use EAGAIN or EWOULDBLOCK
for sake of readability.

> I have some ideas how to fix up issues with my refcounting approach -
> refcount would pin marks not only in memory but also in lists but I have
> yet to see whether that works out sensibly (it would mean that dropping
> mark reference would then need to take group->mark_mutex and that may cause
> lock ordering issues).
>

It sounds like a chicken and egg problem, but I suppose you don't mean
the actual mark refcount, but a secondary "pinned refcount"?

Anyway, if you have something ready, my test setup is already in place..

Cheers,
Amir.