[PATCHv8.2] fanotify: enable close-on-exec on events' fd when requested in fanotify_init()

From: Yann Droneaud
Date: Fri Oct 03 2014 - 04:44:23 EST


According to commit 80af258867648 ('fanotify: groups can specify
their f_flags for new fd'), file descriptors created as part of
file access notification events inherit flags from the
event_f_flags argument passed to syscall fanotify_init(2)[1].

Unfortunately O_CLOEXEC is currently silently ignored.

Indeed, event_f_flags are only given to dentry_open(), which only
seems to care about O_ACCMODE and O_PATH in do_dentry_open(),
O_DIRECT in open_check_o_direct() and O_LARGEFILE in
generic_file_open().
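
As a userspace illustration of why passing the flag to dentry_open()
is not enough: close-on-exec is a property of the file descriptor,
not of the open file behind it. The sketch below is not part of the
patch; the helper names (cloexec_state, demo_cloexec_is_per_descriptor)
and the use of /dev/null are assumptions made for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for O_CLOEXEC and F_DUPFD_CLOEXEC */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if close-on-exec is set on fd, 0 if clear, -1 on error. */
static int cloexec_state(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}

/*
 * Opens /dev/null with O_CLOEXEC, then duplicates the descriptor twice:
 * once with dup(), which clears FD_CLOEXEC on the new descriptor, and
 * once with F_DUPFD_CLOEXEC, which sets it. All three descriptors refer
 * to the same open file, yet their close-on-exec state differs.
 *
 * out[0]: original fd, out[1]: dup() copy, out[2]: F_DUPFD_CLOEXEC copy.
 * Returns 0 on success, -1 on error.
 */
static int demo_cloexec_is_per_descriptor(int out[3])
{
	int fd, plain, kept, ret = -1;

	fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	plain = dup(fd);
	kept = fcntl(fd, F_DUPFD_CLOEXEC, 0);
	if (plain >= 0 && kept >= 0) {
		out[0] = cloexec_state(fd);
		out[1] = cloexec_state(plain);
		out[2] = cloexec_state(kept);
		ret = 0;
	}

	if (kept >= 0)
		close(kept);
	if (plain >= 0)
		close(plain);
	close(fd);
	return ret;
}
```

Since the flag is attached to the descriptor, it has to be applied
where the descriptor number is allocated, not where the file is
opened.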

That is a pity since, according to searches on various search
engines and http://codesearch.debian.net/, some userspace code
already passes O_CLOEXEC:

- in systemd's readahead[2]:

fanotify_fd = fanotify_init(FAN_CLOEXEC|FAN_NONBLOCK, O_RDONLY|O_LARGEFILE|O_CLOEXEC|O_NOATIME);

- in clsync[3]:

#define FANOTIFY_EVFLAGS (O_LARGEFILE|O_RDONLY|O_CLOEXEC)

int fanotify_d = fanotify_init(FANOTIFY_FLAGS, FANOTIFY_EVFLAGS);

- in examples [4] from "Filesystem monitoring in the Linux
kernel" article[5] by Aleksander Morgado:

if ((fanotify_fd = fanotify_init (FAN_CLOEXEC,
O_RDONLY | O_CLOEXEC | O_LARGEFILE)) < 0)

Additionally, since commit 48149e9d3a7e ('fanotify: check file
flags passed in fanotify_init'), having O_CLOEXEC as part of the
second argument of fanotify_init() is expressly allowed.

So it seems expected to set the close-on-exec flag on the file
descriptors if userspace is allowed to request it with O_CLOEXEC.

But Andrew Morton raised[6] the concern that enabling
close-on-exec now might break existing applications which ask for
O_CLOEXEC but expect the file descriptor to be inherited
across exec().

On the other hand, as reported by Mihai Donțu[7], not setting
close-on-exec on the file descriptor returned as part of a file
access notification can break applications due to deadlock.
So close-on-exec is needed for most applications.

Moreover, applications asking for close-on-exec are likely
expecting it to be enabled, relying on O_CLOEXEC being effective.
If it is not, their security might be weakened, as noted by
Jan Kara[8].

So this patch replaces the call to the macro get_unused_fd() with
a call to the function get_unused_fd_flags(), passing the
event_f_flags value as its argument. This way the O_CLOEXEC flag
in the second argument of the fanotify_init(2) syscall is honored
and close-on-exec gets enabled when requested.
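
A possible way to check the result from userspace is sketched below.
It is only an illustration, not part of the patch: the helper names
(fd_has_cloexec, count_event_fds_without_cloexec) and the buffer size
are assumptions, and the caller is assumed to have already obtained
fanotify_fd from fanotify_init() with O_CLOEXEC in its second
argument (which requires CAP_SYS_ADMIN) and to have added marks with
fanotify_mark().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for O_CLOEXEC */
#endif
#include <fcntl.h>
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if close-on-exec is set on fd, 0 if clear, -1 on error. */
static int fd_has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}

/*
 * Reads one batch of events from fanotify_fd and counts the event
 * file descriptors that do NOT have close-on-exec set. Each event fd
 * is closed after inspection.
 *
 * Returns the count (0 is the expected value when O_CLOEXEC was
 * requested and is honored), or -1 on read error or on a metadata
 * version mismatch.
 */
static int count_event_fds_without_cloexec(int fanotify_fd)
{
	struct fanotify_event_metadata buf[64];	/* arbitrary batch size */
	struct fanotify_event_metadata *ev = buf;
	ssize_t len;
	int missing = 0;

	len = read(fanotify_fd, buf, sizeof(buf));
	if (len < 0)
		return -1;

	while (FAN_EVENT_OK(ev, len)) {
		if (ev->vers != FANOTIFY_METADATA_VERSION)
			return -1;
		if (ev->fd >= 0) {	/* FAN_NOFD on queue overflow */
			if (fd_has_cloexec(ev->fd) != 1)
				missing++;
			close(ev->fd);
		}
		ev = FAN_EVENT_NEXT(ev, len);
	}
	return missing;
}
```

Without this patch, such a check would be expected to report every
event fd as lacking close-on-exec, whatever was passed to
fanotify_init().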

[1] http://man7.org/linux/man-pages/man2/fanotify_init.2.html
[2] http://cgit.freedesktop.org/systemd/systemd/tree/src/readahead/readahead-collect.c?id=v208#n294
[3] https://github.com/xaionaro/clsync/blob/v0.2.1/sync.c#L1631
https://github.com/xaionaro/clsync/blob/v0.2.1/configuration.h#L38
[4] http://www.lanedo.com/~aleksander/fanotify/fanotify-example.c
[5] http://www.lanedo.com/2013/filesystem-monitoring-linux-kernel/
[6] http://lkml.kernel.org/r/20141001153621.65e9258e65a6167bf2e4cb50@xxxxxxxxxxxxxxxxxxxx
[7] http://lkml.kernel.org/r/20141002095046.3715eb69@mdontu-l
[8] http://lkml.kernel.org/r/20141002104410.GB19748@xxxxxxxxxxxxx

Link: http://lkml.kernel.org/r/cover.1411562410.git.ydroneaud@xxxxxxxxxx
Cc: Mihai Donțu <mihai.dontu@xxxxxxxxx>
Cc: Pádraig Brady <P@xxxxxxxxxxxxxx>
Cc: Heinrich Schuchardt <xypron.glpk@xxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Valdis Kletnieks <Valdis.Kletnieks@xxxxxx>
Cc: Michael Kerrisk-manpages <mtk.manpages@xxxxxxxxx>
Cc: Lino Sanfilippo <LinoSanfilippo@xxxxxx>
Cc: Richard Guy Briggs <rgb@xxxxxxxxxx>
Cc: Eric Paris <eparis@xxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Cc: linux-api@xxxxxxxxxxxxxxx
Reviewed-by: Jan Kara <jack@xxxxxxx>
Reviewed-by: Heinrich Schuchardt <xypron.glpk@xxxxxx>
Tested-by: Heinrich Schuchardt <xypron.glpk@xxxxxx>
Signed-off-by: Yann Droneaud <ydroneaud@xxxxxxxxxx>
---
Hi Andrew,

> Fair enough, it sounds like the risk is acceptable.
>

OK.

> Can we get a new version sent out with all this new info appropriately
> changelogged?
>

Of course!

Please find an updated patch with revamped commit message.

Changes from v8.1:

- added more Cc:
- added Reviewed-by:
- rewrote commit message.

fs/notify/fanotify/fanotify_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index b13992a41bd9..c991616acca9 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -78,7 +78,7 @@ static int create_fd(struct fsnotify_group *group,

 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
-	client_fd = get_unused_fd();
+	client_fd = get_unused_fd_flags(group->fanotify_data.f_flags);
 	if (client_fd < 0)
 		return client_fd;

--
1.9.3
