[RFC PATCH] VFS: Simplify truncate logic in do_filp_open()

From: Valerie Aurora
Date: Mon Feb 22 2010 - 15:36:00 EST


The fact that may_open() could truncate a file gave me a lot of
heartburn when working on union mounts, so I was thrilled to see that
truncate handling has been moved out of may_open() in Al's for-next
tree. However, it seems to me that the surrounding elaborate
mnt_want_write() dance is no longer needed? If so, this also
simplifies Ogawa Hirofumi "Fix use-after-free of vfsmount by
mnt_drop_write()" patch.

Against Al's for-next branch. Lightly tested, please review.

-VAL

Author: Valerie Aurora <vaurora@xxxxxxxxxx>

Formerly, may_open() could truncate a file, necessitating an explicit
mnt_want_write() to avoid a nasty race. Now truncation is done in
handle_truncate() after the nameidata_to_filp() call, which already
takes a mount write reference. Remove unneeded extra mnt_want_write()
logic and rewrite the comment explaining the potential race in more
general terms.

Signed-off-by: Valerie Aurora <vaurora@xxxxxxxxxx>
Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
---
fs/namei.c | 41 +++++++++++------------------------------
1 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 2faaaeb..a38801d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1630,7 +1630,6 @@ struct file *do_filp_open(int dfd, const char *pathname,
struct path path;
struct dentry *dir;
int count = 0;
- int will_truncate;
int flag = open_to_namei_flags(open_flag);
int force_reval = 0;

@@ -1795,28 +1794,10 @@ do_last:
if (S_ISDIR(path.dentry->d_inode->i_mode))
goto exit;
ok:
- /*
- * Consider:
- * 1. may_open() truncates a file
- * 2. a rw->ro mount transition occurs
- * 3. nameidata_to_filp() fails due to
- * the ro mount.
- * That would be inconsistent, and should
- * be avoided. Taking this mnt write here
- * ensures that (2) can not occur.
- */
- will_truncate = open_will_truncate(flag, nd.path.dentry->d_inode);
- if (will_truncate) {
- error = mnt_want_write(nd.path.mnt);
- if (error)
- goto exit;
- }
+ /* may_open() no longer truncates file, handle_truncate() does */
error = may_open(&nd.path, acc_mode, open_flag);
- if (error) {
- if (will_truncate)
- mnt_drop_write(nd.path.mnt);
+ if (error)
goto exit;
- }
filp = nameidata_to_filp(&nd);
if (!IS_ERR(filp)) {
error = ima_file_check(filp, acc_mode);
@@ -1828,8 +1809,15 @@ ok:
if (!IS_ERR(filp)) {
if (acc_mode & MAY_WRITE)
vfs_dq_init(nd.path.dentry->d_inode);
-
- if (will_truncate) {
+ /*
+ * Be sure to get a write reference to the mount
+ * before truncating the file (nameidata_to_filp()
+ * does this). Otherwise, a rw -> ro transition
+ * between the truncate and finishing the open could
+ * result in successfully truncating file but failing
+ * the open() with EROFS.
+ */
+ if (open_will_truncate(flag, nd.path.dentry->d_inode)) {
error = handle_truncate(&nd.path);
if (error) {
fput(filp);
@@ -1837,13 +1825,6 @@ ok:
}
}
}
- /*
- * It is now safe to drop the mnt write
- * because the filp has had a write taken
- * on its behalf.
- */
- if (will_truncate)
- mnt_drop_write(nd.path.mnt);
if (nd.root.mnt)
path_put(&nd.root);
return filp;
--
1.5.6.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/