Re: [RFC PATCH 19/35] ovl: readd reflink/copyfile/dedup support

From: Amir Goldstein
Date: Wed Apr 18 2018 - 04:40:54 EST


On Tue, Apr 17, 2018 at 11:31 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> On Thu, Apr 12, 2018 at 6:08 PM, Miklos Szeredi <mszeredi@xxxxxxxxxx> wrote:
>> Since set of arguments are so similar, handle in a common helper.
>>
>> Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
>> ---
>> fs/overlayfs/file.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 79 insertions(+)
>>
>> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
>> index 9670e160967e..39b1b73334ad 100644
>> --- a/fs/overlayfs/file.c
>> +++ b/fs/overlayfs/file.c
>> @@ -352,6 +352,81 @@ long ovl_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
>> return ret;
>> }
>>
>> +enum ovl_copyop {
>> + OVL_COPY,
>> + OVL_CLONE,
>> + OVL_DEDUPE,
>> +};
>> +
>> +static ssize_t ovl_copyfile(struct file *file_in, loff_t pos_in,
>> + struct file *file_out, loff_t pos_out,
>> + u64 len, unsigned int flags, enum ovl_copyop op)
>> +{
>> + struct inode *inode_out = file_inode(file_out);
>> + struct fd real_in, real_out;
>> + const struct cred *old_cred;
>> + int ret;
>> +
>> + ret = ovl_real_file(file_out, &real_out);
>> + if (ret)
>> + return ret;
>> +
>> + ret = ovl_real_file(file_in, &real_in);
>> + if (ret) {
>> + fdput(real_out);
>> + return ret;
>> + }
>> +
>> + old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
>> + switch (op) {
>> + case OVL_COPY:
>> + ret = vfs_copy_file_range(real_in.file, pos_in,
>> + real_out.file, pos_out, len, flags);
>
> Problem:
> vfs_copy_file_range(ovl_lower_file, ovl_upper_file) on non samefs
> will get -EXDEV from ovl_copy_file_range(), so will not fall back
> to do_splice_direct().
> We may be better off checking in_sb != out_sb and returning
> -EOPNOTSUPP? not sure.
>
>
>> + break;
>> +
>> + case OVL_CLONE:
>> + ret = vfs_clone_file_range(real_in.file, pos_in,
>> + real_out.file, pos_out, len);
>> + break;
>> +
>> + case OVL_DEDUPE:
>> + ret = vfs_dedupe_file_range_one(real_in.file, pos_in, len,
>> + real_out.file, pos_out);
>
> Problem:
> real_out can be a readonly fd (for is_admin), so we will be deduping
> the lower file.
> I guess this problem is mitigated in current code by may_write_real().
>
> How can we deal with that sort of "write leak" without patching
> mnt_want_write_file()?
>

Possible solution:
Add interface file_oprations->permission().

At least in rw_verify_area() and clone_verify_area() it is clear
how this would be used. Instead if calling security_file_permission()
call it via a helper file_permission() like with inode_permission.

My understanding in the VFS permission checks model is too
limited to say if this makes sense.

Thanks,
Amir.