Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

From: Greg KH
Date: Fri Feb 12 2021 - 03:40:29 EST


On Fri, Feb 12, 2021 at 10:22:16AM +0200, Amir Goldstein wrote:
> On Fri, Feb 12, 2021 at 9:49 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Feb 12, 2021 at 12:44:00PM +0800, Nicolas Boichat wrote:
> > > Filesystems such as procfs and sysfs generate their content at
> > > runtime. This implies the file sizes do not usually match the
> > > amount of data that can be read from the file, and that seeking
> > > may not work as intended.
> > >
> > > This will be useful to disallow copy_file_range with input files
> > > from such filesystems.
> > >
> > > Signed-off-by: Nicolas Boichat <drinkcat@xxxxxxxxxxxx>
> > > ---
> > > I first thought of adding a new field to struct file_operations,
> > > but that doesn't quite scale as every single file creation
> > > operation would need to be modified.
> >
> > Even so, you missed a load of filesystems in the kernel with this patch
> > series, what makes the ones you did mark here different from the
> > "internal" filesystems that you did not?
> >
> > This feels wrong, why is userspace suddenly breaking? What changed in
> > the kernel that caused this? Procfs has been around for a _very_ long
> > time :)
>
> That would be because of (v5.3):
>
> 5dae222a5ff0 vfs: allow copy_file_range to copy across devices
>
> The intention of this change (series) was to allow server side copy
> for nfs and cifs via copy_file_range().
> This is mostly work by Dave Chinner that I picked up following requests
> from the NFS folks.
>
> But the above change also includes this generic change:
>
> -	/* this could be relaxed once a method supports cross-fs copies */
> -	if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
> -		return -EXDEV;
> -
>
> The change of behavior was documented in the commit message.
> It was also documented in:
>
> 88e75e2c5 copy_file_range.2: Kernel v5.3 updates
>
> I think our rationale for the generic change was:
> "Why not? What could go wrong? (TM)"
> I am not sure if any workload really gained something from this
> kernel cross-fs CFR.

Why not put that check back?

> In retrospect, I think it would have been safer to allow cross-fs CFR
> only to the filesystems that implement ->{copy,remap}_file_range()...

Why not make this change? That seems easier and should fix this for
everyone, right?

> Our options now are:
> - Restore the cross-fs restriction into generic_copy_file_range()

Yes.

> - Explicitly opt-out of CFR per-fs and/or per-file as Nicolas' patch does

No. That way lies constant auditing and someone being "vigilant" for
the next 30+ years. Which will not happen.
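
For illustration only, here is a rough userspace model of what the first
option amounts to: refusing the copy whenever the source and destination
sit on different superblocks, which is the check the commit quoted above
removed. The names model_sb, model_file and model_cfr_check are
hypothetical stand-ins invented for this sketch; they are not kernel
types or functions, and this is not a proposed patch.

```c
#include <errno.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct model_sb {
	const char *fs_name;
};

struct model_file {
	const struct model_sb *sb;	/* plays the role of file_inode(f)->i_sb */
};

/*
 * Model of the removed check: a copy is only allowed when both files
 * belong to the same superblock, otherwise it fails with -EXDEV.
 */
static int model_cfr_check(const struct model_file *in,
			   const struct model_file *out)
{
	if (in->sb != out->sb)
		return -EXDEV;
	return 0;
}
```

With a check like this in the generic path, a copy from a procfs file to
a file on a regular filesystem would fail with -EXDEV, and userspace
would presumably fall back to a plain read/write loop, without any
filesystem having to be marked individually.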

thanks,

greg k-h