Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

From: Greg KH
Date: Fri Feb 12 2021 - 11:34:28 EST


On Fri, Feb 12, 2021 at 07:59:04AM -0800, Ian Lance Taylor wrote:
> On Fri, Feb 12, 2021 at 7:45 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > > On Fri, Feb 12, 2021 at 12:38 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > Why are people trying to use copy_file_range on simple /proc and /sys
> > > > files in the first place? They cannot seek (well, most cannot), so
> > > > that feels like an "oh look, a new syscall, let's use it everywhere!"
> > > > problem that userspace should not do.
> > >
> > > This may have been covered elsewhere, but it's not that people are
> > > saying "let's use copy_file_range on files in /proc." It's that the
> > > Go language standard library provides an interface to operating system
> > > files. When Go code uses the standard library function io.Copy to
> > > copy the contents of one open file to another open file, then on Linux
> > > kernels 5.3 and greater the Go standard library will use the
> > > copy_file_range system call. That seems to be exactly what
> > > copy_file_range is intended for. Unfortunately it appears that when
> > > people writing Go code open a file in /proc and use io.Copy to copy the
> > > contents to another open file, copy_file_range does nothing and
> > > reports success. There isn't anything on the copy_file_range man page
> > > explaining this limitation, and there isn't any documented way to know
> > > that the Go standard library should not use copy_file_range on certain
> > > files.
> >
> > But, is this a bug in the kernel in that the syscall being made is not
> > working properly, or a bug in that Go decided to do this for all types
> > of files not knowing that some types of files can not handle this?
> >
> > If the kernel has always worked this way, I would say that Go is doing
> > the wrong thing here. If the kernel used to work properly, and then
> > changed, then it's a regression on the kernel side.
> >
> > So which is it?
>
> I don't work on the kernel, so I can't tell you which it is. You will
> have to decide.

As you have the userspace code, it should be easier for you to test this
on an older kernel. I don't have your userspace code...

> From my perspective, as a kernel user rather than a kernel developer,
> a system call that silently fails for certain files and that provides
> no way to determine either 1) ahead of time that the system call will
> fail, or 2) after the call that the system call did fail, is a useless
> system call.

Great, then don't use copy_file_range() yet as it seems like it fits
that category at the moment :)

> I can never use that system call, because I don't know
> whether or not it will work. So as a kernel user I would say that you
> should fix the system call to report failure, or document some way to
> know whether the system call will fail, or you should remove the
> system call. But I'm not a kernel developer, I don't have all the
> information, and it's obviously your call.

That's up to the authors of that syscall; I don't know what they
intended this to be used for. It feels like you are using this in a
very generic "all file i/o works for this" way, while that is not what
the authors of the syscall intended it for, as is evident from the
failures that older kernels would report for you.

> I'll note that to the best of my knowledge this failure started
> happening with the 5.3 kernel, as before 5.3 the problematic calls
> would report a failure (EXDEV). Since 5.3 isn't all that old I
> personally wouldn't say that the kernel "has always worked this way."
> But I may be mistaken about this.

Testing would be good, as regressions are more serious than "it would be
nice if it would work like this instead!" type of issues.
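
For anyone who wants to run that comparison, here is a rough sketch of a
probe that could be run unchanged on an older and a newer kernel. It assumes
Python 3.8 or later on Linux (for os.copy_file_range); the helper name
probe() and its status strings are made up for illustration and are not part
of any existing tool. It makes a single copy_file_range() call into a scratch
file and compares the result with what a plain read() can get from the same
path, so a "0 bytes, no error" return on a non-empty file shows up
explicitly.

```python
import errno
import os
import sys
import tempfile


def probe(src_path, length=1 << 16):
    """Compare one copy_file_range() call against a plain read() of src_path.

    Returns (status, copied, readable), where status is one of "ok",
    "silent-zero", "error:<ERRNO>" or "unsupported" (hypothetical labels).
    """
    with open(src_path, "rb") as f:
        readable = len(f.read(length))      # bytes a plain read() can get

    if not hasattr(os, "copy_file_range"):  # Python < 3.8 or not Linux
        return ("unsupported", 0, readable)

    src = os.open(src_path, os.O_RDONLY)
    try:
        with tempfile.TemporaryFile() as dst:
            try:
                copied = os.copy_file_range(src, dst.fileno(), length)
            except OSError as e:
                name = errno.errorcode.get(e.errno, str(e.errno))
                return ("error:" + name, 0, readable)
    finally:
        os.close(src)

    # A zero return is a normal EOF for an empty file; it is only
    # suspicious when read() could get data from the same path.
    if copied == 0 and readable > 0:
        return ("silent-zero", copied, readable)
    return ("ok", copied, readable)


if __name__ == "__main__":
    # Paths may be given on the command line; /proc/version is the default.
    for path in sys.argv[1:] or ["/proc/version"]:
        if os.path.exists(path):
            print(os.uname().release, path, probe(path))
```

Running it against the same /proc or /sys path on each kernel and comparing
the status would show whether the behavior actually changed between
releases. Note that the scratch file lands in the default temporary
directory, so the result can also depend on which filesystem that is.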

thanks,

greg k-h