Re: [PATCH] [REGRESSION] ovl: Handle ENOSYS when fileattr support is missing in lower/upper fs

From: Linus Torvalds
Date: Mon Jul 18 2022 - 14:30:08 EST


On Mon, Jul 18, 2022 at 6:13 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> Correct. The question is whether any application would break in this
> case. I think not, but you are free to prove otherwise.

Most often, an error is "just an error", and most applications usually
won't care.

There are exceptions: some errors are very much "do something special"
(eg EAGAIN or EINTR _are_ often separately tested for and often mean
"just retry"). And permission error handling is often different from
EINVAL etc.

And ENOSYS can easily be such an error - people probing whether they
are running on a new kernel that supports a new system call or not.

And yeah, some of our ioctl's are odd, and we have a lot of drivers
(and driver infrastructure) that basically does "this device does not
support this ioctl, so return ENOSYS".

I don't think that's the right thing to do, but I think it's
understandable. The traditional error for "this device does not
support this ioctl" is ENOTTY, which sounds so crazy to non-tty people
that I understand why people have used ENOSYS instead.

It's sad that it's called "ENOTTY" and some (at least historical)
strerror() implementations will indeed return "Not a tty". Never mind
that modern ones will say "inappropriate ioctl for device" - even when
the string has been updated, the error number isn't called
EINAPPROPRAITEDEVICE.

But it is what it is, and so I think ENOTTY is understandably not used
in many situations just because it's such a senseless historical name.

And so if people don't use ENOSYS, they use EINVAL.

I *suspect* no application cares: partly because ioctl error numbers
are so random anyway, but also very much if this is a "without
overlayfs it does X, with overlayfs it does Y".

The sanest thing to do is likely to make ovl match what a non-ovl
setup would do in the same situation (_either_ of the overlaid
filesystems - there might be multiple cases).

But I'm missing the original report. It sounds like there was a
regression and we already have a case of "changing the error number
broke something". If so, that regression should be fixed.

In general, I'm perfectly happy with people fixing error numbers and
changing them.

The only thing I require is that if those cleanups and fixes are
reported to break something, people quickly revert (and preferably add
a big comment about "Use *this* error number, because while this
*other* error number would make sense, application XyZ expects AbC"..)

Linus