From: Theodore Y. Ts'o
Date: Wed Jan 01 2020 - 13:11:11 EST

On Tue, Dec 31, 2019 at 04:54:18PM -0600, Eric Sandeen wrote:
> > Because I was not able to find any documentation for it, what is format
> > of passed buffer... null-term string? fixed-length? and in which
> > encoding? utf-8? latin1? utf-16? or filesystem dependent?
> It simply copies the bits from the memory location you pass in, it knows
> nothing of encodings.
> For the most part it's up to the filesystem's own utilities to do any
> interpretation of the resulting bits on disk, null-terminating maximal-length
> label strings, etc.

I'm not sure this is the best API design choice. The blkid library
interprets the on-disk format for each file system, knowing what the
"native" format is for that particular file system. This is mainly
an issue only for the non-Linux file systems; for the Linux file
systems, the party line has historically been that we don't get
involved with character encoding, but in practice, what that has
evolved into is that userspace has standardized on UTF-8, and that's
what we pass into the kernel from userspace by convention.

But the problem is that if the goal is to make FS_IOC_GETFSLABEL and
FS_IOC_SETFSLABEL work without the calling program knowing what file
system type a particular pathname happens to be on, then it would be
easiest for the userspace program if it can expect that it can always
pass in a null-terminated UTF-8 string, and get back a null-terminated
UTF-8 string. I bet that in practice, that is what most userspace
programs are going to do anyway, since it works that way for all other
file system syscalls.
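
To make that concrete, here is a minimal sketch of what a file system
agnostic caller might look like, assuming the convention argued for
above (a null-terminated UTF-8 string in a FSLABEL_MAX-sized buffer).
The helper names get_fs_label() and set_fs_label() are made up for
illustration; only the ioctl numbers and FSLABEL_MAX come from
<linux/fs.h>.

```c
#include <fcntl.h>
#include <linux/fs.h>   /* FS_IOC_GETFSLABEL, FS_IOC_SETFSLABEL, FSLABEL_MAX */
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: read the label of the file system containing path.
 * Returns 0 on success, -1 on failure. */
int get_fs_label(const char *path, char label[FSLABEL_MAX])
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = ioctl(fd, FS_IOC_GETFSLABEL, label);
    close(fd);
    if (ret < 0)
        return -1;

    /* Assumption: the label is a C string; terminate defensively anyway. */
    label[FSLABEL_MAX - 1] = '\0';
    return 0;
}

/* Hypothetical helper: set the label from a null-terminated UTF-8 string.
 * Returns 0 on success, -1 on failure. */
int set_fs_label(const char *path, const char *utf8_label)
{
    char buf[FSLABEL_MAX];
    size_t len = strlen(utf8_label);

    if (len >= FSLABEL_MAX)
        return -1;              /* no room for the terminating NUL */

    /* Zero-fill so the kernel never sees stale bytes after the NUL. */
    memset(buf, 0, sizeof(buf));
    memcpy(buf, utf8_label, len);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = ioctl(fd, FS_IOC_SETFSLABEL, buf);
    close(fd);
    return ret < 0 ? -1 : 0;
}
```

Note that nothing in the caller depends on the file system type, which
is the whole point of having a generic ioctl.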

So for a file system which is a non-Linux-native file system, if it
happens to store its label using utf-16, or some other
Windows-system-silliness, it would work a lot better if it assumed
that it was passed utf-8, and stored the label in the Windows file
system using whatever crazy encoding Windows wants to use. Otherwise,
why bother uplifting the ioctl to one which is file system
independent, if the parameters are defined to be file system
*dependent*?
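
As a rough illustration of the conversion such a file system would do
on the SETFSLABEL side, here is a userspace sketch that turns a
null-terminated UTF-8 label into UTF-16 code units. This is not kernel
code and not any particular file system's implementation; the function
names are hypothetical, and a real driver would presumably use the
kernel's own NLS helpers and the on-disk byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence at s into *cp.
 * Returns the number of bytes consumed, or 0 if the input is malformed. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    size_t n;
    uint32_t c, min;

    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        n = 2; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        n = 3; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        n = 4; c = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;
    }
    for (size_t i = 1; i < n; i++) {
        /* Also stops at a premature NUL, so we never read past the end. */
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and out-of-range values. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return n;
}

/* Convert a null-terminated UTF-8 label into UTF-16 code units.
 * Returns the number of units written (no terminator is added),
 * or -1 if the input is malformed or does not fit in out_cap units. */
long utf8_label_to_utf16(const char *label, uint16_t *out, size_t out_cap)
{
    const unsigned char *s = (const unsigned char *)label;
    size_t n = 0;

    while (*s) {
        uint32_t cp;
        size_t len = utf8_decode(s, &cp);

        if (len == 0)
            return -1;
        if (cp < 0x10000) {
            if (n + 1 > out_cap)
                return -1;
            out[n++] = (uint16_t)cp;
        } else {
            /* Code points above the BMP become a surrogate pair. */
            if (n + 2 > out_cap)
                return -1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
        s += len;
    }
    return (long)n;
}
```

The GETFSLABEL side would be the inverse, so that userspace only ever
sees UTF-8 regardless of what is stored on disk.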

- Ted