Re: Long file names in VFAT broken with iocharset=utf8

From: Andrey Borzenkov
Date: Wed May 09 2007 - 06:27:42 EST


On Monday 07 May 2007, Roland Kuhn wrote:
> Hi!
>
> On 7 May 2007, at 20:27, OGAWA Hirofumi wrote:
> > Roland Kuhn <rkuhn@xxxxxxxxxxxxxxxxxxxxxxxxx> writes:
> >> PATH_MAX specifically counts _bytes_ not characters, so UTF-8 does
> >> not matter. ISTR that PATH_MAX was 256 at some point, but I just
> >> quickly grepped /usr/include and found various mention of 4096, so
> >> where's the central repository for this configuration item? A hard-
> >> coded value of 256 somewhere inside the kernel smells like a bug.
> >
> > There is a nasty issue here. FAT is limited by 255 unicode chars or
> > so.
> > So, we would need to count number of unicode chars of filename.
>
> No, we don't. At least not when looking at the POSIX spec, which
> explicitly mentions _bytes_ and _not_ unicode characters. So, to be
> on the safe side, FAT filesystems would need to support a NAME_MAX of
> roughly 6*255+3=1533 bytes (not to mention the hassles of forbidden
> sequences, etc.; do we need to count zero-width characters?)

How is this issue related to character *width* at all?

> and
> report it through pathconf() to userspace, then userspace could do
> with that whatever it liked.
>
> What happened to: "file names are just sequences of octets, excluding
> '/' and NUL"? Adding unicode parsing to the kernel is completely
> useless _and_ a big trouble maker.
>

Who speaks about unicode parsing? UCS2 - UTF-8 transformation does and
requires no parsing; this is simply conversion between on-disk and in-kernel
representation (like endian conversion). Anyway we are doing it now already;
how support for larger name length limit changes it?

-andrey

Attachment: pgp00000.pgp
Description: PGP signature