Re: JFS default behavior (was: UTF-8 in file systems?xfs/extfs/etc.)

From: Nicolas Mailhot
Date: Thu Feb 12 2004 - 11:52:04 EST


Not specifying the file name encoding (either per fs type, per partition
or per filename) is plain dangerous. It is not a userspace problem -
flash/hotplug disks move, users on the same system can have different
locales and try to share files, a user can change his locale to another
one (hear the screams of RH users forcibly converted to utf8 which had
to fix years of storage which filenames were suddenly borked)

See also the sun zip encoding bug - everyone uses zip files in Java, zip
authors thought a filename is "just a bunch of bytes" and didn't put
filename encoding info in the zip format, and now java zip handling goes
boom since numerous encodings are unicode-incompatible. It's slowly
getting its way to the top-25 most reported java bugs.

(of course as usual US users/coders are not hit and do not feel
concerned)

The only reason we got by with it so far is linux localisation was poor,
and systems didn't scale high enough to permit high number of users per
system (reducing locale collision risks)

The only reason we might get by in the future is everyone will be using
utf8.

But that's not a reason not to fix the core problem - I don't want to
spent hours fixing filenames next time someone comes up with a new
encoding. Please put valid encoding info somewhere or declare filenames
are utf-8 od utf-16 only - changing user locale should not corrupt old
data.

Cheers,

--
Nicolas Mailhot

Attachment: signature.asc
Description: Ceci est une partie de message=?ISO-8859-1?Q?num=E9riquement?= =?ISO-8859-1?Q?_sign=E9e?=