Re: UTF-8 filenames

From: Norman Diamond
Date: Sun Feb 22 2004 - 18:37:12 EST


Jamie Lokier replied to me:

> > Consider
> > converting all your ASCII filenames to UTF-16. Let everyone share the
> > short-term pain for the long-term gain. When you get everyone to agree on
> > UTF-16, it will be ugly, but it will be equal for everyone.
>
> UTF-8 is the only sane universal encoding in unix.

That's a bit beside the point. I was replying to the assertion that
everyone agreed to use UTF-8 (particularly for large character sets).

> UTF-16 is not an option;

Of course. Perhaps my use of reductio ad absurdum was unclear. I was
trying to show that UTF-8, despite its sanity, is not universally agreeable.
The actual reason is that it came late to the scene (around 20 years ago)
and is not backwards compatible. But to make the point, I compared it
with UTF-16, which is equally not universally agreeable.

> it's not POSIX compatible,

OK, UTF-8 has one fewer reason than UTF-16 for not being universally
agreeable. But the biggest reason still remains, as mentioned above.
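
To illustrate the POSIX compatibility point, here is a minimal Python sketch
(not from the original thread). It assumes that a single filename component
must contain neither a NUL byte nor a '/' byte, and checks which encodings
satisfy that. The helper name posix_safe and the sample names are
hypothetical, chosen only for illustration.

```python
def posix_safe(name: str, encoding: str) -> bool:
    """Return True if the encoded name has no NUL or '/' bytes.

    Assumption: 'name' is one filename component, so '/' is not allowed.
    """
    data = name.encode(encoding)
    return b"\x00" not in data and b"/" not in data


if __name__ == "__main__":
    # "readme" is plain ASCII; U+012F encodes to bytes 2f 01 in UTF-16LE,
    # so it contains a '/' byte even though it has no NUL byte.
    for name in ("readme", "\u012f"):
        for enc in ("utf-8", "utf-16-le"):
            encoded = name.encode(enc).hex()
            print(ascii(name), enc, encoded, posix_safe(name, enc))
```

Even a pure ASCII name fails the check in UTF-16, because every ASCII
character gains a zero byte, whereas UTF-8 leaves ASCII names byte-for-byte
unchanged.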

> > By the way, another subthread mentioned that stty puts some stuff in the
> > kernel that could be done in user space. In Unix systems the same is true
> > for IMEs, stty options specify the encoding of the output of an IME (e.g.
> > EUC-JP or SJIS, which then gets forwarded as input to shells, applications,
> > etc.), and whether a single backspace (or whatever character deletion
> > character) deletes an entire input character instead of just deleting a
> > single byte, etc. I keep forgetting to see if Linux has the same stty
> > options. I haven't needed to set them with stty because if I need to use a
> > different locale then I just open a new terminal emulator window using that
> > locale.
>
> Do you have a list or description of the specific stty options that
> are used?

Well, I thought I described them as I saw them used in Unix. I no longer
have access to machines running commercial Unix systems, but some of the
stty options worked the way I described. I have a feeling that System V
might have implemented them slightly differently from BSD-based systems, but
regardless, the same functionality was pretty much "universally" needed and
implemented.
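
The difference between deleting a byte and deleting a whole input character
can be sketched in a few lines of Python. This is only an illustration of
the behaviour described, not the code of any actual tty driver. It assumes
the line buffer holds UTF-8 (the Unix systems mentioned used EUC-JP or SJIS,
whose rules differ), and the function names are hypothetical.

```python
def erase_byte(buf: bytes) -> bytes:
    """Naive erase: drop exactly one byte from the end of the line buffer."""
    return buf[:-1]


def erase_char_utf8(buf: bytes) -> bytes:
    """Character-aware erase, assuming the buffer holds valid UTF-8.

    Skips trailing continuation bytes (bit pattern 10xxxxxx), then removes
    the lead byte of the final character as well.
    """
    end = len(buf)
    while end > 0 and (buf[end - 1] & 0xC0) == 0x80:
        end -= 1
    if end > 0:
        end -= 1
    return buf[:end]


if __name__ == "__main__":
    line = "ab\u65e5".encode("utf-8")  # U+65E5 occupies three bytes
    print(erase_byte(line))            # leaves a truncated sequence behind
    print(erase_char_utf8(line))       # removes the whole character
```

With the naive version, one backspace leaves two stray bytes of the last
character in the buffer, which is the problem those stty options were meant
to avoid.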

If you're asking whether I noticed similar stty options in Linux, I didn't
notice because of the reason mentioned (I just opened another terminal
emulator window using the locale that I temporarily needed). But I'll try
to remember to look next weekend. Sorry, I'm leaving for work in a minute
and can't look now.
