Re: UTF-8, OSTA-UDF [why?], Unicode, and miscellaneous gibberish

Alex Belits (abelits@phobos.illtel.denver.co.us)
Wed, 20 Aug 1997 07:59:30 -0700 (PDT)


On Wed, 20 Aug 1997, Peter Holzer wrote:

> > It should be possible to _choose_ mapping as the mount option, not
> >"UTF-8 or all filenames will be truncated to the first letter because
> >second one is zero".
>
> You are mixing up 16-Bit Unicode and UTF-8 here. In UTF-8, Unicode
> characters 0000 to 007f are mapped to single bytes with the same value.
> All other codes are mapped to multi-byte sequences where all bytes have
> the MSB set.

But if the only alternatives are UTF-8 or "no translation at all", then
only UTF-8 is actually usable -- taking a plain ASCII filename in the form
in which it is stored on NTFS (16-bit Unicode) produces a string that is
unsuitable for any string processing, because every second byte is zero.
IMHO, if one wants to support such a thing, replaceable name-translation
interfaces should be used, not hardcoded UTF-8.
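
To make the point concrete, here is a rough sketch, in Python for brevity
rather than kernel C. It is not taken from any actual filesystem driver:
the names TRANSLATORS, translate_name and c_string_view, and the mapping
names "none", "utf8" and "latin1", are all made up for illustration. It
assumes the on-disk name is 16-bit little-endian Unicode, and shows both
why the untranslated form breaks NUL-terminated string handling and what a
selectable translation could look like.

```python
# Illustrative sketch only; every name here is hypothetical.

def c_string_view(raw: bytes) -> bytes:
    """What NUL-terminated string code sees: the bytes before the first zero."""
    return raw.split(b"\x00", 1)[0]

def untranslated(name: str) -> bytes:
    # "No translation at all": hand back the 16-bit on-disk form unchanged.
    return name.encode("utf-16-le")

def to_utf8(name: str) -> bytes:
    # ASCII stays one byte per character; everything else becomes a
    # multi-byte sequence in which every byte has the MSB set.
    return name.encode("utf-8")

def to_latin1(name: str) -> bytes:
    # One possible 8-bit mapping; characters outside ISO 8859-1 become '?'.
    return name.encode("latin-1", errors="replace")

# The replaceable part: a table of translators, selected by name
# (for example from a mount option).
TRANSLATORS = {
    "none": untranslated,
    "utf8": to_utf8,
    "latin1": to_latin1,
}

def translate_name(on_disk: bytes, mapping: str) -> bytes:
    """Decode the assumed 16-bit on-disk name and apply the chosen mapping."""
    name = on_disk.decode("utf-16-le")
    return TRANSLATORS[mapping](name)

if __name__ == "__main__":
    on_disk = "README.TXT".encode("utf-16-le")
    for mapping in ("none", "utf8", "latin1"):
        out = translate_name(on_disk, mapping)
        print(mapping, out, c_string_view(out))
```

With "none", c_string_view() returns only the first letter, which is the
truncation complained about above; with either of the other two mappings
the whole name survives, and adding another charset means adding one entry
to the table.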

--
Alex