Unicode, kernels, Linux and NT

Stephen Williams (steve@icarus.icarus.com)
Fri, 17 Apr 1998 22:17:42 -0600


(I know I'm asking for trouble, but...)

> However, the last thing we want to do is implement utf-16 in the
> kernel because microsoft/apple says that they're going to someday.

True enough. Read on.

acahalan@cs.uml.edu said:
> Microsoft already did. Library support lets app developers choose to
> use or ignore that support as they desire. As time passes, more will
> choose to use the support.

The kernel-mode functions in Windows NT take UNICODE. Period. There is
no choice in the matter. And this is for entirely kernel matters like
the name of the device object.

I can (maybe) see it for logged messages, but for Pete's sake, this is
ridiculous. One person complained that C was too much of an
"assembly language" because of its character set/string handling. Well,
that may be true (I remember at IBM I did some kernel work in PascalVM--
actually worked pretty well) but frankly there is no reason to complain
about the expressiveness of C/C++ when doing kernel/device driver/embedded
device programming. I see no reason to mess it up with a character set
that makes C/C++ grumpy.

So...

Here is what I do for a linux driver to register a device:

if (register_chrdev(ucr_major, "ise", &ucr_ops)) {
        printk("ise: unable to get major number %d\n", ucr_major);
        return -EIO;
}

I even check for errors. "Good programmer, you get a programmer biscuit."

And under Windows NT for exactly the same device. (Notice the UNICODE
handling shit^h^h^htuff):

#define dev_name_str L"\\Device\\ise"
#define dev_link_str L"\\DosDevices\\ISE"

[...]

UNICODE_STRING dev_name, dev_link;
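/* sizeof on a wide literal counts bytes; divide to get characters. */
wchar_t dev_name_buf[sizeof(dev_name_str)/sizeof(wchar_t) + 10];
wchar_t dev_link_buf[sizeof(dev_link_str)/sizeof(wchar_t) + 10];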
UNICODE_STRING number;
wchar_t number_buf[10];

[...]

/* Make the kernel device name (\Device\ise0) from the prefix
and the current instance number. Lots of UNICODE cruft. */
dev_name.Buffer = dev_name_buf;
dev_name.MaximumLength = sizeof(dev_name_buf);
dev_name.Length = 0;
RtlZeroMemory(dev_name_buf, sizeof dev_name_buf);
RtlAppendUnicodeToString(&dev_name, dev_name_str);

number.Buffer = number_buf;
number.MaximumLength = sizeof(number_buf);
number.Length = 0;
RtlZeroMemory(number_buf, sizeof number_buf);
RtlIntegerToUnicodeString(num, 10, &number);

RtlAppendUnicodeStringToString(&dev_name, &number);

/* Make the kernel device object and initialize the instance
structure with the basics. */
status = IoCreateDevice(drv, sizeof(struct Instance), &dev_name,
FILE_DEVICE_ISE, 0, FALSE, &dev);
if (! NT_SUCCESS(status)) return status;

[...]

/* Make the device accessible to WIN32 by creating a link in
\DosDevices to this driver. */
dev_link.Buffer = dev_link_buf;
dev_link.MaximumLength = sizeof dev_link_buf;
dev_link.Length = 0;
RtlZeroMemory(dev_link_buf, sizeof dev_link_buf);
RtlAppendUnicodeToString(&dev_link, dev_link_str);

RtlAppendUnicodeStringToString(&dev_link, &number);
status = IoCreateSymbolicLink(&dev_link, &dev_name);
if (! NT_SUCCESS(status)) {
        printk("Failed to create device symbolic link.\n");
}
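
For anyone who has not had the pleasure: the reason for all the
Buffer/MaximumLength/Length setup above is that the string is counted,
not NUL-terminated, and the counts are in bytes rather than characters.
Here is a minimal user-space sketch of that idea in portable C. The
names (struct counted_wstr, cw_init, cw_append) are made up for
illustration and are not the NT API; wchar_t is also wider than NT's
16-bit characters on many systems, so treat this as a model only.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for a counted wide string.
   Both length fields are byte counts, not character counts. */
struct counted_wstr {
    unsigned short length;      /* bytes in use, terminator not counted */
    unsigned short max_length;  /* bytes available in buffer */
    wchar_t *buffer;
};

/* Attach a zeroed caller-supplied buffer to the string.
   Assumes buf_bytes fits in an unsigned short. */
static void cw_init(struct counted_wstr *s, wchar_t *buf, size_t buf_bytes)
{
    memset(buf, 0, buf_bytes);
    s->buffer = buf;
    s->length = 0;
    s->max_length = (unsigned short)buf_bytes;
}

/* Append a NUL-terminated wide string.
   Returns 0 on success, -1 if it does not fit. */
static int cw_append(struct counted_wstr *s, const wchar_t *src)
{
    size_t add_bytes = wcslen(src) * sizeof(wchar_t);

    if (add_bytes > (size_t)(s->max_length - s->length))
        return -1;
    memcpy((char *)s->buffer + s->length, src, add_bytes);
    s->length = (unsigned short)(s->length + add_bytes);
    return 0;
}
```

Every caller has to keep bytes and characters straight, which is exactly
the bookkeeping the driver code above is doing by hand.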

Now it is certainly true that a properly internationalized application
should *not* include code like the Linux example above, but then again
the NT example is not the least bit internationalized either. All I got
from the UNICODE_STRING data type is a lot of headaches. Will the fact
that I used a not-quite-ASCII string matter one whit to the Korean user?
I think not!
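
For comparison, building the same kind of per-instance name ("ise0",
"ise1", ...) with plain char strings takes one call. This is only a
sketch in ordinary C, not code from either driver; make_dev_name is a
hypothetical helper, and it is written against the standard library's
snprintf rather than any kernel facility.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<prefix><num>" into out.
   Returns 0 on success, -1 on error or if the result was truncated. */
static int make_dev_name(char *out, size_t out_size,
                         const char *prefix, unsigned num)
{
    int n = snprintf(out, out_size, "%s%u", prefix, num);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

A caller would do something like make_dev_name(name, sizeof name, "ise",
num) and check the return value, with no separate length fields to
initialize.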

So I wonder which version our European/Asian/Martian friends would rather
write?

-- 
Steve Williams                "The woods are lovely, dark and deep.
steve@icarus.com              But I have promises to keep,
steve@picturel.com            and lines to code before I sleep,
http://www.picturel.com       And lines to code before I sleep."
