Re: [PATCH] Smackv10: Smack rules grammar + their stateful parser

From: Tetsuo Handa
Date: Tue Nov 06 2007 - 10:27:23 EST


Hello.

Adrian Bunk wrote:
> The problem is that your code matches one byte, not one character.
>
> More or less all userspace programs handle multi-byte UTF-8 characters
> just fine without bothering the user with the fact whether a character
> consists of one or more bytes.
I understood what you are saying.

You are saying "a character" does not always consist of one byte,
while I'm saying "a character" does always consist of one byte.

Yes, some userspace programs don't use strcmp()
since strcmp() can't handle some encodings like UTF-16.
But the kernel uses strcmp()
since the VFS related functions can't handle encodings
which contains '\0' in the pathname.
VFS related functions assume that '\0' is end-of-string marker.

So, from the point of view of userland programs,
'\?' should match to a single character (which depends on encoding).
But from the point of view of the kernel,
'\?' should match to a single byte (which doesn't depend on encoding).
Handling all possible encoding in the kernel is too difficult to implement.
I'll continue using '\?' matches to a single byte.

> And users will try to use this \? for matching one character when
> writing a pattern that denies access.
Yes, but since this string is handled by the *kernel*,
I want users follow point of view of the kernel.

Thanks.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/