Re: UTF-8 and case-insensitivity
From: Andy Lutomirski
Date: Tue Feb 17 2004 - 20:12:29 EST
Linus Torvalds wrote:
int magic_open(
/* Input arguments */
const char *pathname,
unsigned long flags,
mode_t mode,
/* output arguments */
int *fd,
struct stat *st,
int *successful_path_length);
ie the system call would:
- look up as far into the pathname (using _exact_ lookup) as possible
- return the error code of the last failure
- the "flags" could be extended so that you can specify that you mustn't
traverse ".." or symlinks (ie those would count as failures)
but also:
- fill in the "struct stat" information for the last _successful_
pathname component.
- fill in the "fd" with a fd of the last _successful_ pathname component.
- tell how much of the pathname it could traverse.
Aside from just case-insensitivity, I imagine this could give lots of other
benefits:
- file servers that don't want to follow symlinks can do it quickly.
- Apache could serve things like http://www.foo.com/a/b/c/d.php/e/f/g a lot
faster.
- a flag to avoid traversing mountpoints could help someone
- a flag for root to see _through_ mountpoints would make it possible to clean
up initramfs and such that got mounted over, or to do other useful and currently
impossible tasks. (e.g. I could see what's under my devfs mount...)
I would be nice to see this added even if it's not the perfect solution for samba :)
BTW, here's a thought for solving samba's negative lookup problem:
int ugly_stat(char *pattern, struct stat *st, char *match_out)
Pattern would be some description of what the filename should look like.
Something like:
- pattern is an array of slash-delimited groups of characters separated by nulls
and terminated by two nulls. For example, ugly_stat("F/f\0O/o\0O/o\0\0", ...)
finds a file called foo, case-insensitively in English, while
ugly_stat("F\0i\0l\0e\011/22/33") finds "File" followed by either 11, 22, or 33.
- the dcache problem is easy: don't use it. All Andrew wants (I think) is proof
that there is no such file or the name if there is one. Samba can cache it
itself; I don't think the kernel should involve itself in trying to cache this.
- ugly_stat does not traverse directories -- that's why the slash trick is safe.
- st gets the stat data, and match_out gets the filename if any
- if there are multiple matches, one is arbitrarily selected.
If the file-system doesn't have specific support for this, then either VFS or
the caller could emulate it (probably VFS -- it would avoid lots of syscalls).
Would ugly_stat + magic_open be sufficient?
--Andy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/