Re: explicit dcache <-> user-space cache coherency, sys_mark_dir_clean(), O_CLEAN

From: Jamie Lokier
Date: Fri Feb 20 2004 - 12:34:40 EST


Linus Torvalds wrote:
> > One 'user-space cache is valid/clean' bit should be enough - where all
> > non-Samba accesses clear the 'valid bit', and Samba sets the bit
> > manually.
>
> Yes, that, together with O_CLEAN would work.
>
> The problem is that you'd still need other system calls: it's not like
> open(O_CREAT) is the only way to create a file. So you'd have to add
> versions of "link()" etc, which means that O_CLEAN is really pretty
> pointless, and you might as well just do it in a new system call.
>
> Your version is also not multi-threaded: you can never allow more than one
> thread doing the "sys_mark_dir_clean()". That was the reason for having
> two bits: so that anybody can do a lookup in parallell, and only the
> "filldir" part needs to be serialized.

How about this: we clean up dnotify, so it can be used for
user<->kernel dcache coherency, efficiently, and implement O_CLEAN in
a different way, which works with multiple threads, without extra
system calls for rename/link etc. and where the scope of "this
process/thread/whatever doesn't make a directory unclean" is flexible
enough for Samba, multi-threaded file viewers, maildir mail trackers
and so on.

Ok, marketing aside:

1. open() a directory
2. fcntl() for dnotify on the directory, as with the current interface,
but adding a flag call DN_POLL.

Normally dnotify sends a queued signal with each event. It can listen
in one-short or multiple event modes. I'm surprised a signal was ever
used, because we have a perfectly good file descriptor to read from.

So DN_POLL means "register this dnotify but don't send a signal".
Instead, you'll call fcntl() again to read the dnotify status bits.

For Samba, the dnotify is equivalent to sys_mark_dir_clean().
Samba's dcache works like this, following Ingo's logic:

repeat:
if (fcntl(dirfd, F_NOTIFY, DN_CREATE | DN_RENAME | DN_POLL) == 0) {
... pure user-space fast path, use Samba dcache ...
return;
}
... fill Samba dcache ...
readdir() loop
goto repeat;

(Note that DN_DELETE isn't needed: the negative userspace dcache
doesn't care about deletions).

See, that is obviously equivalent and uses an obvious (and tiny)
improvement to the existing dnotify feature.

The argument that O_CLEAN _requires_ sys_mark_dir_clean() is obviously
bogus: if O_CLEAN will abort when _this_ process/thread/whatever has
the directory marked clean, than it can just as easily abort when
this process/thread/whatever has a dnotify listening.

We might, however, like to add a flag DN_CLEAN so that O_CLEAN only
aborts for dnotifies with that flag set. Just to stay friendly with
libraries.

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/