Hi,
On Thu, 31 Aug 2000, Alexander Viro wrote:
> Go ahead, write it. IMNSHO it's going to be much more complicated and
> race-prone, but code talks. If you will manage to write it in clear and
> race-free way - fine. Frankly, I don't believe that it's doable.
It will be insofar more complicated, as I want to use a more complex state
machine than "locked <-> unlocked", on the other hand I can avoid such
funny constructions as triple_down() and obscure locking order rules.
At any time the object will be either locked or in a well defined state,
where at any time only a single object is locked by a thread. (I hope some
pseudo code does for the beginning, too?) Most namespace operation work
simply like a semaphore:
restart:
lock(dentry);
if (dentry is busy) {
unlock(dentry);
sleep();
goto restart;
}
dentry->state = busy;
unlock(dentry);
If the operation is finished, the state is reset and everyone sleeping is
woken up. Ok, let's come to the most interesting operation - rename():
restart:
lock(olddentry);
if (olddentry is busy) {
unlock(olddentry);
sleep();
goto restart;
}
olddentry->state = moving;
unlock(olddentry);
restart2:
lock(newdentry);
if (newdentry->state == moving) {
lock(renamelock);
if (olddentry->state == deleted) {
unlock(renamelock);
unlock(newdentry);
sleep();
goto restart;
}
newdentry->state = deleted;
unlock(renamelock);
} else if (newdentry is busy) {
unlock(newdentry);
sleep();
goto restart2;
} else
newdentry->state = deleted;
unlock(newdentry);
if (!rename_valid(olddentry, newdentry)) {
lock(newdentry);
newdentry->state = idle;
unlock(renamelock);
lock(olddentry);
olddentry->state = idle;
unlock(olddentry);
wakeup_sleepers();
return;
}
if (newdentry exists)
unlink(newdentry);
do_rename(olddentry, newdentry);
lock(newdentry);
newdentry->state = idle;
unlock(renamelock);
lock(olddentry);
olddentry->state = deleted;
unlock(olddentry);
wakeup_sleepers();
return;
Note that I don't touch any inode here, everything happens in the dcache.
That means I move the complete inode locking into the fs, all I do here is
to make sure, that while operation("foo") is busy, no other operation will
use "foo".
IMO this should work, I tried it with a rename("foo", "bar") and
rename("bar", "foo"):
case 1: one rename gets both dentries busy, the other rename will wait
till it's finished.
case 2: both mark the old dentry as moving and find the new dentry also
moving. To make the rename atomic the global rename lock is needed, one
rename will find the old dentry isn't moving anymore and has to restart
and wait, the other rename will complete.
Other operations will keep only one dentry busy, so that I don't a see
problem here. If you don't find any major problem here, I'm going to try
this. Since if this works, it will have some other advantages:
- a user space fs will become possible, that can't even deadlock the
system. The first restart loop can be easily made interruptable, so it can
be safely killed. (I don't really want to know how a
triple_down_interruptable() looks, not to mention the other three locks
(+ BKL) taken during a rename.)
- I can imagine better support for hfs. It can access the other fork
without excessive locking (I think currently it doesn't even tries to).
The order in which the forks can be created can change then too.
> BTW, I really wonder what kind of locks are you going to have on _blocks_
> (you've mentioned that, unless I've misparsed what you've said). IMO that
> way lies the horror, but hey, code talks.
I thought about catching a bread, but while thinking about it, there
should also be other ways. But that's fs specific, let's concentrate on
the generic part first.
> You claim that it's doable. I seriously doubt it. Nobody knows your ideas
> better than you do, so... come on, demonstrate the patch.
I think the above example should do basically the same as some nothing
doing patch within affs.
I hope that example shows two important ideas (no idea if they will save
the world, but I'm willing to learn):
- I use the dcache instead of the inode to synchronize namespace
operation, what IMO makes quite a lot of sense, since it represents our
(cached) representation of the fs.
- Using states instead of a semaphore, makes it easily possible to detect
e.g. a rename loop.
bye, Roman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Thu Sep 07 2000 - 21:00:12 EST