Re: (reiserfs) Re: reiserfs and knfsd and NFSv4 and volatile file handles

From: Chris Mason (mason@suse.com)
Date: Mon Mar 20 2000 - 13:32:54 EST


[ NFS list removed from the cc ]

On Sat, 18 Mar 2000, Hans Reiser wrote:

> I leave the below for Alexei and Vladimir to answer in detail. Generally
> speaking, we can only do clean effective SMP after implementing per buffer
> seals. I see that as unlikely to make it into 2.4 due to it requiring a per
> buffer/page fs specific struct, but it is actively being coded by Zarochentcev
> and Roma. If 2.4 is slow enough in coming out, maybe we will submit it just in
> case Linus will take it. Our SMP will be poor until we have it.
>

Our reads are hit the most by this, as ext2 can read without the big
kernel lock held, and we can't. But, Hans, I thought we had some unused
bits in the block_head struct that could be used to make the seals. That
would allow us to fully thread the tree access, without changing the
buffer head or page structs.

> > Umm... I'd still like to hear a description of your internal locking. In
> > particular, what do you lock/release upon reiserfs_find_entry()/pathrelse()?
> > When do you rebalance the tree? How do you do serialization between that
> > and things a-la write_inode()/readdir()/lookup()? Currently locking is
> > masked by the VFS one and that's one of the reasons why I want to see
> > cleaned variant.
>
Well, it is a bit ugly. The tree is balanced any time things are added,
removed, or resized. Sometimes this doesn't actually require shifting
data around; we have early exits for that sort of thing. Vladimir, please
correct me if I'm wrong here:

We are acting inside the big kernel lock, and we build a struct of all the
nodes that need to be a part of the balance. If we schedule while
building that struct, we check to see if another balance happened during
the schedule (via a generation counter). Once we've gathered the
information required, the balance is done in one big schedule-free loop.
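
A minimal userspace sketch of the gather-and-recheck pattern described
above. Every name here (struct tree, struct balance_plan,
gather_one_node, and so on) is hypothetical and is not the real reiserfs
code; sleeping is only simulated by a counter, and the big kernel lock is
assumed to be held around the whole thing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the real reiserfs structures. */
struct tree {
    unsigned long generation; /* bumped by every completed balance */
    int simulated_races;      /* demo only: balances that sneak in while we "sleep" */
};

struct balance_plan {
    unsigned long seen_generation; /* generation when gathering (re)started */
    int nodes_gathered;
};

/*
 * Models one gathering step that may schedule. Returns true if it slept.
 * While it sleeps, another balance may complete and bump the generation.
 */
static bool gather_one_node(struct tree *t, int index)
{
    (void)index;
    if (t->simulated_races > 0) {
        t->simulated_races--;
        t->generation++; /* someone else balanced while we slept */
        return true;
    }
    return false;
}

/*
 * Gather `needed` nodes. If a step slept and the generation moved, the
 * partial plan is stale, so throw it away and start over.
 * Returns the number of restarts.
 */
static int gather_for_balance(struct tree *t, struct balance_plan *plan,
                              int needed)
{
    int restarts = 0;

restart:
    plan->seen_generation = t->generation;
    plan->nodes_gathered = 0;
    for (int i = 0; i < needed; i++) {
        bool slept = gather_one_node(t, i);
        if (slept && t->generation != plan->seen_generation) {
            restarts++;
            goto restart;
        }
        plan->nodes_gathered++;
    }
    return restarts;
}

/* The balance itself must not sleep; it ends by bumping the generation. */
static void do_balance(struct tree *t, struct balance_plan *plan)
{
    (void)plan; /* a real balance would shift items between the gathered nodes */
    t->generation++;
}
```

The counter only tells you that some balance happened, not whether it
touched the nodes you care about, so every race costs a full restart of
the gathering phase.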

Simply put, this sucks. I would really like to see the per-buffer locks,
either with free bits in our on-disk structures, or with a locking bit in
the buffer head (the existing one, or one we add). Then nodes could be
locked in tree depth order, and our balancing code could be much
cleaner.
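
A rough sketch of what locking in tree depth order could look like, in
plain userspace C. The struct node, the NODE_LOCKED bit, and the helper
names are all hypothetical; the choice of root-first order with the block
number as a tie-breaker is an assumption, and a real implementation would
sleep on a contended node rather than back out as this single-threaded
demo does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NODE_LOCKED 0x1u /* hypothetical per-node lock bit */

/* Hypothetical stand-in for a tree node's in-memory header. */
struct node {
    unsigned int depth;    /* 0 = root, grows toward the leaves */
    unsigned long blocknr; /* tie-breaker for nodes at the same depth */
    unsigned int flags;
};

/* Total order on nodes: shallower first, then lower block number. */
static int cmp_lock_order(const void *a, const void *b)
{
    const struct node *x = *(const struct node *const *)a;
    const struct node *y = *(const struct node *const *)b;

    if (x->depth != y->depth)
        return x->depth < y->depth ? -1 : 1;
    if (x->blocknr != y->blocknr)
        return x->blocknr < y->blocknr ? -1 : 1;
    return 0;
}

/*
 * Sort the set into lock order, then take each lock bit in turn.
 * If a node is already locked, release what we took and report failure.
 */
static bool lock_nodes_in_order(struct node **nodes, size_t n)
{
    qsort(nodes, n, sizeof(*nodes), cmp_lock_order);
    for (size_t i = 0; i < n; i++) {
        if (nodes[i]->flags & NODE_LOCKED) {
            while (i-- > 0)
                nodes[i]->flags &= ~NODE_LOCKED;
            return false;
        }
        nodes[i]->flags |= NODE_LOCKED;
    }
    return true;
}

/* Release in reverse order of acquisition. */
static void unlock_nodes(struct node **nodes, size_t n)
{
    for (size_t i = n; i-- > 0; )
        nodes[i]->flags &= ~NODE_LOCKED;
}
```

Because every locker sorts its set the same way, two balances that
overlap can never each hold a node the other is waiting for, which is
what would let the tree be threaded without the big kernel lock.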

-chris
