Re: [PATCH 2/2] fat (exportfs): reconnect file handles to evictedinodes/dentries

From: Steve Magnani
Date: Wed Jul 04 2012 - 14:11:52 EST


OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx> writes:
> "Steven J. Magnani" <steve@xxxxxxxxxxxxxxx> writes:
>
> > This patch adds code to support reconstruction of evicted inodes.
> > We walk the on-disk structures where necessary to fill in information
> > not available in the NFS file handle.
> >
> > One important point is that when reconstructing an inode, in order to
> > avoid the *client* declaring ESTALE we have to ensure that the NFS
> > file handle of the reconstruction is identical to that of the
> > original.

> We would like to add the comment of [0/2] explanation here. Here is
> missing the explanation of the problem.

OK

> And the codes of nfs support became larger, and I think we should give
> the place for it. I.e. we would like to add new file (maybe, export.c,
> nfs.c, or something), move nfs code into it.
>
> With it, we don't need to add "NFS ..." annotation in the comment, and
> it would make more clear.

Sure.

> And can you add explanation the test of this? What were tested?

I set up a memory-limited virtual machine with a 2 GB FAT partition
containing a kernel tree (~770 MB, ~40000 files, 9 levels) and did some
'cp -r' and 'ls -lR' operations on it, some overlapping, some not.

> > +int fat_get_name(struct dentry *parent, char *name,
> > + struct dentry *child)
> > +{
> > + struct super_block *sb = parent->d_inode->i_sb;
> > +
> > + loff_t child_loc = MSDOS_I(child->d_inode)->i_pos;
> > + int err;
> > +
> > + lock_super(sb);
> > +
> > + if (child_loc) {
> > + err = fat_name_for_ipos(parent->d_inode, name, NAME_MAX+1,
> > + child_loc);
> > + } else {
> > + child_loc = MSDOS_I(child->d_inode)->i_logstart;
> > + err = fat_name_for_logstart(parent->d_inode, name,
NAME_MAX+1,
> > + child_loc);
> > + }
> > +
> > + unlock_super(sb);
> > + return err;
> > +}
>
> Please don't add new lock_super() usage if it is not necessary. Almost
> all of lock_super() just replaced lock_kernel() usage. It rather should
> be removed in future. Probably, this should use inode->i_mutex
> instead.

I will look into this. My concern was freezing the filesystem while we
walk the on-disk structures. Also I developed this patch against 2.6.35
(the Bad Old BKL days) and ported it forward to 3.5.

> BTW, the above issue is same with all of directory read.
>
> And although this is using i_pos, is there no possibility to be passed
> the detached inode (i.e. open but unlinked inode, i_pos == 0)?

It is possible, that's why I added code to fall back to using logstart.

I may yet rip out the get_name code. The testing I did before posting the
patch seemed to indicate that it was needed - I saw ESTALE errors without
get_name support that I did not see with it present. But I've been
digging into this some more and I think that was just a coincidence;
probably I just generated more extreme memory pressure when testing
without get_name. I should know more tomorrow.

Thanks for the quick feedback.
Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/