Re: [PATCH] fs: ext4: inode->i_generation not assigned 0.

From: NeilBrown
Date: Wed Jul 05 2017 - 21:09:04 EST


On Wed, Jul 05 2017, J. Bruce Fields wrote:

> On Wed, Jul 05, 2017 at 12:19:33PM -0700, Darrick J. Wong wrote:
>> On Tue, Jul 04, 2017 at 09:15:34PM -0400, J. Bruce Fields wrote:
>> > On Mon, Jul 03, 2017 at 09:04:46PM -0700, Darrick J. Wong wrote:
>> > > On Thu, Jun 29, 2017 at 02:50:22PM -0400, J. Bruce Fields wrote:
>> > > > On Thu, Jun 29, 2017 at 02:30:53PM -0400, J. Bruce Fields wrote:
>> > > > > On Thu, Jun 29, 2017 at 10:25:28AM -0700, Darrick J. Wong wrote:
>> > > > > > Was there ever a version of NFS (or more generally callers of the
>> > > > > > exportfs code) that couldn't deal with i_generation in the file handle,
>> > > > > > and therefore we invented this generation hack to work around the loss
>> > > > > > of the generation information?
>> > > > > >
>> > > > > > There's a comment in xfs_fs_encode_fh about not supporting 64bit inodes
>> > > > > > with subtree_check (which seems to require one ino/gen pair for the file
>> > > > > > and a second pair for the file's parent) on NFSv2 because v2 doesn't
>> > > > > > provide enough space for all the file handle information, but that's the
>> > > > > > furthest I got with lazy-mining the git history. :)
>> > > > >
>> > > > > There's a comment in fs/ext4/super.c:ext4_nfs_get_inode
>> > > > >
>> > > > > * Currently we don't know the generation for parent directory, so
>> > > > > * a generation of 0 means "accept any"
>> > > > >
>> > > > > But I don't see that used.
>> > > > >
>> > > > > It was used once upon a time; I see it actually used in old 2.5 code in
>> > > > > nfsd_get_dentry. Hm.
>> > > >
>> > > > Oh, maybe it's here in fs/libfs.c:generic_fh_to_parent:
>> > > >
>> > > > 	switch (fh_type) {
>> > > > 	case FILEID_INO32_GEN_PARENT:
>> > > > 		inode = get_inode(sb, fid->i32.parent_ino,
>> > > > 				  (fh_len > 3 ? fid->i32.parent_gen : 0));
>> > > > 		break;
>> > > > 	}
>> > > >
>> > > > I'm not sure under what conditions that filehandle encoding is used.
>> > >
>> > > The best guess I can come up with is the old nfs_fhbase_old style handles,
>> > > which (afaict) do not carry parent i_generation?
>> >
>> > Yeah, I just couldn't tell in the time I looked whether they could still
>> > be handed out.
>> >
>> > If not, then the only way they'd still be used is if a client had a
>> > server continually mounted while the server was upgraded from a kernel
>> > that still handed out the old filehandle.
>> >
>> > So if they haven't been given out for long enough it's possible nobody
>> > would notice if we dropped support.
>> >
>> > But, I didn't get far enough to figure that out.
>>
>> Hmm, so looking back through prehistory, Linux prior to 2.3.51 (11 March
>> 2000) gave out the old dentry style fhandles. After that, the kernel
>> only gave out the new style handles that we still use today. In 2.4.6
>> (4 July 2001) the behavior was modified again to chain handle types,
>> i.e. if the client passed in an old style handle then it would get
>> another old style handle back. The changelog for -pre9 says that this
>> was done for compatibility reasons.
>
> Yeah, you're supposed to be able to reboot your NFS server for a kernel
> upgrade without your client applications experiencing anything worse
> than a temporary hang while you wait for the server to come back up.
> So, changing the filehandle format and returning ESTALE to everyone
> would be unpopular.
>
>> So, what's the probability that there are clients out there that started
>> talking to a 2.2-based knfsd and will now want to talk to a modern 4.13
>> kernel seventeen years later?
>
> I think it's unlikely enough that we could drop that code; cc'ing Neil
> in case we overlooked anything.

While I remain a fan of maintaining forward/backward compatibility as
much as possible, 15 years is probably more than I can realistically
hope for.
As you say, a generation number of '0' is only special when old-style
file handles are used, with the "subtree_check" export option. They are
unlikely to have been used recently.
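
To make the special case concrete, here is a minimal userspace sketch of
the "generation 0 means accept any" rule. It is an illustration only, not
the kernel code: the function name check_generation is hypothetical, and
the only assumption carried over from the thread is that a requested
generation of zero skips the comparison while any other mismatch is
reported as a stale handle.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): compare the generation
 * requested by a file handle against the generation stored in the inode.
 * A requested generation of 0 is treated as "accept any", which is the
 * special case discussed in this thread.
 * Returns 0 if the handle is acceptable, -ESTALE otherwise.
 */
static int check_generation(uint32_t requested_gen, uint32_t inode_gen)
{
	if (requested_gen != 0 && requested_gen != inode_gen)
		return -ESTALE;
	return 0;
}
```

Removing the special handling would amount to dropping the
"requested_gen != 0" test, so that a zero generation is compared like any
other value.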

However, I note that include/linux/exportfs.h says:

	/*
	 * 32bit inode number, 32 bit generation number,
	 * 32 bit parent directory inode number.
	 */
	FILEID_INO32_GEN_PARENT = 2,

This could be seen as misleading.
Some code that reports that fid_type also includes the parent directory's
generation number. Some other code (cephfs, squashfs) doesn't even include
the generation number for the inode (which is OK for squashfs as it is
read-only). I could find no code that matches this documentation.
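
As a rough illustration of the mismatch, the sketch below models a
four-word handle in userspace and shows how the parent generation can be
taken from the handle only when the handle is long enough, falling back to
0 otherwise (the same pattern as the generic_fh_to_parent snippet quoted
earlier). The struct name fid_i32 and the helper parent_gen_from_fh are
hypothetical stand-ins for illustration, not kernel definitions.

```c
#include <stdint.h>

/*
 * Userspace stand-in for a 32-bit ino/gen file handle body.
 * Field names are chosen for this sketch; fh_len below counts
 * 32-bit words actually present in the handle.
 */
struct fid_i32 {
	uint32_t ino;        /* inode number of the file */
	uint32_t gen;        /* generation of the file */
	uint32_t parent_ino; /* inode number of the parent directory */
	uint32_t parent_gen; /* generation of the parent, if present */
};

/*
 * Hypothetical helper: return the parent generation carried by the
 * handle, or 0 when the handle holds only three words and therefore
 * has no parent generation at all.
 */
static uint32_t parent_gen_from_fh(const struct fid_i32 *fid, int fh_len)
{
	return fh_len > 3 ? fid->parent_gen : 0;
}
```

A three-word handle matches the comment in exportfs.h; a four-word handle
is what code that also stores the parent generation would produce.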

I was never a fan of having generic fid_types. Except for 0 and 255,
these numbers are generated and interpreted by individual filesystems,
and there is no reason that they should agree on the interpretation (and
as we see here, they don't).

But for the main point of your question: I see no problem with removing
nfs_fhbase_old and related code, and that includes the special handling
of generation number zero.

Thanks,
NeilBrown

>
>> (Do nfs handles persist across client restarts/remounts?)
>
> No.
>
> (Well, with maybe a couple of exceptions (fscache and persistent NFSv4
> delegations) but neither seems relevant here.)
>
> --b.
