Re: [PATCH 0/7] overlay filesystem: request for inclusion

From: Michal Suchanek
Date: Thu Jun 16 2011 - 06:35:51 EST


On 16 June 2011 04:43, J. R. Okajima <hooanon05@xxxxxxxxxxx> wrote:
>
> Michal Suchanek:
>> This is generally not possible in solutions that don't reserve any
>> filenames.
>>
>> However, it should be possible to create whiteout of a non-existent
>> entry in a directory while it is locked without affecting userspace.
>
> Actually aufs generates a doubly whiteouted unique name dynamically for
> the target dir. For instance, when rmdir("dirA") aufs does,
> - lock i_mutex of the parent dir of dirA on the real fs
> - some verifications of the parent-child relationship
> - some tests whether we can do rmdir
> - create whiteout for dirA
> - rename dirA to .wh..wh.XXXXXXXX (random value in hex), after making

Probably swap the two above, you can't make a whiteout in presence of
the directory, right?
Anyway, you could just mark dirA as whiteout and remove any whiteouts
contained in it asynchronously, and only jump through these hoops when
trying to create a new entry in place of non-empty whiteout, or sync
on emptying the old whiteout before making a new entry.

>   sure the name doesn't exist
> - unlock the parent dir
> - return to VFS
> And then the async workqueue removes the .wh..wh.XXXXXXXX dir with some
> whiteouts under it.
>
> It means the temporary whiteout name is,
> - always unique
> - always hidden (from users), even if it remains accidentally
> So even if an error happens in the async work, it doesn't matter.

Yes, it can only cause pollution with whiteouts unrelated to any files
that ever existed which is not too much of an issue unless people want
to add random stuff to the lower layer and see it in the union when
they reconstruct it again.
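
For illustration only, here is a rough userspace sketch in Python of the
sequence described above. It is not aufs code: the names hide_dir, reap and
visible are made up, and the real thing runs in the kernel with the parent's
i_mutex held, which this does not model. It renames first and creates the
whiteout second, as I suggested.

```python
import os
import secrets
import shutil
import tempfile

WH_PREFIX = ".wh."        # whiteout prefix, as in the names quoted above
TMP_PREFIX = ".wh..wh."   # doubly whiteouted temporary name


def hide_dir(parent, name):
    """Rename parent/name to a unique hidden name and leave a whiteout.

    Assumption: the caller serializes access to `parent`; nothing here
    closes the window between the existence check and the rename.
    """
    while True:
        tmp = TMP_PREFIX + secrets.token_hex(4)  # 8 random hex digits
        if not os.path.lexists(os.path.join(parent, tmp)):
            break
    os.rename(os.path.join(parent, name), os.path.join(parent, tmp))
    # Empty marker file standing in for the whiteout of `name`.
    with open(os.path.join(parent, WH_PREFIX + name), "x"):
        pass
    return tmp


def reap(parent, tmp):
    """What the async work would do later: drop the hidden directory."""
    shutil.rmtree(os.path.join(parent, tmp))


def visible(parent):
    """Names a user of the union would see from this layer alone."""
    return sorted(n for n in os.listdir(parent)
                  if not n.startswith(WH_PREFIX))
```

If reap() never runs, visible() still reports nothing for the temporary
name, which is the "always hidden" property mentioned above.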

>
> Additionally there is a userspace script called "auchk" which is like
> fsck for real fs. auchk script checks the logical consistency on the
> (writable) real fs, and removes the illegal whiteouts, leftover
> pseudo-links, and leftover temp files.
>
>
>> As an alternative way to perform atomic renames I would suggest
>> "fallthrough symlinks". If you want to rename an entry which is
>
> Symlink?
> Is it a different thing from DCACHE_FALLTHRU in UnionMount?

Yes, the fallthru in unionmount only says "look below here", it cannot
point to a different place in the filesystem.

> I am afraid a special symlink is fragile or dangerous.
> Its special meaning is valid in inner union world only, is it? If

It is only valid when in the upper layer of a union. However, so is
whiteout, and so are files that were visible in the union but are not
visible in the top layer if examined separately, outside of the union.

It must be accepted that the top layer is different from the union,
otherwise you want a copy, not a union.

> something in outer world gets changed, we may not follow the symlink
> anymore or follow something different unexpectedly. Is it acceptable?

That's the whole idea behind symlinks, and also unions which implicitly
link the lower layer into the upper to present the result as a single
directory tree.

Anyway, the motivation behind the "fallthru symlink" is that you need
not copy-up on seemingly trivial operations like rename, touch, etc.
which both makes them more efficient and easier to get atomic. As I
understand it copy-up is the operation that causes the most issues and
with "fallthru symlinks" you need it only for operations that are
expected to modify something non-trivially.
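
To make the idea concrete, a toy model in Python, with each layer as a dict
mapping name to contents. Everything here (WHITEOUT, Fallthru, lookup,
rename) is hypothetical naming for the sake of the sketch and not an
existing kernel interface. The point is only that rename touches the upper
layer and never copies data.

```python
WHITEOUT = object()  # upper-layer marker: the name is deleted in the union


class Fallthru:
    """Hypothetical upper-layer entry pointing at a lower-layer name."""

    def __init__(self, target):
        self.target = target


def lookup(upper, lower, name):
    """Resolve a name in the union; returns contents, or None if absent."""
    if name in upper:
        entry = upper[name]
        if entry is WHITEOUT:
            return None
        if isinstance(entry, Fallthru):
            # Follow the link straight into the lower layer.
            return lower.get(entry.target)
        return entry
    return lower.get(name)


def rename(upper, lower, old, new):
    """Rename without copy-up while the data still lives in the lower layer."""
    entry = upper.get(old)
    if entry is WHITEOUT or (entry is None and old not in lower):
        raise FileNotFoundError(old)
    if entry is None:
        entry = Fallthru(old)  # lower-only file: record a link, copy nothing
    upper[new] = entry
    if old in lower:
        upper[old] = WHITEOUT  # hide the lower entry under its old name
    else:
        del upper[old]
```

Note that a later change to the lower file shows through the link, which is
exactly the behaviour being questioned in the quoted text.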

Obviously, this is not so nice for zero-sized files, but they should be
handled the same way for consistency, I guess. Also, metadata that can be
conveniently recorded on the fallthru entry would make touch fast but
would hide possible later updates to the lower layer, so it might not be
a good solution for all use cases. For throwaway tmpfs, however, any
optimization counts.

Seriously, the overlayfs documentation says it can have opaque directories,
but I don't see what they would be used for. There is no way to turn a
directory opaque with normal userspace operations, afaict.
It has no explicit fallthrus, at least none documented, so to have any
level of consistency it should always check the lower layer, because it
can grow some new directories when the union is deconstructed, modified
offline, and reconstructed (which is a supported use case according to
the docs).
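
As I read the docs, the difference comes down to something like the
following Python toy, where merged_listing is a made-up name and a
directory is just a set of entry names. It is a sketch of my understanding,
not of the overlayfs code.

```python
def merged_listing(upper_names, lower_names, opaque=False):
    """List one directory of the union.

    upper_names, lower_names: entry names present in each layer.
    opaque: if True, the lower layer is ignored for this directory.
    """
    if opaque:
        return sorted(upper_names)
    # Non-opaque: every listing consults the lower layer, so entries added
    # to it offline appear once the union is put back together.
    return sorted(set(upper_names) | set(lower_names))
```

With opaque left False, a directory added to the lower layer while the
union was taken apart shows up in the listing afterwards; with opaque set
it never does.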

Thanks

Michal