Re: Extensions to HFS filesystem

csmith@stoneboro.uucp.cirr.com
Tue, 14 May 96 04:15 CDT


In article <318D6B29.167EB0E7@sccm.stanford.edu>
"Paul H. Hargrove" <hargrove@sccm.stanford.edu> writes:

> There is only one problem that I see remaining, and that is the task of
> keeping an accurate count of the number of links to these files. Counting
> links that are created and destroyed under Linux is no problem. However,
> if the last link is deleted under MacOS then the file continues to use
> disk space until some sort of garbage collection is done (in user space,
> I hope). The fact is, however, that representing a hard link as a
> pseudo-symlink to a hidden file means that copying the link under MacOS
> would create an extra link. This means that the kernel can't know when it
> is deleting the last link unless the kernel actually does the garbage
> collection (i.e. link counting) when the fs is mounted. I really don't
> want to do the counting in the kernel.

I think it would work to have the link info contain a self pointer, so
that it records something like the CNID of both the link (pointer) and
the hidden file (pointee).

I don't know if that makes sense, since I don't know anything about HFS,
but what I mean is that the symbolic link that implements the hard link
can be of the form "12345->/hfs/.hidden/actualfile", where it gives both
the "inode number" of the link and the name (or whatever) of the real
file. Copying such a link under MacOS produces a copy, but then Linux
can know not to believe it, since the "inode number" of the copy will not
match the "inode number" recorded in the copied link.
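
As a rough sketch of that check (Python, purely for illustration; the
function names and the "id->target" record layout are assumptions taken
from the example string above, not anything HFS defines):

```python
def parse_link(text):
    """Split '12345->/hfs/.hidden/actualfile' into (12345, target path)."""
    recorded, sep, target = text.partition("->")
    if not sep or not recorded.isdigit():
        raise ValueError("not a hard-link record: %r" % text)
    return int(recorded), target


def is_genuine_link(own_id, text):
    """True only if the id stored inside the link matches the link's own id.

    own_id is whatever number the filesystem assigns to the link itself
    (hypothetical here; on HFS it might be the CNID).
    """
    recorded, _target = parse_link(text)
    return recorded == own_id


original = "12345->/hfs/.hidden/actualfile"
print(is_genuine_link(12345, original))  # the link Linux created
print(is_genuine_link(67890, original))  # a MacOS copy: new id, same contents
```

The copy carries the same contents but gets a fresh id of its own, so the
comparison fails and Linux can ignore it when counting.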

> One option I may explore is to have fsck.hfs do all the link counting and
> any other post-MacOS-use cleanups, and extend the definition of "dirty"
> to include use of the fs by MacOS since the last use by Linux. Such a
> fsck would remove ALL the links if the luser was dumb enough to delete
> the hidden file under MacOS.

Alternatively, instead of storing a link count (with the hidden file)
you could store a list of the links, using their "inode numbers".
When a hard link is deleted you have to examine the list to see how
many other links remain valid. With the back pointers, when the list
gets down to one link you can turn it back into a regular file.
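
A minimal sketch of the list-of-links variant, again in Python with
made-up names (HiddenFile, remove_link, next_action and the live_ids
argument are all hypothetical; a real implementation would look each id
up in the catalog instead of being handed a set):

```python
class HiddenFile:
    """The pointee: keeps the ids of the links that point at it."""

    def __init__(self, name):
        self.name = name
        self.links = []


def remove_link(hidden, link_id, live_ids):
    """Delete link_id, then drop entries whose link no longer exists.

    live_ids is the set of ids still present on the volume (assumed input).
    Returns the ids that remain valid.
    """
    hidden.links = [i for i in hidden.links
                    if i != link_id and i in live_ids]
    return hidden.links


def next_action(remaining):
    """Decide what to do once the list has been pruned."""
    if not remaining:
        return "free the hidden file"
    if len(remaining) == 1:
        return "turn it back into a regular file"
    return "keep as hard links"


f = HiddenFile("actualfile")
f.links = [101, 102, 103]
# 103 was deleted under MacOS, so it is missing from live_ids.
left = remove_link(f, 101, live_ids={101, 102})
print(left, "->", next_action(left))
```

A link copied under MacOS never appears on the list, so it is simply not
counted.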

It's only necessary to validate the link list once per boot; you could
store a boot sequence number with the list to remember which lists have
been checked already. That matters for, say, expiring netnews.

This hack subsumes the previous hack -- the links don't have to contain
their addresses if the list does.
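
The once-per-boot validation could be sketched like this (Python; the
LinkList class, the checked_at_boot field and the live_ids set are
assumptions for illustration, and the sketch assumes MacOS cannot touch
the fs while Linux is up):

```python
class LinkList:
    """List of link ids plus the boot in which it was last validated."""

    def __init__(self, ids):
        self.ids = list(ids)
        self.checked_at_boot = None  # no validation recorded yet


def valid_links(link_list, current_boot, live_ids):
    """Return the valid ids, re-checking the list at most once per boot."""
    if link_list.checked_at_boot != current_boot:
        # First touch since this boot: MacOS may have deleted links.
        link_list.ids = [i for i in link_list.ids if i in live_ids]
        link_list.checked_at_boot = current_boot
    return link_list.ids


ll = LinkList([1, 2, 3])
print(valid_links(ll, current_boot=7, live_ids={1, 2}))  # scans the list
print(valid_links(ll, current_boot=7, live_ids={1, 2}))  # trusts the stamp
```

Within one boot every deletion goes through Linux and updates the list
directly, which is why the stamp can be trusted until the next boot.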

> Under my original proposal every hardlink, symlink and device would use
> the HFS equivalent of one "cluster", which is 1/32767 of the disk in the
> worst case and 1/65535 in the best case. If you have a 1 GB HFS fs then
> devices and links would take 16K each, as under UMSDOS. I don't like this
> idea very much, so I've been thinking of an alternative. If all the info
> such as device numbers and "link targets" were stored in a single database
> then links and devices would be empty files and would only use the disk
> space required for their directory entries and the space in the database.

Sounds more robust to me. But it's a whole inner filesystem to implement,
a lot more work. Then again, requiring a fsck every time you boot back and
forth between MacOS and Linux is well worth avoiding.
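
To put rough numbers on the two layouts quoted above, here is a small
Python sketch. It takes 1 GB as 2**30 bytes and ignores whatever rounding
a real volume applies to its cluster size; SpecialFileDB and its methods
are hypothetical names, and a plain dict stands in for the on-disk
database.

```python
BEST_CASE_PARTS = 65535   # from the post: a cluster is 1/65535 of the disk
WORST_CASE_PARTS = 32767  # ... or 1/32767 in the worst case


def cluster_bytes(volume_bytes, parts):
    """Approximate cluster size when the volume is split into `parts`."""
    return volume_bytes // parts


class SpecialFileDB:
    """One shared table holding link targets and device numbers."""

    def __init__(self):
        self.records = {}

    def add_link(self, file_id, target):
        self.records[file_id] = ("link", target)

    def add_device(self, file_id, major, minor):
        self.records[file_id] = ("dev", major, minor)

    def lookup(self, file_id):
        # None means the id belongs to an ordinary file.
        return self.records.get(file_id)


one_gb = 2 ** 30
print(cluster_bytes(one_gb, BEST_CASE_PARTS))   # about 16K per link or device
print(cluster_bytes(one_gb, WORST_CASE_PARTS))  # about 32K in the worst case

db = SpecialFileDB()
db.add_link(12345, "/hfs/.hidden/actualfile")
db.add_device(222, major=4, minor=64)
print(db.lookup(12345), db.lookup(222), db.lookup(999))
```

With the database, each link or device costs one small record plus its
directory entry instead of a whole cluster.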