Re: Implementing Meta File information in Linux

L. Adrian Griffis (adrian@idir.net)
Sat, 12 Sep 1998 10:58:17 -0500


Theodore Y. Ts'o wrote:
>
> Date: Fri, 11 Sep 1998 21:51:15 -0500
> From: "L. Adrian Griffis" <adrian@idir.net>
>
> > Secondly, ACL's are often duplicated; it will
> > be very common for all of the files in a particular source tree (say,
> > everything under /usr/src/linux for example) to have the same ACL.
>
> This is not implausible, but imagine a server at an ISP or a University.
> It might have lots of different accounts with people setting different
> ACL's. Would this approach scale well? Shall we have another standard
> error value like EACLTABLEFULL?
>
> The observation still holds; if you have many different users, each user
> will typically have multiple files and directories, and files within the
> same directory will often have the same ACL's.

I was confused by your argument that the resource fork was inefficient
because it would have to read at least one block before seeing the ACL.
Your approach still requires reading a block and not simply an entry
in a preloaded array of ACL's, but you hope to hit the cache rather
frequently by sharing ACL's between files.

> Take for example a user using MH; each e-mail message stored in an MH
> folder is a separate file, and they are all likely to have the same
> Access Control List. Whether or not a system is single-user or
> multi-user doesn't change the fact that all of the files in the folder
> will likely have the same protections. By storing all ACL's for a
> particular filesystem in a central place, and then assigning each ACL
> with a single number, when the user scans his/her MH folder, the kernel
> can read in the ACL once, and then see that all of the files in the MH
> folder refer to the same ACL, which can be cached in the ACL cache.

You are correct that these files stored by MH are all likely to have
the same ACL; in particular, they will all probably have a blank ACL,
since traditional file permissions are expressive enough to deny
everyone but the owner access to the file. I suspect that, in most
cases where people simply let an ACL default, they will let it default
to an empty list.

In your defence, I should point out that it costs much less to note
that the ACL starting block value is zero in the inode than to read
through a list of frivolous, kernel-managed user metadata only to
verify that there is not, in fact, any ACL metadata in the list.
I am still uncomfortable with this single-purpose feature, but I have
to admit that your position is not without merit.
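
To make the comparison concrete, here is a minimal sketch in Python
(a toy model, not kernel code) of the lookup being discussed. Every
name in it (Inode, AclStore, NO_ACL, may_read) is hypothetical, and
it assumes that an ACL number of zero in the inode means "no ACL".
Files with no ACL are decided from the inode alone, and files that
share an ACL number cost one simulated block read between them.

```python
from dataclasses import dataclass

NO_ACL = 0  # assumption: an ACL number of zero in the inode means "no ACL"


@dataclass
class Inode:
    owner: str
    owner_may_read: bool        # stand-in for the traditional permission bits
    acl_number: int = NO_ACL    # index into the central ACL table


class AclStore:
    """Central per-filesystem ACL table with a small in-memory cache."""

    def __init__(self, on_disk):
        self.on_disk = on_disk  # acl_number -> set of users allowed to read
        self.cache = {}
        self.disk_reads = 0     # counts simulated block reads

    def lookup(self, acl_number):
        if acl_number not in self.cache:
            self.disk_reads += 1  # cache miss: one simulated block read
            self.cache[acl_number] = self.on_disk[acl_number]
        return self.cache[acl_number]


def may_read(inode, user, store):
    if user == inode.owner and inode.owner_may_read:
        return True
    if inode.acl_number == NO_ACL:
        return False  # cheap path: nothing beyond the inode is consulted
    return user in store.lookup(inode.acl_number)
```

Scanning a hundred files that all carry the same ACL number touches
the simulated disk once; a file with no ACL never touches it at all.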

> (By the way, Remy Card has implemented an Alpha version of this already,
> and he and I discussed the design before he started implementation. So
> this isn't just idle chatter).

Noted.

> Where in an ext2 inode would you put this ACL number? Can we really
> do this in an ext2 filesystem or would we have to invent a new one?
> If we have to invent a new one, shouldn't we try to get more from
> all the trouble we would have to go through?
>
> Actually, there are two entries in the ext2 inode reserved for the ACL
> numbers. One is only used for directories, and specifies the default
> creation ACL to be used for new files created in that directory, and the
> other specifies the ACL for inode (whether it be a file or a directory).
>
> We had left room in the ext2 filesystem to support ACL's from the very
> beginning, including reserving two inodes which would be used for
> storing the access control list and the access control list entries.

I see this, now. This certainly answers my objection that we are not
buying enough with our change in the inode structure, since there is
no required change to that structure. This was really my principal
objection.

> You won't get a community of non-administrator users to
> go out of their way to tell the OS that it can use a
> common ACL table entry for all the files that really
> have the same list.

The rest of that paragraph was:

> Also, each list has a variable
> length, and users may put the list items in different
> orders, so you will have to normalize the order to
> get the commonality that you want. And then you have
> to index the ACL array (unless you want to do an unindexed
> search through an array of variable length lists, each time
> you change an individual access control list). All this is
> possible, but each additional level of complexity reduces
> any advantages that it has over the resource fork approach.

Can you see how your omission has fundamentally changed the
meaning of this paragraph? The point I was making was not
that it is impractical to share ACLs between files. Rather
I was pointing out that the process of discovering commonality,
so that the kernel can decide to point an inode to an existing
ACL, is potentially expensive.
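
A rough sketch of the work I have in mind, in Python rather than
kernel C: each list is put into a canonical order, and an index maps
the canonical form to an existing table entry. The names (normalize,
AclTable, intern) and the representation of an ACL as
(principal, permissions) pairs are assumptions made for illustration
only, as is the choice to start ACL numbers at 1.

```python
def normalize(entries):
    """Return a canonical, hashable form of an ACL.

    Entries are sorted by principal and the permission letters within
    each entry are de-duplicated and sorted, so that two lists written
    in different orders compare equal.
    """
    return tuple(sorted(
        (principal, "".join(sorted(set(perms))))
        for principal, perms in entries
    ))


class AclTable:
    def __init__(self):
        self.acls = []    # position (acl number - 1) -> canonical ACL
        self.index = {}   # canonical ACL -> acl number

    def intern(self, entries):
        """Return the number of an equal existing ACL, or store a new one."""
        key = normalize(entries)
        number = self.index.get(key)
        if number is None:
            self.acls.append(key)
            number = len(self.acls)  # numbers start at 1 (assumption)
            self.index[key] = number
        return number
```

With this, [("alice", "rw"), ("bob", "r")] and
[("bob", "r"), ("alice", "wr")] resolve to the same table entry. The
cost is the normalization pass plus an index that must be kept
current on every change, which is the overhead I was describing.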

> That's what the ACL cache is for; it's extremely likely that if the user
> changes the ACL settings on a file, he/she will be setting it to match
> the ACL setting on some other file. This is not guaranteed, but in
> practice it will minimize duplications in the ACL table. Note that a
> very common operation will be changing the access controls for all of
> the files in a single directory, just as it's relatively common for
> users to type "chmod 600 *".

"extremely likely" might be overstating the case, but your "chmod"
example convinces me that there will be many cases where many inodes
will be set to a common value in a short span of time. Certainly
there will be enough cases to make this a significant design
consideration.

> Yes this is not perfect. However, even if there are a few identical
> ACL's stored in the central ACL table, this will still be vastly more
> efficient than storing a separate ACL for every single file in the ext2
> filesystem.

So you are willing to live with just the benefits of cache hits.
Interesting... I agree that this will avoid the most costly
parts of a search for commonality, but still yield some benefits
for the cases where we get a cache hit.
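
As I understand the trade-off, it looks roughly like the following
Python sketch. This is my own toy model, not the actual ext2 design:
CachedAclTable, set_acl and the least-recently-used eviction policy
are all assumptions. Only a small cache of recently used ACLs is
consulted when an ACL is set, and on a miss a new entry is appended
without searching the whole table, so occasional duplicates are
accepted.

```python
from collections import OrderedDict


class CachedAclTable:
    def __init__(self, cache_size):
        self.table = []               # every stored ACL; duplicates tolerated
        self.recent = OrderedDict()   # small LRU cache: ACL -> acl number
        self.cache_size = cache_size

    def set_acl(self, acl):
        acl = frozenset(acl)
        if acl in self.recent:
            self.recent.move_to_end(acl)   # cache hit: share the existing entry
            return self.recent[acl]
        self.table.append(acl)             # miss: append, no table-wide search
        number = len(self.table)
        self.recent[acl] = number
        if len(self.recent) > self.cache_size:
            self.recent.popitem(last=False)  # evict the least recently used
        return number
```

Setting the same ACL on many files in a row, as in the "chmod 600 *"
case, keeps hitting the cache and reuses one entry. An ACL that has
fallen out of the cache gets stored a second time, which wastes some
space but avoids the search.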

> Also note that each directory has a default file creation acl; when you
> create a new file, all the filesystem has to do is to copy the acl from
> the directory's default creation acl field to the new file's acl field,
> and then bump the ref count on the ACL.

Noted. This will clearly yield additional benefits.
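
For my own reference, the creation path described above might be
modelled like this in Python. The field names (acl, default_acl),
the AclTable.get helper and the use of zero for "no ACL" are
hypothetical; the real ext2 field names and on-disk layout are not
shown here.

```python
from dataclasses import dataclass

NO_ACL = 0  # assumption: zero means "no ACL"


@dataclass
class Inode:
    acl: int = NO_ACL          # ACL governing access to this inode
    default_acl: int = NO_ACL  # directories only: ACL given to new files


class AclTable:
    def __init__(self):
        self.refcount = {}     # acl number -> number of inodes referring to it

    def get(self, number):
        """Take a reference on an existing ACL and return its number."""
        if number != NO_ACL:
            self.refcount[number] = self.refcount.get(number, 0) + 1
        return number


def create_file(directory, table):
    # Copy the directory's default creation ACL number into the new
    # file's inode and bump the reference count; no ACL data is copied.
    return Inode(acl=table.get(directory.default_acl))
```

Creating a file is then a constant-time operation regardless of how
long the ACL is, and every file created in the directory shares one
table entry.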

> Where did this second inode come from? The "resource fork" would be
> tracked by the same inode as the data fork. If we have multiple
> hard links in separate directories to a single file, the file's
> inode is the obvious way to get at the metadata. If we put a
> pointer to a different inode for metadata in the main data inode,
> we have modified the inode, so we already have to invent a new
> filesystem type. As long as we are modifying the inode and
> inventing a new filesystem type, why not just put the "resource
> fork" in the inode, and make that our change. For those files
> that need metadata, we would still have to read a block; You're
> right about that. But the other approaches have their disadvantages
> too.
>
> For all of the designs which I have outlined, they can all be
> implemented with only minor extensions to the ext2 filesystem format.
> These changes can be made in a backwards-compatible fashion without
> forcing people to backup, reformat, and restore their filesystems in
> order to enjoy these new features. The ext2 filesystem was carefully
> designed to allow this kind of incremental extension. Perhaps you
> didn't realize this when you made your assertion that any kind of
> change required a new filesystem type. This is in fact not true.
>
> Because people generally like stability, I made the assumption that we
> were discussing how we might add support for the "resource fork" into
> the existing ext2 filesystem, and in that case, if you are going to
> architect a solution where you store arbitrarily large amounts of data
> in a "resource fork", there really isn't any other solution than using a
> second inode to store the resource fork.

You are correct on both counts. This explains to me why you thought
a second inode might be called for in my proposal. Without changing
ext2, the existing ACL pointer (which I was unaware of) could be
used to point to another inode. I am inclined toward the same
objections you outlined to this approach.

> Remember that both the
> resource fork and the main data fork can get extended at any time during
> the life of the file. So you need a separate set of direct and indirect
> block pointers for the resource fork and the main data fork of the file;
> there isn't enough space in the inode to store a second set of block
> pointers.
>
> You could use the linked-list approach for the resource fork --- that's
> why I had planned to store the kernel metadata --- but such an approach
> doesn't work if you have to store arbitrarily large amounts of data,
> since a linked list is a very fragile filesystem data structure; one
> corrupted block and you won't be able to find the rest of the linked
> list. (Also consider what would happen if the ACL entry gets stored at
> the very end of the several megabyte-long resource fork; the kernel
> would have to search through several megabytes before finally finding
> the ACL in a linked list implementation.)

The metadata could be reordered, but I have my own doubts about
this solution. I share your objection to a linked list approach
to block allocations.
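
Both linked-list objections quoted above show up even in a toy
model. The Python sketch below is hypothetical throughout: each
metadata block is a pair of (records, next block number), a next
pointer of zero ends the chain, and a block missing from the dict
stands in for a corrupted or unreadable block.

```python
def find_record(disk, first_block, wanted):
    """Walk a chain of metadata blocks looking for one record.

    Returns (value, blocks_read); value is None if the record was not
    found or the chain could not be followed to the end.
    """
    blocks_read = 0
    block_no = first_block
    while block_no != 0:
        block = disk.get(block_no)
        blocks_read += 1
        if block is None:
            # Corrupted link: everything after this point is unreachable.
            return None, blocks_read
        records, next_block = block
        if wanted in records:
            return records[wanted], blocks_read
        block_no = next_block
    return None, blocks_read
```

If the ACL record sits in the last block, every block before it has
to be read first, and losing any one block in the middle makes the
ACL unreachable even though its own block is intact.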

> The bottom line is that there is room for an extra pointer to store a
> pointer to a block containing kernel-metadata, which in the unlikely
> event where the kernel metadata exceeded one block, could be extended
> via a linked-list arrangement. There is not space for an additional set
> of blocks for a resource fork.

Agreed.

> On the other hand, if you are thinking about creating your own
> filesystem which has support for a new resource fork, go ahead and try!
> You will find that making an efficient and robust filesystem is harder
> than you think.
>
> This is simply incomprehensible. In case you haven't noticed,
> there has been a certain amount of interest in user metadata.
>
> There is a lot of interest in many things in this world. For example,
> many people have a deadly fascination with crack cocaine. That doesn't
> make the interest correct, or even healthy.

Again, you've omitted the remainder of this paragraph, and you
omitted the statement of yours that I was objecting to.

You said:

> > We also avoid needing to define a non-standard API
> > for storing user metadata, and we avoid tempting
> > application programmers to write non-portable code
> > that won't work on any other Unix system in the world.

I wasn't arguing that there is an interest so you must
respond to this interest by writing code. I was responding
to the phrase "we avoid tempting application programmers to
write non-portable code". My complete response was:

> This is simply incomprehensible. In case you haven't noticed,
> there has been a certain amount of interest in user metadata.
> There is no widely accepted standard API for user metadata that
> I know of. If there isn't currently a standard API, but people
> want the feature, doesn't it seem to you that there is a need
> for an API? Doesn't this new API need to start somewhere?? Is
> there a metadata API standard that you know of that we should
> be thinking about as we venture down the road to metadata? If
> there is one, we should consider it; If not, this same argument
> would apply wherever we attempt to invent the new API. How can
> we ever have a new API without putting aside this argument?
> Would you rather wait around for Bill Gates to invent the API,
> and then try to implement it on Linux and hold your breakfast
> down at the same time?

You were arguing that we should avoid creating a non-standard API,
and I submit that you yourself do not believe this; you are,
after all, creating a new facility, these ext2 ACLs, right? Aren't
you going to create a new API for it? Why, in principle, does
your argument make sense for "storing user metadata" but not for
your ACLs?

Theodore, you have important points to make, and I am closer to
agreeing with your position as a result. Don't obscure these
important points with nonsense like this. Don't you agree that
every standard API starts as a new, non-standard API at some
point? How can we take this argument of yours seriously without
concluding that we can never invent new APIs?

> In the case of user metadata, I have not yet seen a single instance
> where the use of user metadata conferred any kind of real advantage
> which could not be achieved in some other way. So I see no real
> benefits, and many disadvantages.

I really doubt this. Don't you really mean to say that you
haven't seen big advantages? Surely saying that you see "no
real" advantages can only provoke arguments. Isn't that what
we've all come to expect on this list?

And the point you are about to make is important. Why put
flamebait like this in a supporting position to the
point that follows?

> As a result, it's not something for
> which I will invest my own personal time to do; I think it's a waste of
> time.

Excellent point. This is the essence of how decisions are made
on projects like this. The people who contribute code are the
only ones, in general, who have any decision-making authority.

> If you want to develop your own filesystem with this non-standard
> extension, and your own non-standard, non-portable API, there is of
> course nothing stopping you from doing that. I wouldn't recommend that
> application writers and desktop designers who are interested in portable
> programs use this interface,

What would you recommend that they use instead?

> but obviously it is up to them to decide
> whether they want to use it.

All of my arguments aside, Theodore, it sounds like you are involved
in the actual writing of code, and you have addressed the most
important objection I made to your arguments. I salute your efforts,
and I will be interested in seeing what you end up with.

-- 
---
L. Adrian Griffis - KE6CSX - adrian@idir.net
