Re: Things that Longhorn seems to be doing right

From: Timothy Miller
Date: Fri Oct 31 2003 - 11:37:26 EST




Scott Robert Ladd wrote:


Another problem with metadata is that it is largely generated by the user, who is notoriously lazy. A truly powerful system would use contextual analysis and other algorithms to automatically generate metadata, freeing the user from an onerous task (which is what computers should do). Certainly, some search engines are bordering on this capability.


There is a French company called Pertimm which develops a search engine that does this with documents. It even does cross-language queries based on sophisticated linguistic analysis. I often wish Google had some of those features, even just a primitive synonym table.

The relevance here, though, is that the Pertimm index is much larger than the actual text being indexed. That's not a problem, really, because the same is true for Google. You need that for efficient searches. But there is no place for such a thing in a file system. I don't think any Linux developers would want the metadata to even APPROACH the size of the file data, let alone get LARGER.
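To make the size point concrete, here is a toy sketch (my own illustration, not Pertimm's or Google's actual format): a positional inverted index over a small passage. Because the index stores each word plus one position per occurrence, its serialized form can easily exceed the text it indexes.

```python
import json

# Toy corpus: a few sentences standing in for indexed documents.
text = ("metadata is largely generated by the user "
        "the user is notoriously lazy "
        "a truly powerful system would generate metadata automatically")

# word -> list of token positions where it occurs
index = {}
for pos, word in enumerate(text.split()):
    index.setdefault(word, []).append(pos)

# Crude size comparison: JSON serialization of the index vs. the raw text.
index_bytes = len(json.dumps(index))
text_bytes = len(text)
print(index_bytes, text_bytes)
```

Even on this tiny input the index is noticeably larger than the text, and real indexes add stemming tables, synonym links, and document metadata on top of the postings.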

Indexing of this sort has its place, but applying it to a whole file system is much too broad a use. For instance, you wouldn't want to index the contents of your binary programs, or even shell scripts for that matter. So text, data, and code need different kinds of indexing.
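The per-type argument above could be sketched roughly like this (an assumed design, with made-up strategy names, not any real indexer): pick an indexing strategy from a cheap file-type check instead of full-text indexing everything uniformly.

```python
import os

def pick_indexer(path):
    """Map a file to an indexing strategy based on its extension (a toy heuristic)."""
    _, ext = os.path.splitext(path)
    if ext in (".txt", ".html", ".tex"):
        return "full-text"          # prose: worth word-level indexing
    if ext in (".sh", ".py", ".c"):
        return "identifiers-only"   # code: index symbols, not every token
    # binaries and unknown types: index only names and attributes
    return "names-only"

print(pick_indexer("notes.txt"), pick_indexer("build.sh"), pick_indexer("libfoo.so"))
```

A real system would sniff content rather than trust extensions, but the shape of the decision is the same.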

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/