Re: Implementing NVMHCI...

From: Linus Torvalds
Date: Sun Apr 12 2009 - 13:26:42 EST

On Sun, 12 Apr 2009, Robert Hancock wrote:
>
> What about FAT? It supports cluster sizes up to 32K at least (possibly up to
> 256K as well, although somewhat nonstandard), and that works.. We support that
> in Linux, don't we?

Sure.

The thing is, "cluster size" in an FS is totally different from sector
size.

People are missing the point here. You can trivially implement bigger
cluster sizes by just writing multiple sectors. In fact, even a 4kB
cluster is really written out as eight 512-byte hardware sectors on all
normal disks.
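
To make the arithmetic concrete, here is a minimal sketch (not kernel code;
the helper name sectors_per_cluster is made up for illustration) of how a
cluster maps onto hardware sectors:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of hardware sectors that make up one
 * filesystem cluster. Assumes the cluster size is a whole multiple
 * of the sector size, which holds for FAT on normal disks.
 */
size_t sectors_per_cluster(size_t cluster_bytes, size_t sector_bytes)
{
    assert(sector_bytes != 0 && cluster_bytes % sector_bytes == 0);
    return cluster_bytes / sector_bytes;
}
```

With 512-byte sectors, a 4kB cluster comes out as 8 sectors and a 32kB
cluster as 64. The cluster is just a count of sectors; nothing about the
device has to change.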

So you can support big clusters without having big sectors. A 32kB cluster
size in FAT is absolutely trivial to do: it's really purely an allocation
size. So a FAT filesystem allocates disk space in 32kB chunks, but then
when you actually do IO to it, you can still write things 4kB at a time
(or smaller), because once the allocation has been made, you still treat
the disk as a series of smaller blocks.

IOW, when you allocate a new 32kB cluster, you will have to allocate 8
pages to do IO on it (since you'll have to initialize the disk space), but
you can still literally treat those pages as _individual_ pages, and you
can write them out in any order, and you can free them (and then look them
up) one at a time.

Notice? The cluster size really only ends up being a disk-space allocation
issue, not an issue for actually caching the end result or for the actual
size of the IO.
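
A toy model of that, assuming 4kB pages and a 32kB cluster (this is only a
sketch; struct cluster_cache, mark_dirty and writeback_page are hypothetical
names, not anything from the kernel's page cache):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES        4096u
#define CLUSTER_BYTES     32768u
#define PAGES_PER_CLUSTER (CLUSTER_BYTES / PAGE_BYTES)   /* 8 */

/* One dirty bit per page: bit i is set when page i needs writing. */
struct cluster_cache {
    uint8_t dirty;
};

void mark_dirty(struct cluster_cache *c, unsigned page)
{
    assert(page < PAGES_PER_CLUSTER);
    c->dirty |= (uint8_t)(1u << page);
}

/*
 * Write back exactly one page and clear only its dirty bit.
 * Returns the number of bytes that would go to disk (0 if clean).
 */
unsigned writeback_page(struct cluster_cache *c, unsigned page)
{
    assert(page < PAGES_PER_CLUSTER);
    if (!(c->dirty & (1u << page)))
        return 0;
    c->dirty &= (uint8_t)~(1u << page);
    return PAGE_BYTES;
}
```

Each page carries its own state, so writing back page 5 costs 4kB of IO and
leaves the other seven pages of the cluster alone, dirty or not.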

The hardware sector size is very different. If you have a 32kB hardware
sector size, that implies that _all_ IO has to be done with that
granularity. Now you can no longer treat the eight pages as individual
pages - you _have_ to write them out and read them in as one entity. If
you dirty one page, you effectively dirty them all. You cannot drop and
re-allocate pages one at a time any more.
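
The difference shows up as soon as you compute how much has to hit the disk
for a given modification. A rough sketch, assuming the device can only
transfer whole sectors (io_bytes_for_write is a made-up helper, not a real
block-layer function):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes that must be transferred when the byte range
 * [offset, offset + len) is modified, on a device that only
 * does IO in whole sectors of sector_bytes each.
 */
size_t io_bytes_for_write(size_t offset, size_t len, size_t sector_bytes)
{
    assert(sector_bytes != 0 && len != 0);

    /* Round the start down to a sector boundary. */
    size_t start = offset - offset % sector_bytes;

    /* Round the end up to a sector boundary. */
    size_t end = offset + len;
    size_t rem = end % sector_bytes;
    if (rem != 0)
        end += sector_bytes - rem;

    return end - start;
}
```

Dirtying a single 4kB page at offset 8192 costs 4096 bytes of IO with
512-byte sectors, but 32768 bytes with a 32kB sector: the other seven pages
get dragged along whether they changed or not.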

Linus