Re: swap storage alignment and stride size

From: Ric Wheeler
Date: Wed Dec 08 2010 - 14:56:36 EST


On 12/08/2010 12:03 PM, Christian Brandt wrote:
Preamble:

Hi fellow Linux tamers, the following question has bounced around for
some days in local lists and newsgroups without conclusion and was
escalated upstream several times, so here we are...

We are discussing semi-professional storage systems, e.g. ext4 on luks
on lvm on raid on gpt-partitions on 4k sector hard drives or 512k sector
SSDs. Usually every level profits a lot from aligning the data to the
underlying sector/stride/chunk size, e.g. ext4 with a 128k stripe size
will run a lot better on a well aligned 64k stride raid5.

In other words, partition tables, LVM, RAID, luks and filesystems know
how to handle and profit from aligned larger chunks.

In detail:

As far as we can tell from reading mm/swapfile.c, Linux is only
concerned with the CPU page size and knows nothing about underlying
chunk/sector/stride sizes or alignment.

Therefore we think every small 1/2/4/8 KiB page-sized write leads to a
read-modify-write cycle for the whole chunk, taking more than twice as
long as simply writing the whole chunk at once.
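
The read-modify-write concern can be put into rough numbers. The Python
sketch below is only an illustrative model, not how mm/swapfile.c or any
RAID driver actually accounts for I/O. The function name rmw_chunks and
the 4 KiB page and 64 KiB chunk sizes are assumptions chosen for the
example. It counts how many chunks a write covers completely and how
many it touches only partially; on a device that must rewrite whole
chunks, each partial chunk would need a read before the write.

```python
def rmw_chunks(offset, length, chunk_size):
    """Return (full, partial): the number of chunks fully covered and
    partially covered by a write of `length` bytes at byte `offset`.

    Hypothetical helper for illustration; a partial chunk stands in for
    one read-modify-write cycle in this simplified model.
    """
    if length <= 0:
        return (0, 0)
    end = offset + length                 # first byte past the write
    first = offset // chunk_size          # index of first chunk touched
    last = (end - 1) // chunk_size        # index of last chunk touched
    full = 0
    partial = 0
    for idx in range(first, last + 1):
        chunk_start = idx * chunk_size
        chunk_end = chunk_start + chunk_size
        if offset <= chunk_start and end >= chunk_end:
            full += 1                     # whole chunk overwritten
        else:
            partial += 1                  # old data must be read first
    return (full, partial)


PAGE = 4 * 1024      # assumed page size
CHUNK = 64 * 1024    # assumed chunk size

print(rmw_chunks(0, PAGE, CHUNK))      # one page inside one chunk
print(rmw_chunks(0, CHUNK, CHUNK))     # aligned, chunk-sized write
print(rmw_chunks(PAGE, CHUNK, CHUNK))  # chunk-sized but misaligned
```

In this model a single page write always lands in a partial chunk, and
a chunk-sized write avoids the read only when it is also chunk-aligned,
which is why both size and alignment matter.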

Questions:

Is this the right place to ask?

Does or could Linux swapping make use of aligning chunks?

And if so, how?

If not, would it be an improvement?

Will this effect be mostly compensated by the block elevator?

Does it make any sense to change the mkswap page size to the chunk size?
We think those are two totally different beasts and should be kept
separate.

Is Linux already aware of chunk sizes within swap?

How would this be set up and controlled by the administrator?


Hi Christian,

There has been a lot of work on alignment. Martin Petersen led most of that and is probably the best one to ping.
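
As a starting point for checking what a given device reports, the block layer exposes I/O size and alignment hints in sysfs on kernels that support them. The Python sketch below is a minimal example under that assumption. The function name read_io_topology and the device name "sda" are hypothetical, and any attribute that is missing or unreadable is reported as None rather than guessed.

```python
import os


def read_io_topology(device, sysfs_root="/sys/block"):
    """Read block-layer I/O hints for `device` (for example "sda").

    Hypothetical helper for illustration. Returns a dict of integers in
    bytes; attributes that are absent or unreadable map to None.
    """
    base = os.path.join(sysfs_root, device)
    paths = {
        "logical_block_size": os.path.join(base, "queue", "logical_block_size"),
        "physical_block_size": os.path.join(base, "queue", "physical_block_size"),
        "minimum_io_size": os.path.join(base, "queue", "minimum_io_size"),
        "optimal_io_size": os.path.join(base, "queue", "optimal_io_size"),
        "alignment_offset": os.path.join(base, "alignment_offset"),
    }
    result = {}
    for name, path in paths.items():
        try:
            with open(path) as handle:
                result[name] = int(handle.read().strip())
        except (OSError, ValueError):
            result[name] = None           # not exported on this system
    return result


# "sda" is an assumed device name; substitute the device backing swap.
for key, value in read_io_topology("sda").items():
    print(key, value)
```

On a stacked setup the same attributes can be read for each layer (the md or dm device as well as the underlying disks) to see which hints are propagated upward.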

Ric

