Re: Re: [PATCH] eCryptfs: truncate optimization (sometimes upto 25000x faster)
From: Li Wang
Date: Fri Jan 20 2012 - 06:12:48 EST
Hi Tyler,
Thanks for your comments. I agree that the plaintext inode size
is __fragile__, because its consistency can hardly be protected by the lower
file system, even if journaling is supported. Since the eCryptfs
metadata is just normal file data from the lower file system's point of view,
it is not easy for the lower file system to treat several data pages (including
the eCryptfs metadata update and the file data write) together as one journal
transaction.
However, I am not quite convinced by the performance argument. The current
implementation incurs too heavy a startup cost. Take Samba as an example: as far
as I know, with older versions of the Samba server (I am not sure about the
latest version), when an eCryptfs plaintext folder is exported through Samba
to a Windows client and you upload a big file from that client,
the Samba server first truncates to generate an empty file, then starts writing.
It simply takes too much time to create that big file, and neither the user nor
Samba knows what is going on. Sometimes Samba even gives up the upload
because the wait is too long, which is incorrectly treated as a network
connection timeout. With this truncate optimization, the cost is amortized: the file
is expanded on demand, and the user experience is improved (at least, both the user and
Samba know that the write is progressing, instead of getting no response from the kernel at all).
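
To make the comparison concrete, here is a minimal sketch (not part of the patch) that extends an empty file with truncate and compares its apparent size with the space actually allocated. The helper name make_sparse_file is hypothetical, and the sketch assumes a Unix-like system where st_blocks is reported in 512-byte units. On a lower file system that supports holes, the allocated size should stay close to zero; the exact number depends on the file system.

```python
import os
import tempfile


def make_sparse_file(directory, size):
    """Extend an empty file to `size` bytes with truncate.

    Returns (apparent_size, allocated_bytes) as reported by fstat.
    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Extends the file without writing any data.
        os.ftruncate(fd, size)
        st = os.fstat(fd)
        # st_blocks counts the 512-byte units actually allocated.
        return st.st_size, st.st_blocks * 512
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    apparent, allocated = make_sparse_file(tempfile.gettempdir(), 1 << 30)
    print("apparent size:", apparent, "allocated bytes:", allocated)
```

Running the same program inside an eCryptfs mount and timing it is one way to see the startup cost described above, since there the truncate has to write out the zeroes.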
---------- Origin message ----------
>From:"Tyler Hicks" <tyhicks@xxxxxxxxxxxxx>
>To:"Li Wang" <liwang@xxxxxxxxxxx>
>Subject:Re: [PATCH] eCryptfs: truncate optimization (sometimes upto 25000x faster)
>Date:2012-01-19 23:26:55
On 2012-01-19 17:19:20, Li Wang wrote:
> Hi,
> Many modern disk-based file systems, such as ext4,
> have a truncate optimization, a kind of delayed allocation:
> when 'truncate' is used to produce a big empty file, the file system does not
> allocate disk space until real data is written. As a result,
> truncate executes very fast and disk space is saved.
> However, eCryptfs actually creates an equal-size file and writes zeroes into it,
> which results in disk space allocation and slow disk write operations.
> Since eCryptfs does not record hole information on disk,
> when it reads out a page of zeroes it cannot distinguish actual data
> (encrypted data that happens to be all zeroes) from a hole.
> Therefore, eCryptfs cannot rely on the lower file system's own truncate implementation.
> However, one thing eCryptfs can do is use the fact that it records the file size itself
> on disk, so it can be aware of a hole at the end of the file.
> The natural optimization is: when truncate expands a file beyond its original size
> (which occurs in many cases of truncate),
> record the actual file size (after expansion) in the eCryptfs metadata and
> keep the original size unchanged in the lower file system.
> When reading, if the file size seen from eCryptfs is bigger than from