Re: [PATCH RESEND v5] fat: editions to support fat_fallocate

From: Namjae Jeon
Date: Thu May 02 2013 - 05:15:27 EST


2013/5/2, OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>:
> OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx> writes:
>
>> Namjae Jeon <linkinjeon@xxxxxxxxx> writes:
>>
>>>> Then, discarding fallocated space per file sounds wrong. Fallocated
>>>> space is probably an inode attribute.
>>> Since our preallocation will not be persistent after unmount, we
>>> need to free up the space at some point.
>>> If we look at normal preallocation in ext4, there too the blocks are
>>> freed in ext4_release_file() when the last writer closes the file.
>>>
>>> ext4_release_file()
>>> {
>>> 	...
>>> 	/* if we are the last writer on the inode, drop the block reservation */
>>> 	if ((filp->f_mode & FMODE_WRITE) &&
>>> 	    (atomic_read(&inode->i_writecount) == 1) &&
>>> 	    !EXT4_I(inode)->i_reserved_data_blocks)
>>> 	{
>>> 		down_write(&EXT4_I(inode)->i_data_sem);
>>> 		ext4_discard_preallocations(inode);
>>> 		up_write(&EXT4_I(inode)->i_data_sem);
>>> 	}
>>> 	...
>>> }
>>>
>>> So we will need to do this per file. Maybe the exact condition being
>>> checked is wrong and can be corrected, but the underlying point should
>>> be the same. We could also consider using "i_writecount" to control
>>> parallel writes in FAT.
>>> What do you think?
>>
>> AFAIK, preallocation != fallocate. ext*'s preallocation existed before
>> fallocate, to optimize block allocation for user data blocks.
Yes, this is correct: preallocation != fallocate. We adopted only the
"release part" from that approach.
Would you be OK with accepting this approach :) ?
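
To illustrate what I mean by the "release part", the idea is roughly the
following (just a sketch of the concept, not the code from the posted
patch; the exact condition and locking there may differ, and
fat_truncate_blocks() is simply the existing FAT helper that frees
clusters beyond the given offset):

static int fat_file_release(struct inode *inode, struct file *filp)
{
	/* existing flush-on-close handling elided ... */

	/*
	 * FAT cannot record preallocated-but-unwritten clusters on disk,
	 * so when the last writer closes the file, free everything beyond
	 * i_size, analogous to ext4 dropping its block reservation.
	 */
	if ((filp->f_mode & FMODE_WRITE) &&
	    (atomic_read(&inode->i_writecount) == 1)) {
		mutex_lock(&inode->i_mutex);
		fat_truncate_blocks(inode, inode->i_size);
		mutex_unlock(&inode->i_mutex);
	}
	return 0;
}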

>>
>>>>>> I know. The question is, why do we need to initialize twice?
>>>>>>
>>>>>> 1) zero the uninitialized area, 2) then copy the user data. We need
>>>>>> only one of the two, right? This seems to do both for the whole
>>>>>> fallocated area.
>>>>> We did not initialize twice. We use 'pos' as the attribute that
>>>>> defines the zeroing length in the preallocation case.
>>>>> Zeroing out happens up to 'pos', while the actual write happens after
>>>>> 'pos'.
>>>>> If the file size is 100KB and we preallocated up to 1MB, and we then
>>>>> try to write at 500KB, zeroing out will occur only for 100KB->500KB;
>>>>> after that there is a normal write. There is no duplication for the
>>>>> same space.
>>>>
>>>> Ah. Then does write_begin() really initialize from i_size up to the
>>>> page cache boundary for an append write? I wonder if this patch works
>>>> correctly for mmap.
>>> Since you already gave review comments about checking truncate and
>>> mmap, we checked all the points for those cases.
>>
>> cluster size == 512b
>>
>> 1) create new file
>> 2) fallocate 100MB
>> 3) write(2) data in 512b chunks
>>
>> With this, write_begin() will be called for each 512b write. When we
>> allocate a new page for this file, write_begin() writes data 0-512. Then
>> we have to initialize 512-4096 with zeros, because an mmap read maps
>> 0-4096 even if i_size == 512.
>>
>> Who is initializing the 512-4096 area?
>
> From another view, I guess fat_zero_falloc_area() is for zero-filling
> 0-10000 in the following case?
>
> 1) create new file
> 2) lseek(10000)
> 3) write data with write(2)
>
> This job is for cont_write_begin(). If that example is correct, why
> doesn't cont_write_begin() work? I guess it is because get_block() doesn't
> set buffer_new() for that area.
>
> If the above is correct, the right fix is to change get_block().
We will check your case.
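Just to make sure we understand your get_block() suggestion, is the idea
roughly like the following? (A sketch only, not tested; the i_size-based
check is just one way to detect a fallocated-but-unwritten block.)

	/* inside __fat_get_block(), in the already-mapped path: */
	if (phys) {
		map_bh(bh_result, sb, phys);
		/*
		 * The block is mapped (fallocated) but lies at or beyond
		 * i_size, so it was never written. Mark it new so that
		 * cont_write_begin()/__block_write_begin() zero it instead
		 * of reading stale data from the device.
		 */
		if (iblock >= ((i_size_read(inode) + sb->s_blocksize - 1)
			       >> sb->s_blocksize_bits))
			set_buffer_new(bh_result);
		*max_blocks = min(mapped_blocks, *max_blocks);
		return 0;
	}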

Thanks~
>
> Thanks.
> --
> OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>
>