Re: [PATCH 3/3] ovl: redirect on rename-dir
From: Amir Goldstein
Date: Fri Nov 11 2016 - 07:42:26 EST
On Fri, Nov 11, 2016 at 12:06 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> On Fri, Nov 11, 2016 at 10:46 AM, Konstantin Khlebnikov
> <koct9i@xxxxxxxxx> wrote:
>> On Fri, Nov 11, 2016 at 1:56 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>> On Mon, Nov 7, 2016 at 3:38 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>>> On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koct9i@xxxxxxxxx> wrote:
>>>>> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>>>>>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@xxxxxxxxx> wrote:
>>>>>>
>>>>>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>>>>>> strictly serialized by rename locks.
>>>>>>> Obviously, file copying could be done in parallel: locks are required
>>>>>>> only for final rename.
>>>>>>> Because of that, overlay is slower than aufs for some workloads.
>>>>>>
>>>>>> Easy to fix: for each copy up create a separate subdir of "work".
>>>>>> Then the contention is only for the time of creating the subdir, which
>>>>>> is very short.
>>>>>
>>>>> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
>>>>> I think proper synchronization for concurrent copy-up (for example
>>>>> round flag on ovl_entry) and locking rename only for rename could be
>>>>> better.
>>>>
>>>> Removing s_vfs_rename_mutex from copy-up path is something I have been
>>>> pondering about.
>>>> Assuming that I understand Al's comment above vfs_rename() correctly,
>>>> the sole purpose of per-sb serialization is to prevent loop creations.
>>>> However, how can one create a loop by moving a non-directory?
>>>> So it looks like at least for the non-dir copy up case, a much finer grained
>>>> lock is in order.
>>>>
>>>
>>>
>>> I posted patches to relax the s_vfs_rename_mutex for copy-up and
>>> whiteout in some use cases.
>>>
>>> Konstantin,
>>>
>>> It would be useful to know if those patches help with your use case.
>>>
>>
>> Well.. I think relaxing only s_vfs_rename_mutex wouldn't help much here.
>> Copying is still serialized by i_mutex on workdir?
>> Data copying should be done without rename locks at all.
>
> We do need something to prevent multiple copy-ups starting up in
> parallel on the same file, though.
>
I guess an inode_lock on the copy-up victim should suffice?
I will look into it as soon as I am done with profiling.
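
To make the idea concrete, here is a rough user-space sketch in C of the
locking order I have in mind. It is not kernel code and not a patch:
struct victim, copy_lock, rename_lock and copy_up() are hypothetical names,
with copy_lock standing in for an inode lock on the copy-up victim and
rename_lock standing in for the per-sb s_vfs_rename_mutex. The data copy
runs under the per-file lock only, and the global lock is held just for
the final rename.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-sb s_vfs_rename_mutex. */
static pthread_mutex_t rename_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical per-file state; copy_lock models a lock on the victim. */
struct victim {
	pthread_mutex_t copy_lock;
	int copied_up;      /* set once the upper copy is in place */
	const char *lower;  /* source path on the lower layer */
	const char *tmp;    /* temporary path in the work dir */
	const char *upper;  /* final path on the upper layer */
};

/* Plain data copy; takes no locks itself. Returns 0 or -1. */
static int copy_data(const char *src, const char *dst)
{
	char buf[4096];
	ssize_t n;
	int in, out;

	assert(src && dst);
	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (out < 0) {
		close(in);
		return -1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0) {
		if (write(out, buf, (size_t)n) != n) {
			n = -1;	/* treat short writes as failure */
			break;
		}
	}
	close(in);
	if (close(out) < 0)
		n = -1;
	return n < 0 ? -1 : 0;
}

/*
 * Copy up one file. Concurrent callers on the same victim serialize on
 * copy_lock; callers on different victims only meet at rename_lock.
 */
static int copy_up(struct victim *v)
{
	int err = 0;

	pthread_mutex_lock(&v->copy_lock);
	if (!v->copied_up) {
		err = copy_data(v->lower, v->tmp);	/* no rename lock held */
		if (!err) {
			pthread_mutex_lock(&rename_lock);
			err = rename(v->tmp, v->upper);	/* short critical section */
			pthread_mutex_unlock(&rename_lock);
		}
		if (!err)
			v->copied_up = 1;
	}
	pthread_mutex_unlock(&v->copy_lock);
	return err;
}
```

A second caller on the same victim blocks on copy_lock, then sees
copied_up set and returns without copying again. Whether a plain inode
lock is enough in the real code is exactly what I still need to check.
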
So far I have run only 2 rm -rf threads on 2 different overlay mounts
on the same underlying fs, and s_vfs_rename_mutex was contended ~4% of
the time.
In this test, copy-up is not dominant - only ~2% for the directory
copy-ups, but vfs_whiteouts take 20% and the vfs_rename itself 10%,
both with s_vfs_rename_mutex held.
Amir.