Re: fs/ntfs3: Runtree implementation with rbtree or others

From: Konstantin Komarov
Date: Wed Sep 15 2021 - 11:44:54 EST

On 10.09.2021 21:12, Kari Argillander wrote:
> Hello.
>
> Konstantin, you wrote this in ntfs_fs.h for struct runs_tree:
>
> /* TODO: Use rb tree instead of array. */
> struct runs_tree {
> 	struct rb_root root;
>
> 	struct ntfs_run *runs;
> 	size_t count; /* Currently used size a ntfs_run storage. */
> 	size_t allocated; /* Currently allocated ntfs_run storage size. */
> };
>
>
> But right now it is not an array; it is just memory. This is probably an
> early comment, but I looked into it a little and I think an rb tree may not
> be a good choice. Right now we allocate more memory with kvmalloc() and then
> make space for one entry with memmove(). I do not quite understand why the
> memory cannot be laid out the other way around, so that we do not need to
> memmove. We could just put the new entry at the other end, right?
>
> Another thing that comes to mind is to allocate a page at a time. Are
> there any drawbacks? If we do this with rb_tree we get many small entries,
> which also seems to be a problem. Ntfs-3g allocates 4 KiB at a time, but
> it still reallocates, which I think is avoidable.
>
> Also, one nice trick when merging two run_tree together would be not to
> allocate new memory for it but just to use a pointer to the other list. This
> way we can have a big run_tree spread over multiple pages. No need to
> reallocate with this strategy.
>
> I just want some thoughts about this before starting the implementation. If
> you think rb_tree would be the right call then I can do that. It just seems
> to me that it might not be. But if search speed is a big factor then it
> might be. I just do not yet understand enough to fully see the
> benefits and drawbacks.
>
> Argillander
>

Hello.

An rb tree is used in ext4 for a similar use case (see extent_status in
fs/ext4/extents_status.h and fs/ext4/extents_status.c).
But ntfs3 uses a relatively small number of elements.
Tests on a fragmented volume showed < 64000 elements in the array.
So an rb tree probably won't give a big benefit. It can even consume more memory.
It is difficult to predict; only a comparison between the current
implementation and an rb tree will answer the question "which is better?".
That's why it's not an urgent TODO.

Konstantin
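
For readers following the thread, here is a minimal userspace sketch of the
trade-off discussed above: a sorted array of runs with binary-search lookup
and memmove()-based insertion, plus a rough per-entry size estimate for a
tree-node layout. It is not the ntfs3 code. The names struct run,
struct run_array, run_array_find(), run_array_lookup(), run_array_insert()
and rb_entry_size() are hypothetical, the 64-bit fields and the doubling
growth policy are assumptions, and realloc() merely stands in for the
kernel's kvmalloc()-based growth.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one extent; not the ntfs3 definition. */
struct run {
	uint64_t vcn;	/* first virtual cluster covered by the run */
	uint64_t lcn;	/* first logical cluster it maps to */
	uint64_t len;	/* length of the run in clusters */
};

/* Hypothetical sorted-array container, kept ordered by vcn. */
struct run_array {
	struct run *runs;
	size_t count;		/* entries in use */
	size_t allocated;	/* entries allocated */
};

/*
 * Binary search: index of the first run that ends after vcn. That is
 * either the run containing vcn or the slot where a new run starting
 * at vcn would be inserted.
 */
static size_t run_array_find(const struct run_array *ra, uint64_t vcn)
{
	size_t lo = 0, hi = ra->count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ra->runs[mid].vcn + ra->runs[mid].len <= vcn)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Returns 1 and stores the mapped cluster in *lcn, or 0 if vcn is in a hole. */
static int run_array_lookup(const struct run_array *ra, uint64_t vcn,
			    uint64_t *lcn)
{
	size_t i = run_array_find(ra, vcn);

	if (i == ra->count || ra->runs[i].vcn > vcn)
		return 0;
	*lcn = ra->runs[i].lcn + (vcn - ra->runs[i].vcn);
	return 1;
}

/*
 * Insert a run that does not overlap any existing run (the caller must
 * guarantee this). Returns 0 on success, -1 if growing the array fails.
 */
static int run_array_insert(struct run_array *ra, struct run r)
{
	size_t i;

	assert(ra != NULL && r.len > 0);

	if (ra->count == ra->allocated) {
		/* Assumed growth policy: double the capacity, starting at 16. */
		size_t n = ra->allocated ? ra->allocated * 2 : 16;
		struct run *p = realloc(ra->runs, n * sizeof(*p));

		if (!p)
			return -1;
		ra->runs = p;
		ra->allocated = n;
	}

	i = run_array_find(ra, r.vcn);
	/* Open a gap at index i by shifting the tail up by one entry. */
	memmove(&ra->runs[i + 1], &ra->runs[i],
		(ra->count - i) * sizeof(*ra->runs));
	ra->runs[i] = r;
	ra->count++;
	return 0;
}

/*
 * Rough per-entry cost if every run were its own tree node: the payload
 * plus three pointer-sized words (parent/color, left, right), ignoring
 * allocator overhead.
 */
static size_t rb_entry_size(void)
{
	return sizeof(struct run) + 3 * sizeof(void *);
}
```

With these assumed 64-bit fields on a typical 64-bit build, the array costs
24 bytes per run while the node layout costs about 48, so the array stays the
more compact of the two at the element counts mentioned above. What the tree
would buy is insertion without shifting the tail, and only a measurement of
both implementations can show whether that matters in practice.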