Re: [PATCH 07/11] btrfs: move to fn pointers for get/put workspaces

From: Dennis Zhou
Date: Tue Jan 29 2019 - 18:35:15 EST


Hi Nikolay,

On Tue, Jan 29, 2019 at 10:22:34AM +0200, Nikolay Borisov wrote:
>
>
> On 28.01.19 Ð. 23:24 Ñ., Dennis Zhou wrote:
> > The previous patch added generic helpers for get_workspace() and
> > put_workspace(). Now, we can migrate ownership of the workspace_manager
> > to be in the compression type code as the compression code itself
> > doesn't care beyond being able to get a workspace. The init/cleanup
> > and get/put methods are abstracted so each compression algorithm can
> > decide how they want to manage their workspaces.
> >
> > Signed-off-by: Dennis Zhou <dennis@xxxxxxxxxx>

Thanks for reviewing my patches so promptly!

>
> TBH I can't really see the value in this patch. IMO it doesn't make the
> code more readable, on the contrary, you create algorithm-specific
> wrappers over the generic function, where the sole specialization is in
> the arguments passed to the generic functions. You introduce 4 more
> function pointers and this affects performance negatively (albeit can't
> say to what extent) due to spectre mitigations (retpolines).
>

I'll try and address the need for function pointers below, but of the 4,
only 2 are called often as 2 are for init and clenaup.

For the cost of function pointers, I tested with retpoline on, with zlib
both via function pointers and hard coded using the Silesia corpus test
in the cover letter. It didn't show up in perf and the runtimes were
very close to each other with the function pointers actually performing
marginally better. I imagine the actual compression is heavy enough that
it greatly outweighs the cost of a function pointer.

> I also read the follow up patches with the hopes of seeing how the code
> becomes cleaner to no avail. At this point I'm really not in favor of
> this particular patch.

So I guess to recap the whole idea. The original implementation forced
everyone to use workspaces_list. This is a generic implementation that
only allows the management of workspaces by a lru list.

Zstd compression levels introduces the requirement of being able to
distinguish workspaces. This is not possible with the original
workspaces_list implementation. If I modify this, it becomes rather
cumbersome adding variables to the generic workspaces_list that will
only be used by a particular compression types. It also fractures the
code having find_workspace() and free_workspace() deal with changing
code paths based on the compression type.

I think a big strength of this series is that we no longer rely on using
paired arrays and passing around compression type. Once we are working
within a compression type, it is not possible to accidentally cross
boundaries due to an off by one error. Each compression type is siloed
with workspace management code being up to the compression type code and
not the core compression code.

This series works to migrate to a two-level API with get_workspace() and
put_workspace() being the interface that the core compression code
interacts with each compression type. The core compression code does not
and should not care beyond the fact that it is getting a workspace. So,
get_workspace() and put_workspace() deal with managing the workspaces
themselves. Underneath this is alloc_workspace() and free_workspace()
which deal only with creating and deleting a workspace. I think the
division of labor here is neat and allows for code sharing.

Prior to this, no one really needed to do anything smarter with
workspace management and therefore could share a single implementation.
Now, zstd has to do something a little smarter based on the level
requested and has the ability to use higher level workspaces should they
be available. The division of the API let's zstd do just that. It hides
all of the manager complexity away from the core compression code.

I apologize if I'm only restating what I've already said. If the
reasoning is still unclear, I can try and be more specific to those
unaddressed areas.

Thanks,
Dennis