Re: [PATCH v6 1/3] fuse: add compound command to combine multiple requests

From: Miklos Szeredi

Date: Fri Feb 27 2026 - 04:50:05 EST

On Thu, 26 Feb 2026 at 17:43, Horst Birthelmer <horst@xxxxxxxxxxxxxx> wrote:

> +int fuse_compound_add(struct fuse_compound_req *compound,
> + struct fuse_args *args,
> + int (*converter)(struct fuse_compound_req *compound,
> + unsigned int index))
> +{
> + if (!compound || compound->count >= compound->max_count)
> + return -EINVAL;
> +
> + if (args->in_pages)
> + return -EINVAL;

WARN_ON()

> +
> + compound->op_args[compound->count] = args;

This could be done *much* simpler with lists. Just add a 'struct
list_head list' member to struct fuse_args and pass a 'struct
list_head *compound' to fuse_compound_add(). No need for
fuse_compound_alloc/free().

Alternatively pass a 'void **' to fuse_compound_add(), where the input
args could be copied directly. This has the advantage of not having to
keep the args around, so they could be local to the fill function.
After the request is done the responses would similarly be decoded
into the outargs.

Both approaches have advantages and disadvantages, I don't see a clear winner.

> + compound->op_converters[compound->count] = converter;

What are these converters?

> +ssize_t fuse_compound_send(struct fuse_compound_req *compound)
> +{
> + struct fuse_conn *fc = compound->fm->fc;
> + struct fuse_args args = {
> + .opcode = FUSE_COMPOUND,
> + .in_numargs = 2,
> + .out_numargs = 2,
> + .out_argvar = true,
> + };
> + unsigned int req_count = compound->count;
> + size_t total_expected_out_size = 0;
> + size_t buffer_size = 0;
> + void *resp_payload_buffer;
> + char *buffer_pos;
> + void *buffer = NULL;
> + ssize_t ret;
> + unsigned int i, j;
> +
> + for (i = 0; i < req_count; i++) {
> + struct fuse_args *op_args = compound->op_args[i];
> + size_t needed_size = sizeof(struct fuse_in_header);
> +
> + for (j = 0; j < op_args->in_numargs; j++)
> + needed_size += op_args->in_args[j].size;
> +
> + buffer_size += needed_size;
> +
> + for (j = 0; j < op_args->out_numargs; j++)
> + total_expected_out_size += op_args->out_args[j].size;
> + }
> +
> + buffer = kzalloc(buffer_size, GFP_KERNEL);
> + if (!buffer)
> + return -ENOMEM;
> +
> + buffer_pos = buffer;
> + for (i = 0; i < req_count; i++) {
> + if (compound->op_converters[i]) {
> + ret = compound->op_converters[i](compound, i);
> + if (ret < 0)
> + goto out_free_buffer;
> + }
> +
> + buffer_pos = fuse_compound_build_one_op(fc,
> + compound->op_args[i],
> + buffer_pos, i);
> + }
> +
> + compound->compound_header.result_size = total_expected_out_size;
> +
> + args.in_args[0].size = sizeof(compound->compound_header);
> + args.in_args[0].value = &compound->compound_header;
> + args.in_args[1].size = buffer_size;
> + args.in_args[1].value = buffer;
> +
> + buffer_size = total_expected_out_size +
> + req_count * sizeof(struct fuse_out_header);
> +
> + resp_payload_buffer = kzalloc(buffer_size, GFP_KERNEL);
> + if (!resp_payload_buffer) {
> + ret = -ENOMEM;
> + goto out_free_buffer;
> + }
> +
> + args.out_args[0].size = sizeof(compound->result_header);
> + args.out_args[0].value = &compound->result_header;
> + args.out_args[1].size = buffer_size;
> + args.out_args[1].value = resp_payload_buffer;
> +
> + ret = fuse_simple_request(compound->fm, &args);
> + if (ret < 0)
> + goto fallback_separate;
> +
> + ret = fuse_handle_compound_results(compound, &args);
> + if (ret == 0)
> + goto out;
> +
> +fallback_separate:
> + /* Kernel tries to fallback to separate requests */
> + if (!(compound->compound_header.flags & FUSE_COMPOUND_ATOMIC))
> + ret = fuse_compound_fallback_separate(compound);
> +
> +out:
> + kfree(resp_payload_buffer);
> +out_free_buffer:
> + kfree(buffer);
> + return ret;
> +}

If we go with the list of fuse_args, then all the above logic could go
into the lower layer (dev.c) which already handles fuse_args ->
request -> fuse_args conversion. What's needed is mostly just a loop
that repeats this for all the sub requests.

> +struct fuse_compound_req {
> + struct fuse_mount *fm;
> + struct fuse_compound_in compound_header;
> + struct fuse_compound_out result_header;
> +
> + struct fuse_args **op_args;
> +
> + /*
> + * Every op can add a converter function to construct the ops args from
> + * the already received responses.
> + */
> + int (**op_converters)(struct fuse_compound_req *compound,
> + unsigned int index);
> + int *op_errors;

Can go into fuse_args.

> +
> + unsigned int max_count;
> + unsigned int count;
> +};
> +/*
> + * This is a hint to the fuse server that all requests are complete and it can
> + * use automatic decoding and sequential processing from libfuse.
> + */
> +#define FUSE_COMPOUND_SEPARABLE (1 << 0)

We really need per sub-request flags, not per-compound flags.

I.e:

FUSE_SUB_IS_ENTRY - this sub request will return a new entry on
success (nodeid, filehandle)
FUSE_SUB_DEP_ENTRY - this sub request depends on the result of a previous lookup

> +/*
> + * This will be used by the kernel to continue on
> + * even after one of the requests fail.
> + */
> +#define FUSE_COMPOUND_CONTINUE (1 << 1)

Again, I think it makes no sense to have compound-global flags, since
it might be possible that there are several sub-requests and there are
different dependencies between for each of them.

Also if there are no examples of a certain flag in this patchset, then
it's better to just leave it out and add it together with the actual
user.

> +/*
> + * This flags the compound as atomic, which
> + * means that the operation has to be interpreted
> + * atomically and be directly supported by the fuse server
> + * itself.
> + */
> +#define FUSE_COMPOUND_ATOMIC (1 << 2)

Why would this be needed? The VFS provides the locking that ensures
atomicity, even if the implementation is not atomic. At least for
local filesystems that is always the case.

> +
> +/*
> + * Compound request header
> + *
> + * This header is followed by the fuse requests
> + */
> +struct fuse_compound_in {
> + uint32_t flags; /* Compound flags */

Not needed.

> +
> + /* Total size of all results expected from the fuse server.
> + * This is needed for preallocating the whole result for all
> + * commands in the fuse server.
> + */
> + uint32_t result_size;

Please drop this. I think libfuse can allocate separate buffers for
each sub-request's out arg and hand a vector of these to the transport
layer.

> + uint64_t reserved;

So it turns out the compound header is empty. Not a problem, just
make it contain 'uint64_t reserved[2]' for future use.

> +};
> +
> +/*
> + * Compound response header
> + *
> + * This header is followed by complete fuse responses
> + */
> +struct fuse_compound_out {
> + uint32_t flags; /* Result flags */

What is this for?

> + uint32_t padding;
> + uint64_t reserved;
> +};

Thanks,
Miklos