Re: [PATCH v2] fuse: replace passthrough backing-id IDR with a counter and hashtable
From: Amir Goldstein
Date: Fri Jun 26 2026 - 10:38:50 EST
On Fri, Jun 26, 2026 at 2:51 PM Mahad Ibrahim
<mahad.ibrahim.dev@xxxxxxxxx> wrote:
>
> The passthrough backing-id IDR carries a /* FIXME: xarray might be space
> inefficient */ on its idr_alloc_cyclic() call. Replace it with a cyclic
> counter for allocation and a hashtable for lookup.
>
> The hashtable keys are already the set of live ids, so allocation needs
> only a counter: it takes the next value, skipping any already present,
> all under fc->lock. A live-id count makes it return -ENOSPC when full.
>
> The hashtable is allocated lazily when passthrough is negotiated, so it is
> present whenever passthrough is enabled; every dereference is already gated
> on fc->passthrough, so no NULL checks are added. On allocation failure
> passthrough is left off and the mount continues.
>
> Lookup is now O(chain) rather than the IDR's bounded walk, but runs only at
> backing setup, off the read/write path.
>
> Testing
>
> Tested under KASAN and KCSAN in separate builds, since the two cannot be
> enabled together. The workload was passthrough_hp under fio with parallel
> open/close churn; ftrace confirmed concurrent allocation, lookup, and
> removal across CPUs. No KASAN use-after-free and no KCSAN data race in
> fs/fuse/backing.c.
>
> Signed-off-by: Mahad Ibrahim <mahad.ibrahim.dev@xxxxxxxxx>
>
> ---
>
> v2:
> - drop the IDA; allocate with a cyclic counter and resolve collisions
> via the hashtable's membership check under fc->lock (Miklos)
> - add a live-id count to bound allocation and return -ENOSPC when full
>
> Uniqueness now comes from the hashtable rather than an allocator that owns
> the id space, so this should stay clear of any future 64-bit or
> user-assigned-id work. The id is kept 32-bit and kernel-allocated here.
>
> Lookup uses 256 fixed buckets, so chains lengthen at very high backing
> counts; an rhashtable would keep it flat but adds complexity. The fixed
> table seemed simpler for the expected counts - happy to switch if you'd
> prefer.
>
> fs/fuse/backing.c | 98 +++++++++++++++++++++++++++++++++++++----------
> fs/fuse/fuse_i.h | 31 +++++++++++++--
> fs/fuse/inode.c | 8 ++--
> 3 files changed, 111 insertions(+), 26 deletions(-)
>
> diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
> index 472b6afa7dff..77f4d4b00bf4 100644
> --- a/fs/fuse/backing.c
> +++ b/fs/fuse/backing.c
> @@ -35,49 +35,99 @@ void fuse_backing_put(struct fuse_backing *fb)
>
> void fuse_backing_files_init(struct fuse_conn *fc)
> {
> - idr_init(&fc->backing_files_map);
> + fc->backing_ids_count = 0;
> + fc->backing_files_next_id = 1;
looks like your first alloc will be 2.
> + fc->backing_htable = NULL;
> }
>
> static int fuse_backing_id_alloc(struct fuse_conn *fc, struct fuse_backing *fb)
> {
> int id;
> + struct fuse_backing *iterator;
>
> - idr_preload(GFP_KERNEL);
> spin_lock(&fc->lock);
> - /* FIXME: xarray might be space inefficient */
> - id = idr_alloc_cyclic(&fc->backing_files_map, fb, 1, 0, GFP_ATOMIC);
> +
> + if (fc->backing_ids_count == INT_MAX) {
> + id = -ENOSPC;
> + goto out;
> + }
> +
> +retry:
> + id = fc->backing_files_next_id;
> + fc->backing_files_next_id = (id == INT_MAX) ? 1 : id + 1;
> +
> + hash_for_each_possible(fc->backing_htable->backing_files_ht,
> + iterator, node, id) {
> + if (iterator->id == id)
> + goto retry;
> + }
The hash table change is good but this allocator I really hate.
I know that Miklos proposed this, but the way it is implemented can
cause soft lockup and in general if you are implementing an allocator
you are probably doing it wrong.
Also I am concerned that with Joanne's backing inode patches,
backing id maps may be a lot more dense and backing id's a lot more
long lived than with backing id's for open/close lifecycle.
I think it was a mistake to provide a kernel allocator for backing ids -
the server is responsible for every other aspect of managing backing files.
Can we still salvage this UAPI?
Maybe:
1. Do not goto retry, instead return -EAGAIN let userspace retry
2. fuse_passthrough_open() can implement the retry loop
3. At least until INT_MAX, EAGAIN is not expected and for existing servers
with short file open/close lifecycles, EAGAIN should be rare
4. Server/libfuse know exactly which backing ids are allocated,
so they can use whatever algorithm they like for efficient allocation
5. All we need to do is support server allocated backing id
and check it against the hash table lookup
+#define FUSE_BACKING_FLAG_SERVER_ID 1
+
struct fuse_backing_map {
int32_t fd;
uint32_t flags;
- uint64_t padding;
+ uint64_t id;
};
It should be a very minimal addition to implement this.
BTW, we make it 64bit for future extensions (such as backing stripe)
although OPEN response still supports only 32bit backing ids.
Will this cause regressions?
Maybe, but I think that the existing users of fuse passthrough are
still pretty close to upstream and not yet in distributed servers (?)
so I hope this will be fine - if not, there are solutions.
Miklos, do you agree?
> +
> + fb->id = id;
> + hash_add_rcu(fc->backing_htable->backing_files_ht, &fb->node, id);
> + fc->backing_ids_count++;
> +
> +out:
> spin_unlock(&fc->lock);
> - idr_preload_end();
>
> - WARN_ON_ONCE(id == 0);
> return id;
> }
>
> +int fuse_backing_htable_alloc(struct fuse_conn *fc)
> +{
> + struct fuse_backing_htable *ht;
> +
> + ht = kzalloc_obj(struct fuse_backing_htable);
> + if (!ht)
> + return -ENOMEM;
> +
> + hash_init(ht->backing_files_ht);
> + fc->backing_htable = ht;
> +
> + return 0;
> +}
> +
> static struct fuse_backing *fuse_backing_id_remove(struct fuse_conn *fc,
> int id)
> {
> - struct fuse_backing *fb;
> + struct fuse_backing *iterator;
> + struct fuse_backing *fb = NULL;
>
> spin_lock(&fc->lock);
> - fb = idr_remove(&fc->backing_files_map, id);
> + hash_for_each_possible(fc->backing_htable->backing_files_ht,
> + iterator, node, id) {
> + if (iterator->id == id) {
> + hash_del_rcu(&iterator->node);
> + fb = iterator;
> + break;
This fb may very likely be alive and referenced by an open
file, so the very least is to set fb->id = 0, to make it anonymous,
so debug prints won't show different fb's with the same id.
> + }
> + }
> +
> + if (fb)
> + fc->backing_ids_count--;
> +
> spin_unlock(&fc->lock);
>
> return fb;
> }
>
> -static int fuse_backing_id_free(int id, void *p, void *data)
> +void fuse_backing_files_free(struct fuse_conn *fc)
> {
> - struct fuse_backing *fb = p;
> + struct fuse_backing *fb;
> + struct hlist_node *tmp;
> + int bkt;
>
> - WARN_ON_ONCE(refcount_read(&fb->count) != 1);
> - fuse_backing_free(fb);
> - return 0;
> -}
> + if (!fc->backing_htable)
> + return;
>
> -void fuse_backing_files_free(struct fuse_conn *fc)
> -{
> - idr_for_each(&fc->backing_files_map, fuse_backing_id_free, NULL);
> - idr_destroy(&fc->backing_files_map);
> + hash_for_each_safe(fc->backing_htable->backing_files_ht,
> + bkt, tmp, fb, node) {
> + hash_del_rcu(&fb->node);
> + WARN_ON_ONCE(refcount_read(&fb->count) != 1);
> + fuse_backing_free(fb);
> + }
> +
> + kfree(fc->backing_htable);
> + fc->backing_htable = NULL;
> }
>
> int fuse_backing_open(struct fuse_conn *fc, struct fuse_backing_map *map)
> @@ -169,10 +219,18 @@ int fuse_backing_close(struct fuse_conn *fc, int backing_id)
>
> struct fuse_backing *fuse_backing_lookup(struct fuse_conn *fc, int backing_id)
> {
> - struct fuse_backing *fb;
> + struct fuse_backing *iterator;
> + struct fuse_backing *fb = NULL;
>
> rcu_read_lock();
> - fb = idr_find(&fc->backing_files_map, backing_id);
> + hash_for_each_possible_rcu(fc->backing_htable->backing_files_ht,
> + iterator, node, backing_id) {
> + if (iterator->id == backing_id) {
> + fb = iterator;
> + break;
> + }
> + }
> +
> fb = fuse_backing_get(fb);
> rcu_read_unlock();
>
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index 85f738c53122..db4c07a496f0 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -30,6 +30,7 @@
> #include <linux/pid_namespace.h>
> #include <linux/refcount.h>
> #include <linux/user_namespace.h>
> +#include <linux/hashtable.h>
>
> /** Default max number of pages that can be used in a single read request */
> #define FUSE_DEFAULT_MAX_PAGES_PER_REQ 32
> @@ -94,6 +95,10 @@ struct fuse_backing {
> /* refcount */
> refcount_t count;
> struct rcu_head rcu;
> +
> + /* ID Allocation */
> + int id;
move this to the hole next to count.
Mahad,
I am guessing that you are a student doing a summer project?
and you picked up on the FIXME comment?
It's fine, no complaints about that, but TBH, I don't know that there
is huge value to real life users
from making this change, because the measurements were not done on
real life workloads.
However, I do know that there are real users who would benefit from moving the
allocator to userspace, so by slightly changing your plan you could also help
real users and learn how real upstream development works.
Thanks,
Amir.