Re: [PATCH] isofs: fix inode leak caused by disconnected dentries from exportfs

From: Deepanshu Kartikey
Date: Wed Oct 01 2025 - 16:27:33 EST


Hi Jan,

Thank you for the review. You're right: my initial explanation was incorrect. I've since done more debugging to understand the actual mechanism causing the leak.

Root Cause Analysis
===================

The leak occurs specifically with CONFIG_JOLIET=y through the following sequence:

1. Joliet Root Switching During Mount
--------------------------------------

In isofs_fill_super(), when Joliet extensions are detected:
- Primary root inode 1792 is created with i_count=1, i_nlink=3
- During Joliet switching, iput(inode) is called on inode 1792
- i_count decrements to 0, but generic_drop_inode() returns false (i_nlink=3 > 0)
- Inode 1792 remains cached at i_count=0
- New Joliet root inode 1920 is created and attached to sb->s_root

Debugging output:
[9.653617] isofs: switching roots, about to iput ino=1792, i_count=1
[9.653676] isofs: after iput, getting new root
[9.653880] isofs: old inode after iput ino=1792, i_count=0, i_nlink=3
[9.654219] isofs: got new root ino=1920, i_count=1

2. open_by_handle_at() Triggers Reconnection
---------------------------------------------

When the system call attempts to resolve a file handle:
- exportfs_decode_fh_raw() calls fh_to_dentry(), which returns a dentry for inode 1856
- The dentry is marked DCACHE_DISCONNECTED
- reconnect_path() is invoked to connect the path to root
- This calls reconnect_one() to walk up the directory tree

3. Reference Accumulation in reconnect_one()
---------------------------------------------

I instrumented reconnect_one() to track dentry reference counts:

[8.010398] reconnect_one: called for inode 1856
[8.010735] isofs: __isofs_iget got inode 1792, i_count=1
[8.011041] After fh_to_parent: d_count=1
[8.011319] After exportfs_get_name: d_count=2
[8.011769] After lookup_one_unlocked: d_count=3

The parent dentry (inode 1792) accumulates 3 references:
1. Initial reference from fh_to_parent()
2. Additional reference taken by exportfs_get_name()
3. Another reference taken by lookup_one_unlocked()

Then lookup_one_unlocked() creates a dentry for inode 1807:
[8.015179] isofs: __isofs_iget got inode 1807, i_count=1
[8.016169] lookup returns tmp (inode 1807), d_count=1

The code enters the tmp != dentry branch and calls dput(tmp), then goes to
out_reconnected.

4. Insufficient Cleanup
-----------------------

For inode 1807, I traced through dput():
[10.083359] fast_dput: lockref_put_return returned 0
[10.083699] fast_dput: RETAINING dentry for inode 1807, d_flags=0x240043

The dentry refcount goes to 0, but retain_dentry() returns true because of
the DCACHE_REFERENCED flag (0x40 in 0x240043). The dentry is kept in cache
with refcount 0, still holding the inode reference.

For inode 1792:
[10.084125] fast_dput: lockref_put_return returned 2

At out_reconnected, only one dput(parent) is called, decrementing from 3 to 2.
Two references remain leaked.

5. Unmount Failure
------------------

At unmount time:
- shrink_dcache_for_umount() doesn't evict dentries that still have a positive refcount (inode 1792)
- It also doesn't aggressively evict retained dentries sitting at refcount 0 (inode 1807)
- As a result, both inodes are reported as leaked with i_count=1

[10.155385] LEAKED INODE: ino=1807, i_count=1, i_state=0x0, i_nlink=1
[10.155604] LEAKED INODE: ino=1792, i_count=1, i_state=0x0, i_nlink=1

Why shrink_dcache_sb() Works
=============================

Calling shrink_dcache_sb() in isofs_put_super() forces eviction of:
- Dentries with extra references that weren't properly released
- Retained dentries sitting in cache at refcount 0

This ensures cleanup happens before the superblock is destroyed.

Open Questions
==============

1. Are exportfs_get_name() and lookup_one_unlocked() supposed to take
references to the parent dentry that the caller must release? Should
reconnect_one() be calling dput(parent) multiple times, or are these
functions leaking references?

2. Is adding shrink_dcache_sb() in put_super() the appropriate fix, or
should this be handled in the reconnection error path when
reconnect_one() fails?

3. Does this indicate a broader issue with how exportfs handles parent
dentry references during failed path reconnections that might affect
other filesystems?

I can investigate further into the implementation of exportfs_get_name()
and lookup_one_unlocked() to understand where exactly the extra references
are taken if that would be helpful.

Best regards,
Deepanshu Kartikey