Re: [RFC PATCH] ovl: keep merged and impure readdir caches separate

From: Nirmoy Das

Date: Tue May 12 2026 - 14:30:20 EST


Hi Amir,

On 11.05.26 23:54, Amir Goldstein wrote:
Hi Nirmoy,

Thanks for the patch.

On Mon, May 11, 2026 at 8:21 AM Nirmoy Das <nirmoyd@xxxxxxxxxx> wrote:
Overlayfs uses one inode cache slot for two readdir cache users with
different lifetime rules. Merged directory iteration pins the cache from
open directory files with cache->refcount. Impure real-directory iteration
uses the inode cache as an unrefcounted lookup table.

Those caches cannot be reused interchangeably. If merged iteration finds
an impure cache in the inode slot, it can pin and seek through a cache
that was not built for merged iteration. If impure iteration finds a merged
cache, it can walk an object whose lifetime is controlled by open directory
files. Either direction can leave ovl_iterate() using stale cache entries.

Add ovl_dir_cache_drop() to detach the inode cache before freeing it. Keep
refcounted merged caches alive until ovl_cache_put(), stop publishing new
merged caches through the inode slot,
This is unacceptable - invalidating merged dir cache to paper over
another root bug, which was not really fixed.

and let impure iteration reuse only
unrefcounted caches.
All those guards are nice, but how does a directory inode change from
being merged to impure or vice versa?
This should never happen.

It took a lot of arguing with Sonnet about wrong leads to find the real bug.

Real bug was introduced by:
b79e05aaa1667 ("ovl: no direct iteration for dir with origin xattr")

It changed the test from:

od->is_real = !OVL_TYPE_MERGE(type);
od->is_upper = OVL_TYPE_UPPER(type);

to:

od->is_real = !ovl_test_flag(OVL_WHITEOUTS, d_inode(dir));
od->is_upper = OVL_TYPE_UPPER(type);

But there is a race window in copy up of a directory where
upper is set and published before the OVL_WHITEOUTS flag
is set.

an opendir() observing this state inside the race window will
wrongly start to iterate_real and another opendir later will
observe the flag and start iterate_merged - boom!

Attached is what I think is a correct fix.
WDYT?

This looks correct with my limited understanding of the overlay code. I still see the issue when running with

the syzkaller-derived C reproducer in a loop inside an arm64 virtme/qemu VM with a KASAN/debug kernel.


Let me try to debug it more and get back.

Regards,

Nirmoy


Thanks,
Amir.