Re: [PATCH] ceph: fix refcount leak in write_folio_nounlock()

From: Viacheslav Dubeyko

Date: Tue Jun 30 2026 - 11:24:42 EST


On Thu, 2026-06-11 at 22:34 +0800, WenTao Liang wrote:
> write_folio_nounlock() unconditionally increments
> fsc->writeback_count before allocating an OSD request.  If an
> early error causes the function to return without queuing an
> active write, the counter is never decremented, leaking a
> reference and making the filesystem appear permanently
> congested.  Three such paths exist:
>
> - ceph_osdc_new_request() fails: the folio is redirtied, but
>   writeback_count remains incremented.
>
> - After the request is allocated, the fscrypt bounce page
>   allocation fails.  The function ends writeback on the folio
>   and releases the request, but does not drop the
>   writeback_count reference.
>
> - The write is interrupted by a signal (e.g. -ERESTARTSYS).
>   The folio is redirtied and writeback is ended, yet again the
>   counter is left elevated.
>
> Fix the leaks by adding an atomic_long_dec() in each of these
> early return paths, balancing the initial inc.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 6390987f2f4c ("ceph: fold ceph_sync_writepages into
> writepage_nounlock")
> Signed-off-by: WenTao Liang <vulab@xxxxxxxxxxx>
> ---
>  fs/ceph/addr.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 0a86f672cc09..dac2b0ae7d37 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -790,6 +790,7 @@ static int write_folio_nounlock(struct folio
> *folio,
>       ceph_wbc.truncate_size, true);
>   if (IS_ERR(req)) {
>   folio_redirty_for_writepage(wbc, folio);
> + atomic_long_dec(&fsc->writeback_count);
>   return PTR_ERR(req);
>   }
>  
> @@ -809,6 +810,7 @@ static int write_folio_nounlock(struct folio
> *folio,
>   folio_redirty_for_writepage(wbc, folio);
>   folio_end_writeback(folio);
>   ceph_osdc_put_request(req);
> + atomic_long_dec(&fsc->writeback_count);
>   return PTR_ERR(bounce_page);
>   }
>   }
> @@ -847,6 +849,7 @@ static int write_folio_nounlock(struct folio
> *folio,
>         ceph_vinop(inode), folio);
>   folio_redirty_for_writepage(wbc, folio);
>   folio_end_writeback(folio);
> + atomic_long_dec(&fsc->writeback_count);
>   return err;
>   }
>   if (err == -EBLOCKLISTED)

Maybe I am missing something, but I have the feeling that we have
already received a similar patch and discussed this solution. If I
remember correctly, I recommended using this pattern:

    if (atomic_long_dec_return(&fsc->writeback_count) <
        CONGESTION_OFF_THRESH(fsc->mount_options->congestion_kb))
            fsc->write_congested = false;
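
To illustrate why a bare atomic_long_dec() may not be enough, here is a
minimal userspace sketch of the counter/flag pairing. It is not kernel
code: struct fsc_model, the model_*() helpers, and the plain on_thresh /
off_thresh fields are hypothetical stand-ins for struct ceph_fs_client,
the write_folio_nounlock() paths, and the CONGESTION_ON_THRESH() /
CONGESTION_OFF_THRESH() macros. It also assumes that the increment side
sets write_congested once the counter exceeds the "on" threshold.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two fields discussed in this thread. */
struct fsc_model {
	atomic_long writeback_count;
	bool write_congested;
	long on_thresh;   /* assumed stand-in for CONGESTION_ON_THRESH(...) */
	long off_thresh;  /* assumed stand-in for CONGESTION_OFF_THRESH(...) */
};

static void model_init(struct fsc_model *fsc, long on, long off)
{
	atomic_init(&fsc->writeback_count, 0);
	fsc->write_congested = false;
	fsc->on_thresh = on;
	fsc->off_thresh = off;
}

/* Assumed shape of the increment side: count up, maybe mark congested. */
static void model_writeback_start(struct fsc_model *fsc)
{
	/* atomic_fetch_add() returns the old value, so add 1 for the new one. */
	if (atomic_fetch_add(&fsc->writeback_count, 1) + 1 > fsc->on_thresh)
		fsc->write_congested = true;
}

/* Early-error path using the recommended pattern: decrement and re-check. */
static void model_writeback_abort(struct fsc_model *fsc)
{
	if (atomic_fetch_sub(&fsc->writeback_count, 1) - 1 < fsc->off_thresh)
		fsc->write_congested = false;
}

/* Early-error path with only a bare decrement, as in the posted patch. */
static void model_writeback_abort_bare(struct fsc_model *fsc)
{
	atomic_fetch_sub(&fsc->writeback_count, 1);
}
```

In this model, with on_thresh = 4 and off_thresh = 3, five calls to
model_writeback_start() set write_congested. Undoing them with
model_writeback_abort() clears the flag once the counter drops below
off_thresh, while undoing them with model_writeback_abort_bare() brings
the counter back to zero but leaves write_congested set.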

Thanks,
Slava.