Re: [PATCH] bcachefs: Relocate rcu_read_lock()/rcu_read_unlock() to snapshot_t()

From: Kent Overstreet
Date: Sat Sep 21 2024 - 12:33:04 EST


On Thu, Sep 19, 2024 at 10:36:03AM GMT, Ahmed Ehab wrote:
> Syzbot reports a problem that a warning is triggered due to suspicious
> use of rcu_dereference_check(). That is triggered by a call of
> bch2_snapshot_tree_oldest_subvol().
>
> The cause of the warning is that the rcu_read_lock() is called in wrapper
> methods instead of calling it directly before calling rcu_dereference()
> in snapshot_t().For example in this case, snapshot_t() is called
> directly from bch2_snapshot_tree_oldest_subvol() without holding the
> read lock. This also results in duplicating the rcu_read_lock()
> and rcu_read_unlock() calls, which may lead to future errors in the case
> of forgetting to hold the read locks as in this case.
>
> To fix this, move rcu_read_lock() and rcu_read_unlock() to snapshot_t().
> This will make sure that rcu_dereference_check() is never called without
> holding the read lock.
>
> Reported-by: <syzbot+f7c41a878676b72c16a6@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Ahmed Ehab <bottaawesome633@xxxxxxxxx>
> ---
> fs/bcachefs/snapshot.h | 26 +++++++-------------------
> 1 file changed, 7 insertions(+), 19 deletions(-)
>
> diff --git a/fs/bcachefs/snapshot.h b/fs/bcachefs/snapshot.h
> index eb5ef64221d6..04f18fac56fe 100644
> --- a/fs/bcachefs/snapshot.h
> +++ b/fs/bcachefs/snapshot.h
> @@ -42,15 +42,19 @@ static inline struct snapshot_t *__snapshot_t(struct snapshot_table *t, u32 id)
>
> static inline const struct snapshot_t *snapshot_t(struct bch_fs *c, u32 id)
> {
> - return __snapshot_t(rcu_dereference(c->snapshots), id);
> + struct snapshot_table *temp;
> +
> + rcu_read_lock();
> + temp = rcu_dereference(c->snapshots);
> + rcu_read_unlock();
> +
> + return __snapshot_t(temp, id);

This is very wrong - as in, you need to study up on how RCU works.

We need to be holding rcu_read_lock() while we're accessing the object
we got to from the rcu pointer, so rcu_read_lock() always needs to be
taken by the caller.