Re: [PATCH] btrfs: keep `priv` struct on stack for sync reads in `btrfs_encoded_read_regular_fill_pages()`

From: Daniel Vacek
Date: Thu Jan 09 2025 - 03:32:17 EST


On Wed, 8 Jan 2025 at 19:27, David Sterba <dsterba@xxxxxxx> wrote:
>
> On Wed, Jan 08, 2025 at 04:24:19PM +0100, Daniel Vacek wrote:
> > On Wed, 8 Jan 2025 at 13:42, Johannes Thumshirn
> > <Johannes.Thumshirn@xxxxxxx> wrote:
> > > On 08.01.25 12:44, Daniel Vacek wrote:
> > > > Only allocate the `priv` struct from slab for asynchronous mode.
> > > >
> > > > There's no need to allocate an object from slab in the synchronous mode. In
> > > > such a case stack can be happily used as it used to be before 68d3b27e05c7
> > > > ("btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()")
> > > > which was a preparation for the async mode.
> > > >
> > > > While at it, fix the comment to reflect the atomic => refcount change in
> > > > d29662695ed7 ("btrfs: fix use-after-free waiting for encoded read endios").
> > >
> > > Generally I'm not a huge fan of conditional allocation/freeing. It just
> > > complicates matters. I get it in case of the bio's bi_inline_vecs where
> > > it's a optimization, but I fail to see why it would make a difference in
> > > this case.
> > >
> > > If we're really going down that route, there should at least be a
> > > justification other than "no need" to.
> >
> > Well the main motivation was not to needlessly exercise the slab
> > allocator when IO uring is not used. It is a bit of an overhead,
> > though the object is not really big so I guess it's not a big deal
> > after all (the slab should manage just fine even under low memory
> > conditions).
> >
> > 68d3b27e05c7 added the allocation for the async mode but also changed
> > the original behavior of the sync mode which was using stack before.
> > The async mode indeed requires the allocation as the object's lifetime
> > extends over the function's one. The sync mode is perfectly contained
> > within as it always was.
> >
> > Simply, I tend not to do any allocations which are not strictly
> > needed.
>
> Nothing wrong about that and I think in kernel it's preferred to avoid
> allocations in such cases. The sync case is calling the ioctl and
> repeatedly too, so reducing the allocator overhead makes sense, at the
> slight cost of code complexity. The worst case is when allocator has to
> free memory by flushing or blocking the tasks, so we're actively
> avoiding and reducing the chances for that.
>
> > If you prefer to simply allocate the object unconditionally,
> > we can just drop this patch.
> >
> > > > struct btrfs_encoded_read_private {
> > > > - struct completion done;
> > > > + struct completion *sync_reads;
> > > > void *uring_ctx;
> > > > - refcount_t pending_refs;
> > > > + refcount_t pending_reads;
> > > > blk_status_t status;
> > > > };
> > >
> > > These renames just make the diff harder to read (and yes I shouldn't
> > > have renamed pending to pending_refs but that at least also changed the
> > > type).
> >
> > I see. But also the completion is changed to a pointer here for a
> > reason and I tried to make it clear it's only used for sync reads,
> > hence the rename.
>
> Functional and cleanup/cosmetic changes are better kept separate. It's
> keeping the scope of changes minimal and usually the fixes can get
> backported so the cleanup changes are not wanted.

Right, I see. Noted.

Though in this case there is no fix and I don't see a reason for
backporting this one.
But I guess you were talking about eventual other possible fixes in the area.

> > > > - if (refcount_dec_and_test(&priv->pending_refs)) {
> > > > - int err = blk_status_to_errno(READ_ONCE(priv->status));
> > > > -
> > > > + if (refcount_dec_and_test(&priv->pending_reads)) {
> > > > if (priv->uring_ctx) {
> > > > + int err = blk_status_to_errno(READ_ONCE(priv->status));
> > >
> > > Missing newline after the declaration.
> >
> > Right. Clearly I did not run checkpatch before sending. Sorry.
> > I was just trying to match this block to the same one later, which did
> > not have the newline. (But it also did not have the declaration
> > before.)
>
> We don't stick to what checkpatch reports, I'm fixing a lot of coding
> style when merging patches that would not pass checkpatch, the general
> CodingStyle applies but we have a local fs/btrfs style too, loosely
> documented at https://btrfs.readthedocs.io/en/latest/dev/Development-notes.html .

Bookmarked, thanks.

--nX

On Wed, 8 Jan 2025 at 19:27, David Sterba <dsterba@xxxxxxx> wrote:
>
> On Wed, Jan 08, 2025 at 04:24:19PM +0100, Daniel Vacek wrote:
> > On Wed, 8 Jan 2025 at 13:42, Johannes Thumshirn
> > <Johannes.Thumshirn@xxxxxxx> wrote:
> > > On 08.01.25 12:44, Daniel Vacek wrote:
> > > > Only allocate the `priv` struct from slab for asynchronous mode.
> > > >
> > > > There's no need to allocate an object from slab in the synchronous mode. In
> > > > such a case stack can be happily used as it used to be before 68d3b27e05c7
> > > > ("btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()")
> > > > which was a preparation for the async mode.
> > > >
> > > > While at it, fix the comment to reflect the atomic => refcount change in
> > > > d29662695ed7 ("btrfs: fix use-after-free waiting for encoded read endios").
> > >
> > > Generally I'm not a huge fan of conditional allocation/freeing. It just
> > > complicates matters. I get it in case of the bio's bi_inline_vecs where
> > > it's a optimization, but I fail to see why it would make a difference in
> > > this case.
> > >
> > > If we're really going down that route, there should at least be a
> > > justification other than "no need" to.
> >
> > Well the main motivation was not to needlessly exercise the slab
> > allocator when IO uring is not used. It is a bit of an overhead,
> > though the object is not really big so I guess it's not a big deal
> > after all (the slab should manage just fine even under low memory
> > conditions).
> >
> > 68d3b27e05c7 added the allocation for the async mode but also changed
> > the original behavior of the sync mode which was using stack before.
> > The async mode indeed requires the allocation as the object's lifetime
> > extends over the function's one. The sync mode is perfectly contained
> > within as it always was.
> >
> > Simply, I tend not to do any allocations which are not strictly
> > needed.
>
> Nothing wrong about that and I think in kernel it's preferred to avoid
> allocations in such cases. The sync case is calling the ioctl and
> repeatedly too, so reducing the allocator overhead makes sense, at the
> slight cost of code complexity. The worst case is when allocator has to
> free memory by flushing or blocking the tasks, so we're actively
> avoiding and reducing the chances for that.
>
> > If you prefer to simply allocate the object unconditionally,
> > we can just drop this patch.
> >
> > > > struct btrfs_encoded_read_private {
> > > > - struct completion done;
> > > > + struct completion *sync_reads;
> > > > void *uring_ctx;
> > > > - refcount_t pending_refs;
> > > > + refcount_t pending_reads;
> > > > blk_status_t status;
> > > > };
> > >
> > > These renames just make the diff harder to read (and yes I shouldn't
> > > have renamed pending to pending_refs but that at least also changed the
> > > type).
> >
> > I see. But also the completion is changed to a pointer here for a
> > reason and I tried to make it clear it's only used for sync reads,
> > hence the rename.
>
> Functional and cleanup/cosmetic changes are better kept separate. It's
> keeping the scope of changes minimal and usually the fixes can get
> backported so the cleanup changes are not wanted.
>
> > > > - if (refcount_dec_and_test(&priv->pending_refs)) {
> > > > - int err = blk_status_to_errno(READ_ONCE(priv->status));
> > > > -
> > > > + if (refcount_dec_and_test(&priv->pending_reads)) {
> > > > if (priv->uring_ctx) {
> > > > + int err = blk_status_to_errno(READ_ONCE(priv->status));
> > >
> > > Missing newline after the declaration.
> >
> > Right. Clearly I did not run checkpatch before sending. Sorry.
> > I was just trying to match this block to the same one later, which did
> > not have the newline. (But it also did not have the declaration
> > before.)
>
> We don't stick to what checkpatch reports, I'm fixing a lot of coding
> style when merging patches that would not pass checkpatch, the general
> CodingStyle applies but we have a local fs/btrfs style too, loosely
> documented at https://btrfs.readthedocs.io/en/latest/dev/Development-notes.html .