Re: [PATCH net 1/1] hv_netvsc: Calculate correct ring size when PAGE_SIZE is not 4 Kbytes

From: Simon Horman
Date: Wed Jan 24 2024 - 05:30:24 EST


On Tue, Jan 23, 2024 at 05:13:12PM +0000, Michael Kelley wrote:
> From: Simon Horman @ 2024-01-22 20:49 UTC
> >
> > On Mon, Jan 22, 2024 at 08:20:28AM -0800, mhkelley58@xxxxxxxxx wrote:
> > > From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> > >
> > > Current code in netvsc_drv_init() incorrectly assumes that PAGE_SIZE
> > > is 4 Kbytes, which is wrong on ARM64 with 16K or 64K page size. As a
> > > result, the default VMBus ring buffer size on ARM64 with 64K page size
> > > is 8 Mbytes instead of the expected 512 Kbytes. While this doesn't break
> > > anything, a typical VM with 8 vCPUs and 8 netvsc channels wastes 120
> > > Mbytes (8 channels * 2 ring buffers/channel * 7.5 Mbytes/ring buffer).
> > >
> > > Unfortunately, the module parameter specifying the ring buffer size
> > > is in units of 4 Kbyte pages. Ideally, it should be in units that
> > > are independent of PAGE_SIZE, but backwards compatibility prevents
> > > changing that now.
> > >
> > > Fix this by having netvsc_drv_init() hardcode 4096 instead of using
> > > PAGE_SIZE when calculating the ring buffer size in bytes. Also
> > > use the VMBUS_RING_SIZE macro to ensure proper alignment when running
> > > with page size larger than 4K.
> > >
> > > Cc: <stable@xxxxxxxxxxxxxxx> # 5.15.x
> > > Signed-off-by: Michael Kelley <mhklinux@xxxxxxxxxxx>
> >
> > Hi Michael,
> >
> > As a bug fix this probably warrants a fixes tag.
> > Perhaps this is appropriate?
> >
> > Fixes: 450d7a4b7ace ("Staging: hv: ring parameter")
> >
>
> [This email is cobbled together because for some reason I didn't directly
> receive your original reply. So it won't thread correctly with yours.]
>
> I thought about a Fixes: tag, but the situation is a bit weird. The original
> code was correct enough at the time it was written in 2010 because Hyper-V
> only ran on x86/x64 with a 4 Kbyte guest page size. In fact, all the Hyper-V
> guest code in the Linux kernel tended to assume a 4 Kbyte page size.
> During 2019 and 2020, I and others made changes to remove this
> assumption, in prep for running Hyper-V Linux guests on ARM64. The
> ARM64 support was finally enabled with commit 7aff79e297ee in August
> 2021 for the 5.15 kernel. Somehow we missed fixing this case in the netvsc
> driver, and a similar case in the Hyper-V synthetic storage driver (see [1]).
>
> As a result, there's no point in backporting this fix to anything earlier than
> 5.15, because there's no ARM64 support for Hyper-V guests in earlier kernels.
> So picking a "Fixes:" commit from back in 2010 doesn't seem helpful. I could
> see doing
>
> Fixes: 7aff79e297ee ("Drivers: hv: Enable Hyper-V code to be built on ARM64")
>
> But the connection between that commit and this fix isn't very evident, so I
> opted for just putting the 5.15.x notation on the Cc: stable@xxxxxxxxxxxxxxx
> line. That said, I don't feel strongly about it. I'm just trying to do what's best
> for the stable branch maintainers and avoid generating backports to kernel
> versions where it doesn't matter.

Thanks for the explanation.

FWIW, I would probably have gone for the tag above (7aff79e297ee)
as presumably that is when the bug started manifesting.
But I appreciate that it isn't straightforward.
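
For anyone checking the numbers in the commit message, here is a rough
sketch of the arithmetic. It is not the driver code. It assumes a default
ring size of 128 (inferred from 512 Kbytes / 4 Kbytes), and it leaves out
the alignment that VMBUS_RING_SIZE adds. All function and constant names
are made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption: the module parameter counts 4 Kbyte units, per the commit message.
HV_UNIT_BYTES = 4096
# Assumption: default of 128 units, inferred from 512 Kbytes / 4 Kbytes.
RING_SIZE_UNITS = 128


def ring_bytes_before_fix(page_size):
    """Ring size as described for the old code: parameter times PAGE_SIZE."""
    return RING_SIZE_UNITS * page_size


def ring_bytes_after_fix():
    """Ring size as described for the fix: parameter times 4096.

    The extra alignment done by VMBUS_RING_SIZE is deliberately omitted.
    """
    return RING_SIZE_UNITS * HV_UNIT_BYTES


def wasted_bytes(page_size, channels=8, rings_per_channel=2):
    """Memory wasted across all channels for a given guest PAGE_SIZE."""
    per_ring = ring_bytes_before_fix(page_size) - ring_bytes_after_fix()
    return channels * rings_per_channel * per_ring


if __name__ == "__main__":
    for page_size in (4 * KIB, 16 * KIB, 64 * KIB):
        print(
            f"PAGE_SIZE={page_size // KIB}K "
            f"ring={ring_bytes_before_fix(page_size) // KIB}K "
            f"wasted={wasted_bytes(page_size) / MIB} Mbytes"
        )
```

With a 64K page size this gives an 8 Mbyte ring and 120 Mbytes wasted for
8 channels, matching the figures in the commit message. With a 4K page
size nothing is wasted, which fits the explanation that the problem only
shows up on ARM64 with larger pages.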