Re: [PATCH net v2 4/4] gve: Make ethtool config changes synchronous
From: Pin-yen Lin
Date: Mon Apr 27 2026 - 16:41:04 EST
On Fri, Apr 24, 2026 at 5:25 PM Harshitha Ramamurthy
<hramamurthy@xxxxxxxxxx> wrote:
>
> From: Pin-yen Lin <treapking@xxxxxxxxxx>
>
> When modifying device features via ethtool, the driver queues the
> carrier status update to its workqueue (gve_wq). This leads to a
> short link-down state after running the ethtool command.
>
> Use `gve_turnup_and_check_status()` instead of `gve_turnup()` in
> `gve_queues_start()` to update the carrier status before returning to
> the userspace.
>
> This was discovered by drivers/net/ping.py selftest. The test calls
> ping command right after an ethtool configuration, but the interface
> could be down without this fix.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 5f08cd3d6423 ("gve: Alloc before freeing when adjusting queues")
> Signed-off-by: Pin-yen Lin <treapking@xxxxxxxxxx>
> Reviewed-by: Joshua Washington <joshwash@xxxxxxxxxx>
> Signed-off-by: Harshitha Ramamurthy <hramamurthy@xxxxxxxxxx>
> ---
> drivers/net/ethernet/google/gve/gve_main.c | 56 +++++++++++-----------
> 1 file changed, 28 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
> index 1fec8e1e4821..2c461be12a75 100644
> --- a/drivers/net/ethernet/google/gve/gve_main.c
> +++ b/drivers/net/ethernet/google/gve/gve_main.c
> @@ -1424,6 +1424,33 @@ static void gve_queues_mem_remove(struct gve_priv *priv)
> priv->rx = NULL;
> }
Sashiko says:
This isn't a bug introduced by this patch, but is there a pre-existing
use-after-free risk here involving stats_report_task?
Looking at gve_stats_report_task(), it iterates over priv->tx and priv->rx
to collect statistics without holding the rtnl_lock. During teardown,
gve_queues_stop() deletes the timer via timer_delete_sync(), but it does
not appear to explicitly cancel the work item with
cancel_work_sync(&priv->stats_report_task).
If the timer queues the work item just before timer_delete_sync() executes,
could the task remain pending on the workqueue?
When gve_close() then calls gve_queues_mem_remove(), the memory for
priv->tx and priv->rx is freed and the pointers are set to NULL. If the
pending stats_report_task runs concurrently and evaluates if (priv->tx)
right before the memory is freed, could it dereference stale pointers and
cause a use-after-free?
Additionally, during module removal, gve_teardown_priv_resources() frees
priv->stats_report and sets it to NULL. Since destroy_workqueue() is
called immediately after, could this flush and execute the pending task,
leading to a NULL pointer dereference?
Sashiko pointed out the same issue at patch 1/4. We will address this in v3.
>
> +static void gve_handle_link_status(struct gve_priv *priv, bool link_status)
> +{
> + if (!gve_get_napi_enabled(priv))
> + return;
> +
> + if (link_status == netif_carrier_ok(priv->dev))
> + return;
> +
> + if (link_status) {
> + netdev_info(priv->dev, "Device link is up.\n");
> + netif_carrier_on(priv->dev);
> + } else {
> + netdev_info(priv->dev, "Device link is down.\n");
> + netif_carrier_off(priv->dev);
> + }
> +}
> +
> +static void gve_turnup_and_check_status(struct gve_priv *priv)
> +{
> + u32 status;
> +
> + gve_turnup(priv);
> + status = ioread32be(&priv->reg_bar0->device_status);
> + gve_handle_link_status(priv,
> + GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
> +}
> +
> /* The passed-in queue memory is stored into priv and the queues are made live.
> * No memory is allocated. Passed-in memory is freed on errors.
> */
> @@ -1486,8 +1513,7 @@ static int gve_queues_start(struct gve_priv *priv,
> round_jiffies(jiffies +
> msecs_to_jiffies(priv->stats_report_timer_period)));
>
> - gve_turnup(priv);
> - queue_work(priv->gve_wq, &priv->service_task);
> + gve_turnup_and_check_status(priv);
> priv->interface_up_cnt++;
> return 0;
>
> @@ -1608,23 +1634,6 @@ static int gve_close(struct net_device *dev)
> return 0;
> }
>
> -static void gve_handle_link_status(struct gve_priv *priv, bool link_status)
> -{
> - if (!gve_get_napi_enabled(priv))
> - return;
> -
> - if (link_status == netif_carrier_ok(priv->dev))
> - return;
> -
> - if (link_status) {
> - netdev_info(priv->dev, "Device link is up.\n");
> - netif_carrier_on(priv->dev);
> - } else {
> - netdev_info(priv->dev, "Device link is down.\n");
> - netif_carrier_off(priv->dev);
> - }
> -}
> -
> static int gve_configure_rings_xdp(struct gve_priv *priv,
> u16 num_xdp_rings)
> {
> @@ -2099,15 +2108,6 @@ static void gve_turnup(struct gve_priv *priv)
> gve_set_napi_enabled(priv);
> }
>
> -static void gve_turnup_and_check_status(struct gve_priv *priv)
> -{
> - u32 status;
> -
> - gve_turnup(priv);
> - status = ioread32be(&priv->reg_bar0->device_status);
> - gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
> -}
> -
> static struct gve_notify_block *gve_get_tx_notify_block(struct gve_priv *priv,
> unsigned int txqueue)
> {
> --
> 2.54.0.545.g6539524ca2-goog
>
Regards,
Pin-yen