Re: [PATCH] xprt.c use after free of work_structs

From: Trond Myklebust
Date: Mon May 02 2005 - 15:26:11 EST


su den 01.05.2005 Klokka 00:02 (-0600) skreiv Zwane Mwaikambo:
> This bug was first observed in 2.6.11-rc1-mm2 but i couldn't find the
> exact patch which would unmask it. The work_structs embedded in rpc_xprt
> are freed in xprt_destroy without waiting for all scheduled work to be
> completed, resulting in quite a kerfuffle. Since xprt->timer callback can
> schedule new work, flush the workqueue after killing the timer.

Hi Zwane,

Thanks, I fully agree that this is needed.

Chuck proposed a similar patch to me a couple of days ago, however he
also pointed out that we need to call cancel_delayed_work() on
xprt->sock_connect in the same code section in order to avoid trouble
with the TCP reconnect code causing the same type of race. I've attached
his mail.

Cheers,
Trond
--
Trond Myklebust <trond.myklebust@xxxxxxxxxx>
--- Begin Message --- Make the socket transport kick the event queue to start socket connects
immediately. This should improve responsiveness of applications that are
sensitive to slow mount operations (like automounters).

We are now also careful to cancel the connect worker before destroying
the xprt. This eliminates a race where xprt_destroy can finish before
the connect worker is even allowed to run.

Test-plan:
Destructive testing (unplugging the network temporarily). Connectathon
with UDP and TCP. Hard-code impossibly small connect timeout.

Version: Fri, 29 Apr 2005 15:32:01 -0400

Signed-off-by: Chuck Lever <cel@xxxxxxxxxx>
---

net/sunrpc/xprt.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletion(-)


diff -X /home/cel/src/linux/dont-diff -Naurp 10-rpc-reconnect/net/sunrpc/xprt.c 11-xprt-flush-connects/net/sunrpc/xprt.c
--- 10-rpc-reconnect/net/sunrpc/xprt.c 2005-04-29 15:18:47.677108000 -0400
+++ 11-xprt-flush-connects/net/sunrpc/xprt.c 2005-04-29 15:29:36.637250000 -0400
@@ -569,8 +569,11 @@ void xprt_connect(struct rpc_task *task)
if (xprt->sock != NULL)
schedule_delayed_work(&xprt->sock_connect,
RPC_REESTABLISH_TIMEOUT);
- else
+ else {
schedule_work(&xprt->sock_connect);
+ if (!RPC_IS_ASYNC(task))
+ flush_scheduled_work();
+ }
}
return;
out_write:
@@ -1666,6 +1669,10 @@ xprt_shutdown(struct rpc_xprt *xprt)
rpc_wake_up(&xprt->backlog);
wake_up(&xprt->cong_wait);
del_timer_sync(&xprt->timer);
+
+ /* synchronously wait for connect worker to finish */
+ cancel_delayed_work(&xprt->sock_connect);
+ flush_scheduled_work();
}

/*

--- End Message ---