RE: [RFC] sunrpc: Fix race between work-queue and rpc_killall_tasks.
From: Myklebust, Trond
Date: Fri Jul 08 2011 - 14:12:01 EST
> -----Original Message-----
> From: Ben Greear [mailto:greearb@xxxxxxxxxxxxxxx]
> Sent: Friday, July 08, 2011 1:19 PM
> To: Myklebust, Trond
> Cc: linux-nfs@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [RFC] sunrpc: Fix race between work-queue and
> rpc_killall_tasks.
>
> On 07/06/2011 04:45 PM, Trond Myklebust wrote:
> > On Wed, 2011-07-06 at 15:49 -0700, greearb@xxxxxxxxxxxxxxx wrote:
> >> From: Ben Greear<greearb@xxxxxxxxxxxxxxx>
> >>
> >> The rpc_killall_tasks logic is not locked against
> >> the work-queue thread, but it still directly modifies
> >> function pointers and data in the task objects.
> >>
> >> This patch changes the killall-tasks logic to set a flag
> >> that tells the work-queue thread to terminate the task
> >> instead of directly calling the terminate logic.
> >>
> >> Signed-off-by: Ben Greear<greearb@xxxxxxxxxxxxxxx>
> >> ---
> >>
> >> NOTE: This needs review, as I am still struggling to understand
> >> the rpc code, and it's quite possible this patch either doesn't
> >> fully fix the problem or actually causes other issues. That said,
> >> my nfs stress test seems to run a bit more stable with this patch
> applied.
> >
> > Yes, but I don't see why you are adding a new flag, nor do I see why
> we
> > want to keep checking for that flag in the rpc_execute() loop.
> > rpc_killall_tasks() is not a frequent operation that we want to
> optimise
> > for.
> >
> > How about the following instead?
>
> Ok, I looked at your patch closer. I think it can still cause
> bad race conditions.
>
> For instance:
>
> Assume that tk_callback is NULL at beginning of while loop in
> __rpc_execute,
> and tk_action is rpc_exit_task.
>
> While do_action(task) is being called, tk_action is set to NULL in
> rpc_exit_task.
>
> But, right after tk_action is set to NULL in rpc_exit_task, the
> rpc_killall_tasks
> method calls rpc_exit, which sets tk_action back to rpc_exit_task.
>
> I believe this could cause the xprt_release(task) logic to be called in
> the
> work-queue's execution of rpc_exit_task due to tk_action != NULL when
> it should not be.
Why would this be a problem? xprt_release() can certainly be called multiple times on an rpc_task. Ditto rpbc_getport_done.
The only thing that is not re-entrant there is rpcb_map_release, which should only ever be called once whether or not something calls rpc_killall_tasks.
> I have no hard evidence this exact scenario is happening in my case,
> but I
> believe the code is still racy with your patch.
>
> For that matter, is it safe to modify the flags in rpc_killall_tasks:
>
> rovr->tk_flags |= RPC_TASK_KILLED;
>
> Is that guaranteed to be atomic with any other modification of flags?
Task->tk_flags should never change after the rpc_task is set up. The only allowed change is the RPC_TASK_KILLED. We could convert that into an atomic bit in task->tk_runstate, but again, this isn't something that is likely to be responsible for the problem you are seeing.
¢éì®&Þ~º&¶¬+-±éÝ¥w®Ë±Êâmébìdz¹Þ)í
æèw*jg¬±¨¶Ýj/êäz¹Þà2Þ¨èÚ&¢)ß«a¶Úþø®G«éh®æj:+v¨wèÙ>W±êÞiÛaxPjØm¶ÿÃ-»+ùd_