Re: [PATCH 3/5] target: Fix early transport_generic_handle_tmr abort scenario

From: Christoph Hellwig
Date: Tue Feb 07 2017 - 17:46:13 EST


On Tue, Feb 07, 2017 at 01:17:48PM +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
>
> This patch fixes a bug where incoming task management requests
> can be explicitly aborted during an active LUN_RESET, but who's
> struct work_struct are canceled in-flight before execution.
>
> This occurs when core_tmr_drain_tmr_list() invokes cancel_work_sync()
> for the incoming se_tmr_req->task_cmd->work, resulting in cmd->work
> for target_tmr_work() never getting invoked and the aborted TMR
> waiting indefinately within transport_wait_for_tasks().
>
> To address this case, perform a CMD_T_ABORTED check early in
> transport_generic_handle_tmr(), and invoke the normal path via
> transport_cmd_check_stop_to_fabric() to complete any TMR kthreads
> blocked waiting for CMD_T_STOP in transport_wait_for_tasks().
>
> Also, move the TRANSPORT_ISTATE_PROCESSING assignment earlier
> into transport_generic_handle_tmr() so the existing check in
> core_tmr_drain_tmr_list() avoids attempting abort the incoming
> se_tmr_req->task_cmd->work if it has already been queued into
> se_device->tmr_wq.
>
> Reported-by: Rob Millner <rlm@xxxxxxxxxxxxx>
> Tested-by: Rob Millner <rlm@xxxxxxxxxxxxx>
> Cc: Rob Millner <rlm@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # 3.14+
> Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> ---
> drivers/target/target_core_transport.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 1cadc9e..8b69843 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -3110,7 +3110,6 @@ static void target_tmr_work(struct work_struct *work)
> spin_unlock_irqrestore(&cmd->t_state_lock, flags);
> goto check_stop;
> }
> - cmd->t_state = TRANSPORT_ISTATE_PROCESSING;
> spin_unlock_irqrestore(&cmd->t_state_lock, flags);
>
> cmd->se_tfo->queue_tm_rsp(cmd);
> @@ -3123,11 +3122,25 @@ int transport_generic_handle_tmr(
> struct se_cmd *cmd)
> {
> unsigned long flags;
> + bool aborted = false;
>
> spin_lock_irqsave(&cmd->t_state_lock, flags);
> - cmd->transport_state |= CMD_T_ACTIVE;
> + if (cmd->transport_state & CMD_T_ABORTED) {
> + aborted = true;
> + } else {
> + cmd->t_state = TRANSPORT_ISTATE_PROCESSING;
> + cmd->transport_state |= CMD_T_ACTIVE;
> + }
> spin_unlock_irqrestore(&cmd->t_state_lock, flags);
>
> + if (aborted) {
> + pr_warn_ratelimited("handle_tmr caught CMD_T_ABORTED TMR %d"
> + "ref_tag: %llu tag: %llu\n", cmd->se_tmr_req->function,
> + cmd->se_tmr_req->ref_task_tag, cmd->tag);

This seems pretty noisy for something that could happen during normal
operation. Otherwise:

Reviewed-by: Christoph Hellwig <hch@xxxxxx>