Re: [PATCH-next] scsi: fix use-after-free problem in scsi_remove_target

From: zhongjinghua
Date: Thu Mar 02 2023 - 03:56:51 EST



在 2023/3/2 3:51, James Bottomley 写道:
On Wed, 2023-03-01 at 11:40 +0800, zhongjinghua wrote:
ping...

Hello,

Anyone looking this?

在 2023/3/1 11:37, zhongjinghua 写道:
ping...

Hello,

Anyone looking this?

在 2023/2/13 11:43, Zhong Jinghua 写道:
From: Zhong Jinghua <zhongjinghua@xxxxxxxxxx>

A use-after-free problem like below:

BUG: KASAN: use-after-free in scsi_target_reap+0x6c/0x70

Workqueue: scsi_wq_1 __iscsi_unbind_session
[scsi_transport_iscsi]
Call trace:
  dump_backtrace+0x0/0x320
  show_stack+0x24/0x30
  dump_stack+0xdc/0x128
  print_address_description+0x68/0x278
  kasan_report+0x1e4/0x308
  __asan_report_load4_noabort+0x30/0x40
  scsi_target_reap+0x6c/0x70
  scsi_remove_target+0x430/0x640
  __iscsi_unbind_session+0x164/0x268 [scsi_transport_iscsi]
  process_one_work+0x67c/0x1350
  worker_thread+0x370/0xf90
  kthread+0x2a4/0x320
  ret_from_fork+0x10/0x18

The problem is caused by a concurrency scenario:

T0: delete target
// echo 1 >
/sys/devices/platform/host1/session1/target1:0:0/1:0:0:1/delete
T1: logout
// iscsiadm -m node --logout

T0                            T1
  sdev_store_delete
   scsi_remove_device
    device_remove_file
     __scsi_remove_device
                             __iscsi_unbind_session
                              scsi_remove_target
                          spin_lock_irqsave
                               list_for_each_entry
      scsi_target_reap // starget->reaf 1 -> 0
kref_get(&starget->reap_ref);
                          // warn use-after-free.
                          spin_unlock_irqrestore
       scsi_target_reap_ref_release
    scsi_target_destroy
    ... // delete starget
                          scsi_target_reap
                          // UAF

When T0 reduces the reference count to 0, but has not been
released,
T1 can still enter list_for_each_entry, and then kref_get reports
UAF.

Fix it by using kref_get_unless_zero() to check for a reference
count of
0.

Signed-off-by: Zhong Jinghua <zhongjinghua@xxxxxxxxxx>
---
  drivers/scsi/scsi_sysfs.c | 12 +++++++++++-
  1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_sysfs.c
b/drivers/scsi/scsi_sysfs.c
index e7893835b99a..0ad357ff4c59 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1561,7 +1561,17 @@ void scsi_remove_target(struct device
*dev)
              starget->state == STARGET_CREATED_REMOVE)
              continue;
          if (starget->dev.parent == dev || &starget->dev == dev)
{
-            kref_get(&starget->reap_ref);
+
+            /*
+             * If starget->reap_ref is reduced to 0, it means
+             * that other processes are releasing it and
+             * there is no need to delete it again
+             */
+            if (!kref_get_unless_zero(&starget->reap_ref)) {
+                spin_unlock_irqrestore(shost->host_lock, flags);
+                goto restart;
This doesn't seem to be a good idea: you're asking for a live lock
where the thread that's already reduced the refcount to 0 and will
eventually remove the target from the list doesn't progress before you
take the lock again in the restart and then you find the same result
and go round again (and again ...).

I agree with this, no need to use goto restart.

Since there should only be one match in the target list and you found
it and know it's going away, what about break instead of unlock and
goto restart?

Wouldn't it be better to use continue? Assuming that the device parameter is a session, maybe not only one is matched.


James

Thanks.

Jinghua.