Re: [PATCH-next] scsi: fix use-after-free problem in scsi_remove_target

From: James Bottomley
Date: Wed Mar 01 2023 - 14:52:25 EST


On Wed, 2023-03-01 at 11:40 +0800, zhongjinghua wrote:
> ping...
>
> Hello,
>
> Anyone looking this?
>
> 在 2023/3/1 11:37, zhongjinghua 写道:
> > ping...
> >
> > Hello,
> >
> > Anyone looking this?
> >
> > 在 2023/2/13 11:43, Zhong Jinghua 写道:
> > > From: Zhong Jinghua <zhongjinghua@xxxxxxxxxx>
> > >
> > > A use-after-free problem like below:
> > >
> > > BUG: KASAN: use-after-free in scsi_target_reap+0x6c/0x70
> > >
> > > Workqueue: scsi_wq_1 __iscsi_unbind_session
> > > [scsi_transport_iscsi]
> > > Call trace:
> > >   dump_backtrace+0x0/0x320
> > >   show_stack+0x24/0x30
> > >   dump_stack+0xdc/0x128
> > >   print_address_description+0x68/0x278
> > >   kasan_report+0x1e4/0x308
> > >   __asan_report_load4_noabort+0x30/0x40
> > >   scsi_target_reap+0x6c/0x70
> > >   scsi_remove_target+0x430/0x640
> > >   __iscsi_unbind_session+0x164/0x268 [scsi_transport_iscsi]
> > >   process_one_work+0x67c/0x1350
> > >   worker_thread+0x370/0xf90
> > >   kthread+0x2a4/0x320
> > >   ret_from_fork+0x10/0x18
> > >
> > > The problem is caused by a concurrency scenario:
> > >
> > > T0: delete target
> > > // echo 1 >
> > > /sys/devices/platform/host1/session1/target1:0:0/1:0:0:1/delete
> > > T1: logout
> > > // iscsiadm -m node --logout
> > >
> > > T0                            T1
> > >   sdev_store_delete
> > >    scsi_remove_device
> > >     device_remove_file
> > >      __scsi_remove_device
> > >                              __iscsi_unbind_session
> > >                               scsi_remove_target
> > >                           spin_lock_irqsave
> > >                                list_for_each_entry
> > >       scsi_target_reap // starget->reaf 1 -> 0
> > > kref_get(&starget->reap_ref);
> > >                           // warn use-after-free.
> > >                           spin_unlock_irqrestore
> > >        scsi_target_reap_ref_release
> > >     scsi_target_destroy
> > >     ... // delete starget
> > >                           scsi_target_reap
> > >                           // UAF
> > >
> > > When T0 reduces the reference count to 0, but has not been
> > > released,
> > > T1 can still enter list_for_each_entry, and then kref_get reports
> > > UAF.
> > >
> > > Fix it by using kref_get_unless_zero() to check for a reference
> > > count of
> > > 0.
> > >
> > > Signed-off-by: Zhong Jinghua <zhongjinghua@xxxxxxxxxx>
> > > ---
> > >   drivers/scsi/scsi_sysfs.c | 12 +++++++++++-
> > >   1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/scsi/scsi_sysfs.c
> > > b/drivers/scsi/scsi_sysfs.c
> > > index e7893835b99a..0ad357ff4c59 100644
> > > --- a/drivers/scsi/scsi_sysfs.c
> > > +++ b/drivers/scsi/scsi_sysfs.c
> > > @@ -1561,7 +1561,17 @@ void scsi_remove_target(struct device
> > > *dev)
> > >               starget->state == STARGET_CREATED_REMOVE)
> > >               continue;
> > >           if (starget->dev.parent == dev || &starget->dev == dev)
> > > {
> > > -            kref_get(&starget->reap_ref);
> > > +
> > > +            /*
> > > +             * If starget->reap_ref is reduced to 0, it means
> > > +             * that other processes are releasing it and
> > > +             * there is no need to delete it again
> > > +             */
> > > +            if (!kref_get_unless_zero(&starget->reap_ref)) {
> > > +                spin_unlock_irqrestore(shost->host_lock, flags);
> > > +                goto restart;

This doesn't seem to be a good idea: you're asking for a live lock
where the thread that's already reduced the refcount to 0 and will
eventually remove the target from the list doesn't progress before you
take the lock again in the restart and then you find the same result
and go round again (and again ...).

Since there should only be one match in the target list and you found
it and know it's going away, what about break instead of unlock and
goto restart?

James