Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response

From: Can Guo
Date: Wed Mar 17 2021 - 09:52:12 EST


On 2021-03-17 20:22, Avri Altman wrote:

On 2021-03-17 19:23, Avri Altman wrote:
>>
>> On 2021-03-02 21:24, Avri Altman wrote:
>> > The spec does not define what is the host's recommended response when
>> > the device send hpb dev reset response (oper 0x2).
>> >
>> > We will update all active hpb regions: mark them and do that on the
>> > next
>> > read.
>> >
>> > Signed-off-by: Avri Altman <avri.altman@xxxxxxx>
>> > ---
>> > drivers/scsi/ufs/ufshpb.c | 47
>> ++++++++++++++++++++++++++++++++++++---
>> > drivers/scsi/ufs/ufshpb.h | 2 ++
>> > 2 files changed, 46 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
>> > index 0744feb4d484..0034fa03fdc6 100644
>> > --- a/drivers/scsi/ufs/ufshpb.c
>> > +++ b/drivers/scsi/ufs/ufshpb.c
>> > @@ -642,7 +642,8 @@ int ufshpb_prep(struct ufs_hba *hba, struct
>> > ufshcd_lrb *lrbp)
>> > if (rgn->reads == ACTIVATION_THRESHOLD)
>> > activate = true;
>> > spin_unlock_irqrestore(&rgn->rgn_lock, flags);
>> > - if (activate) {
>> > + if (activate ||
>> > + test_and_clear_bit(RGN_FLAG_UPDATE, &rgn->rgn_flags)) {

Other than this place, do we also need to clear this bit in places like
ufshpb_map_req_compl_fn() and/or ufshpb_cleanup_lru_info()? Otherwise,
this flag may be left there even after the rgn is inactivated.

>> > spin_lock_irqsave(&hpb->rsp_list_lock, flags);
>> > ufshpb_update_active_info(hpb, rgn_idx, srgn_idx);
>> > hpb->stats.rb_active_cnt++;
>> > @@ -1480,6 +1481,20 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba,
>> > struct ufshcd_lrb *lrbp)
>> > case HPB_RSP_DEV_RESET:
>> > dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
>> > "UFS device lost HPB information during PM.\n");
>> > +
>> > + if (hpb->is_hcm) {
>> > + struct scsi_device *sdev;
>> > +
>> > + __shost_for_each_device(sdev, hba->host) {
>> > + struct ufshpb_lu *h = sdev->hostdata;
>> > +
>> > + if (!h)
>> > + continue;
>> > +
>> > + schedule_work(&hpb->ufshpb_lun_reset_work);
>> > + }
>> > + }
>> > +
>> > break;
>> > default:
>> > dev_notice(&hpb->sdev_ufs_lu->sdev_dev,
>> > @@ -1594,6 +1609,25 @@ static void
>> > ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb)
>> > spin_unlock_irqrestore(&hpb->rsp_list_lock, flags);
>> > }
>> >
>> > +static void ufshpb_reset_work_handler(struct work_struct *work)
>>
>> Just curious, directly doing below things inside ufshpb_rsp_upiu()
>> does
>> not
>> seem a problem to me, does this really deserve a separate work?
> I don't know, I never even consider of doing this.
> The active region list may contain up to few thousands of regions -
> It is not rare to see configurations that covers the entire device.
>

Yes, true, it can be a huge list. But what does the ops
"HPB_RSP_DEV_RESET"
really mean? The specs says "Device reset HPB Regions information", but
I
don't know what is really happening. Could you please elaborate?
It means that the device informs the host that the L2P cache is no longer valid.
The spec doesn't say what to do in that case.

Then it means that all the clean (without DIRTY flag set) HPB entries (ppns)
in active rgns in host memory side may not be valid to the device anymore.
Please correct me if I am wrong.

We thought that in host mode, it make sense to update all the active regions.

But current logic does not set the state of the sub-regions (in active regions) to
INVALID, it only marks all active regions as UPDATE.

Although one of subsequent read cmds shall put the sub-region back to activate_list,
ufshpb_test_ppn_dirty() can still return false, thus these read cmds still think the
ppns are valid and they shall move forward to send HPB Write Buffer (buffer id = 0x2,
in case of HPB2.0) and HPB Read cmds.

HPB Read cmds with invalid ppns will be treated as normal Read cmds by device as the
specs says, but what would happen to HPB Write Buffer cmds (buffer id = 0x2, in case
of HPB2.0) with invalid ppns? Can this be a real problem?


I think I will go with your suggestion.
Effectively, in host mode, since it is deactivating "cold" regions,
the lru list is kept relatively small, and contains only "hot" regions.

hmm... I don't really have a idea on this, please go with whatever you and Daejun think is fine here.

Thanks,
Can Guo.


Thanks,
Avri


Thanks,
Can Guo.

> But yes, I can do that.
> Better to get ack from Daejun first.
>
> Thanks,
> Avri
>
>>
>> Thanks,
>> Can Guo.
>>
>> > +{
>> > + struct ufshpb_lu *hpb;
>> > + struct victim_select_info *lru_info;
>> > + struct ufshpb_region *rgn;
>> > + unsigned long flags;
>> > +
>> > + hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work);
>> > +
>> > + lru_info = &hpb->lru_info;
>> > +
>> > + spin_lock_irqsave(&hpb->rgn_state_lock, flags);
>> > +
>> > + list_for_each_entry(rgn, &lru_info->lh_lru_rgn, list_lru_rgn)
>> > + set_bit(RGN_FLAG_UPDATE, &rgn->rgn_flags);
>> > +
>> > + spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
>> > +}
>> > +
>> > static void ufshpb_normalization_work_handler(struct work_struct
>> > *work)
>> > {
>> > struct ufshpb_lu *hpb;
>> > @@ -1798,6 +1832,8 @@ static int ufshpb_alloc_region_tbl(struct
>> > ufs_hba *hba, struct ufshpb_lu *hpb)
>> > } else {
>> > rgn->rgn_state = HPB_RGN_INACTIVE;
>> > }
>> > +
>> > + rgn->rgn_flags = 0;
>> > }
>> >
>> > return 0;
>> > @@ -2012,9 +2048,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba
>> > *hba, struct ufshpb_lu *hpb)
>> > INIT_LIST_HEAD(&hpb->list_hpb_lu);
>> >
>> > INIT_WORK(&hpb->map_work, ufshpb_map_work_handler);
>> > - if (hpb->is_hcm)
>> > + if (hpb->is_hcm) {
>> > INIT_WORK(&hpb->ufshpb_normalization_work,
>> > ufshpb_normalization_work_handler);
>> > + INIT_WORK(&hpb->ufshpb_lun_reset_work,
>> > + ufshpb_reset_work_handler);
>> > + }
>> >
>> > hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache",
>> > sizeof(struct ufshpb_req), 0, 0, NULL);
>> > @@ -2114,8 +2153,10 @@ static void ufshpb_discard_rsp_lists(struct
>> > ufshpb_lu *hpb)
>> >
>> > static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb)
>> > {
>> > - if (hpb->is_hcm)
>> > + if (hpb->is_hcm) {
>> > + cancel_work_sync(&hpb->ufshpb_lun_reset_work);
>> > cancel_work_sync(&hpb->ufshpb_normalization_work);
>> > + }
>> > cancel_work_sync(&hpb->map_work);
>> > }
>> >
>> > diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
>> > index 84598a317897..37c1b0ea0c0a 100644
>> > --- a/drivers/scsi/ufs/ufshpb.h
>> > +++ b/drivers/scsi/ufs/ufshpb.h
>> > @@ -121,6 +121,7 @@ struct ufshpb_region {
>> > struct list_head list_lru_rgn;
>> > unsigned long rgn_flags;
>> > #define RGN_FLAG_DIRTY 0
>> > +#define RGN_FLAG_UPDATE 1
>> >
>> > /* region reads - for host mode */
>> > spinlock_t rgn_lock;
>> > @@ -217,6 +218,7 @@ struct ufshpb_lu {
>> > /* for selecting victim */
>> > struct victim_select_info lru_info;
>> > struct work_struct ufshpb_normalization_work;
>> > + struct work_struct ufshpb_lun_reset_work;
>> >
>> > /* pinned region information */
>> > u32 lu_pinned_start;