RE: [PATCH 1/2] iommu/vt-d: Support batching IOTLB/dev-IOTLB invalidation commands

From: Zhang, Tina
Date: Tue Jun 04 2024 - 06:16:14 EST




> -----Original Message-----
> From: Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx>
> Sent: Tuesday, June 4, 2024 3:01 PM
> To: Zhang, Tina <tina.zhang@xxxxxxxxx>; Tian, Kevin <kevin.tian@xxxxxxxxx>
> Cc: baolu.lu@xxxxxxxxxxxxxxx; iommu@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 1/2] iommu/vt-d: Support batching IOTLB/dev-IOTLB
> invalidation commands
>
> On 6/4/24 1:59 PM, Zhang, Tina wrote:
> >> -----Original Message-----
> >> From: Baolu Lu<baolu.lu@xxxxxxxxxxxxxxx>
> >> Sent: Tuesday, June 4, 2024 9:15 AM
> >> To: Zhang, Tina<tina.zhang@xxxxxxxxx>; Tian,
> >> Kevin<kevin.tian@xxxxxxxxx>
> >> Cc:baolu.lu@xxxxxxxxxxxxxxx;iommu@xxxxxxxxxxxxxxx; linux-
> >> kernel@xxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH 1/2] iommu/vt-d: Support batching IOTLB/dev-IOTLB
> >> invalidation commands
> >>
> >> On 6/3/24 3:37 PM, Zhang, Tina wrote:
> >>>> -----Original Message-----
> >>>> From: Baolu Lu<baolu.lu@xxxxxxxxxxxxxxx>
> >>>> Sent: Sunday, May 19, 2024 5:43 PM
> >>>> To: Zhang, Tina<tina.zhang@xxxxxxxxx>; Tian,
> >>>> Kevin<kevin.tian@xxxxxxxxx>
> >>>> Cc:baolu.lu@xxxxxxxxxxxxxxx;iommu@xxxxxxxxxxxxxxx; linux-
> >>>> kernel@xxxxxxxxxxxxxxx
> >>>> Subject: Re: [PATCH 1/2] iommu/vt-d: Support batching
> >>>> IOTLB/dev-IOTLB invalidation commands
> >>>>
> >>>> On 5/17/24 8:37 AM, Tina Zhang wrote:
> >>>>> Introduce a new parameter batch_desc to the QI based IOTLB/dev-IOTLB
> >>>>> invalidation operations to support batching invalidation descriptors.
> >>>>> This batch_desc is a pointer to the descriptor entry in a batch
> >>>>> cmds buffer. If the batch_desc is NULL, it indicates that batch
> >>>>> submission is not being used, and descriptors will be submitted
> >>>>> individually.
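
For illustration, a minimal sketch of what one of the extended helpers could
look like, going by the description above; the exact signature and the elided
body are assumptions, not the patch itself:

	/*
	 * Sketch only: batch_desc points into a caller-provided batch
	 * command buffer; NULL keeps the existing submit-immediately path.
	 */
	void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid,
			     u64 addr, unsigned long npages, bool ih,
			     struct qi_desc *batch_desc)
	{
		struct qi_desc desc = {};

		/* ... populate desc for a PASID-based IOTLB invalidation ... */

		if (!batch_desc) {
			/* No batching: submit this descriptor on its own. */
			qi_submit_sync(iommu, &desc, 1, 0);
			return;
		}

		/* Batching: store the descriptor in the batch buffer entry. */
		*batch_desc = desc;
	}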
> >>>>>
> >>>>> Also fix an issue reported by checkpatch about "unsigned mask":
> >>>>> "Prefer 'unsigned int' to bare use of 'unsigned'"
> >>>>>
> >>>>> Signed-off-by: Tina Zhang<tina.zhang@xxxxxxxxx>
> >>>>> ---
> >>>>> drivers/iommu/intel/cache.c | 33 +++++++++++-------
> >>>>> drivers/iommu/intel/dmar.c | 67 ++++++++++++++++++++-----------------
> >>>>> drivers/iommu/intel/iommu.c | 27 +++++++++------
> >>>>> drivers/iommu/intel/iommu.h | 21 ++++++++----
> >>>>> drivers/iommu/intel/pasid.c | 20 ++++++-----
> >>>>> 5 files changed, 100 insertions(+), 68 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
> >>>>> index e8418cdd8331..dcf5e0e6af17 100644
> >>>>> --- a/drivers/iommu/intel/cache.c
> >>>>> +++ b/drivers/iommu/intel/cache.c
> >>>>> @@ -278,7 +278,7 @@ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
> >>>>>  		case CACHE_TAG_NESTING_IOTLB:
> >>>>>  			if (domain->use_first_level) {
> >>>>>  				qi_flush_piotlb(iommu, tag->domain_id,
> >>>>> -						tag->pasid, addr, pages, ih);
> >>>>> +						tag->pasid, addr, pages, ih, NULL);
> >>>>>  			} else {
> >>>> I'd like to have all the batched-descriptor code inside this file to
> >>>> make it easier to maintain. Perhaps we can add the infrastructure
> >>>> below to the dmar_domain structure, together with the cache tag.
> >>> Does it suggest we need to add a batch version of
> >>> qi_flush_iotlb/qi_flush_dev_iotlb/qi_flush_piotlb/qi_flush_dev_iotlb_pasid()
> >>> in the cache.c file? It doesn't sound easy to maintain those
> >>> functions, does it?
> >>
> >> Yes. I don't think it's that difficult, as the helpers just compose a
> >> qi descriptor and insert it into a local queue. This local queue will
> >> be flushed after iterating over all cache tags, when there's no room
> >> for more descriptors, or when switching to a different iommu. Have I
> >> missed anything?
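
For concreteness, a rough sketch of the kind of local queue being described;
the struct and helper names below are made up for illustration and are not
existing driver code:

	#define QI_MAX_BATCHED_DESC	16	/* arbitrary size for the sketch */

	struct qi_batch {
		struct qi_desc desc[QI_MAX_BATCHED_DESC];
		unsigned int index;
	};

	/* Submit whatever has been queued so far and reset the queue. */
	static void qi_batch_flush(struct intel_iommu *iommu, struct qi_batch *batch)
	{
		if (!batch->index)
			return;

		qi_submit_sync(iommu, batch->desc, batch->index, 0);
		batch->index = 0;
	}

	/* Queue one descriptor, flushing first if the queue is full. */
	static void qi_batch_add(struct intel_iommu *iommu, struct qi_batch *batch,
				 struct qi_desc *desc)
	{
		if (batch->index == QI_MAX_BATCHED_DESC)
			qi_batch_flush(iommu, batch);

		batch->desc[batch->index++] = *desc;
	}

The caller would then invoke qi_batch_flush() after walking all cache tags, or
before moving on to a different iommu, as described above.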
> > In the current VT-d driver, only the qi_flush_xxx() functions know how
> > to build IOTLB invalidation descriptors. In those functions, VT-d
> > invalidation descriptors are populated and submitted to hardware
> > immediately.
> >
> > To support batching commands, I think we have two choices:
> > 1. Extend the qi_flush_xxx() functions to support batching descriptors
> > (as the implementation in this version does). This way, the knowledge
> > of populating an IOTLB invalidation descriptor in qi_flush_xxx() is
> > reused; the additional code change is only for batching the descriptor
> > into a buffer.
> >
> > 2. Introduce a new set of interfaces that populate IOTLB descriptors
> > and batch them into a batch buffer. These interfaces would live in
> > cache.c, and we would need to copy the knowledge of how to populate
> > IOTLB descriptors from the qi_flush_xxx() interfaces into the new
> > ones. I hesitated to choose this option because it would duplicate
> > code. Maybe we can generalize the descriptor-populating logic into
> > lower-level interfaces and have both the current qi_flush_xxx()
> > functions and the new interfaces call them.
> >
> > Which option do you think is better?
>
> We can share the common code without changing the current helper interfaces.
> Something like this,
>
> static inline void qi_desc_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
> 				 unsigned int size_order, u64 type,
> 				 struct qi_desc *desc)
> {
> 	u8 dw = 0, dr = 0;
> 	int ih = 0;
> 
> 	if (cap_write_drain(iommu->cap))
> 		dw = 1;
> 
> 	if (cap_read_drain(iommu->cap))
> 		dr = 1;
> 
> 	desc->qw0 = QI_IOTLB_DID(did) | QI_IOTLB_DR(dr) | QI_IOTLB_DW(dw)
> 		| QI_IOTLB_GRAN(type) | QI_IOTLB_TYPE;
> 	desc->qw1 = QI_IOTLB_ADDR(addr) | QI_IOTLB_IH(ih)
> 		| QI_IOTLB_AM(size_order);
> 	desc->qw2 = 0;
> 	desc->qw3 = 0;
> }
> 
> void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
> 		    unsigned int size_order, u64 type)
> {
> 	struct qi_desc desc;
> 
> 	qi_desc_iotlb(iommu, did, addr, size_order, type, &desc);
> 	qi_submit_sync(iommu, &desc, 1, 0);
> }
>
> The qi_desc_iotlb() can be used in both batched and non-batched paths.
> Does this work for you?
It works. Thanks.
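
For completeness, a rough sketch of how a batched path in cache.c could
consume qi_desc_iotlb(); cache_tag_batch_iotlb() and the qi_batch_add() helper
from the earlier sketch are hypothetical names, not proposed code:

	/*
	 * Build an IOTLB invalidation descriptor and queue it instead of
	 * submitting it immediately.
	 */
	static void cache_tag_batch_iotlb(struct intel_iommu *iommu,
					  struct qi_batch *batch, u16 did,
					  u64 addr, unsigned int size_order,
					  u64 type)
	{
		struct qi_desc desc;

		qi_desc_iotlb(iommu, did, addr, size_order, type, &desc);
		qi_batch_add(iommu, batch, &desc);
	}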

Regards,
-Tina
>
> Best regards,
> baolu