Re: [PATCH] iommu/vt-d: reduce extra first level entry in iommu->domains

From: Wei Yang
Date: Thu May 26 2016 - 18:34:16 EST

Next message: Jeff Chua: "Re: can't boot with reiserfs on linux-4.6.0+"
Previous message: Al Stone: "Re: [PATCH v5 0/1] ARM64: ACPI: Update documentation for latest specification version"
In reply to: Robin Murphy: "Re: [PATCH] iommu/vt-d: reduce extra first level entry in iommu->domains"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, May 26, 2016 at 11:11:51AM +0100, Robin Murphy wrote:
>On 25/05/16 22:43, Wei Yang wrote:
>>On Wed, May 25, 2016 at 11:17:49AM +0100, Robin Murphy wrote:
>>>On 25/05/16 00:06, Wei Yang wrote:
>>>>Hi, Joerg
>>>>
>>>>Not sure whether you think this calculation is correct.
>>>>
>>>>If I missed something for this " + 1" in your formula, I am glad to hear your
>>>>explanation. So that I could learn something from you :-)
>>>
>>>I'm not familiar enough with this aspect of the driver to confirm whether the
>>>change is appropriate or not, but it does seem worth noting that using
>>>DIV_ROUND_UP would be an even neater approach.
>>>
>>
>>Hi, Robin,
>>
>>Thanks for your comment.
>>
>>Yes, I agree DIV_ROUND_UP would make the code more easy to read.
>>
>>I have thought about using DIV_ROUND_UP, while from the definition
>>DIV_ROUND_UP use operation "/", and ALIGN use bit operation. So the change in
>>my patch chooses the second one and tries to keep the efficiency.
>
>The efficiency of what, though?
>
>It's an unsigned division by a constant power of two, which GCC implements
>with a shift instruction regardless of optimisation - and at -O1 and above
>the machine code generated for either form of expression is completely
>identical (try it and see!).
>

Thanks.

Looks my knowledge of the compiler is an ancient one :-)

I haven't thought about to compare the generated code. This is really a good
test before making the decision. Next time I would try this before choosing
one.

>On the other hand, the small amount of time and cognitive effort it took to
>parse "ALIGN(x, 256) >> 8" as "divide by 256, rounding up" compared to simply
>seeing "DIV_ROUND_UP(x, 256)" and knowing instantly what's intended,
>certainly makes it less efficient to _maintain_; thus it's exactly the kind
>of thing to which Dijkstra's famous quotation applies.
>
>Does that count towards learning something? ;)
>

Really~

I am really happy to see your comments which help me to be more mature on the
solution. I owe you a favor :-) If you would come to Shanghai, I would like to
take you around~

>Robin.
>
>>>>Have a good day~
>>>>
>>>>On Sat, May 21, 2016 at 02:41:51AM +0000, Wei Yang wrote:
>>>>>In commit <8bf478163e69> ("iommu/vt-d: Split up iommu->domains array"), it
>>>>>it splits iommu->domains in two levels. Each first level contains 256
>>>>>entries of second level. In case of the ndomains is exact a multiple of
>>>>>256, it would have one more extra first level entry for current
>>>>>implementation.
>>>>>
>>>>>This patch refines this calculation to reduce the extra first level entry.
>>>>>
>>>>>Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
>>>>>---
>>>>>drivers/iommu/intel-iommu.c | 4 ++--
>>>>>1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>>diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>>>index e3061d3..2204ca4 100644
>>>>>--- a/drivers/iommu/intel-iommu.c
>>>>>+++ b/drivers/iommu/intel-iommu.c
>>>>>@@ -1634,7 +1634,7 @@ static int iommu_init_domains(struct intel_iommu *iommu)
>>>>> return -ENOMEM;
>>>>> }
>>>>>
>>>>>- size = ((ndomains >> 8) + 1) * sizeof(struct dmar_domain **);
>>>>>+ size = (ALIGN(ndomains, 256) >> 8) * sizeof(struct dmar_domain **);
>>>>> iommu->domains = kzalloc(size, GFP_KERNEL);
>>>>>
>>>>> if (iommu->domains) {
>>>>>@@ -1699,7 +1699,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
>>>>>static void free_dmar_iommu(struct intel_iommu *iommu)
>>>>>{
>>>>> if ((iommu->domains) && (iommu->domain_ids)) {
>>>>>- int elems = (cap_ndoms(iommu->cap) >> 8) + 1;
>>>>>+ int elems = ALIGN(cap_ndoms(iommu->cap), 256) >> 8;
>>>>> int i;
>>>>>
>>>>> for (i = 0; i < elems; i++)
>>>>>--
>>>>>1.7.9.5
>>>>
>>

--
Wei Yang
Help you, Help me

Next message: Jeff Chua: "Re: can't boot with reiserfs on linux-4.6.0+"
Previous message: Al Stone: "Re: [PATCH v5 0/1] ARM64: ACPI: Update documentation for latest specification version"
In reply to: Robin Murphy: "Re: [PATCH] iommu/vt-d: reduce extra first level entry in iommu->domains"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]