Re: [PATCH v2 1/1] powerpc/iommu: Enable remaining IOMMU Pagesizes present in LoPAR

From: Alexey Kardashevskiy
Date: Thu Apr 08 2021 - 03:13:52 EST




On 08/04/2021 15:37, Michael Ellerman wrote:
Leonardo Bras <leobras.c@xxxxxxxxx> writes:
According to LoPAR, ibm,query-pe-dma-window output named "IO Page Sizes"
will let the OS know all possible pagesizes that can be used for creating a
new DDW.

Currently Linux will only try using 3 of the 8 available options:
4K, 64K and 16M. According to LoPAR, Hypervisor may also offer 32M, 64M,
128M, 256M and 16G.

Do we know of any hardware & hypervisor combination that will actually
give us bigger pages?


On P8 16MB host pages and 16MB hardware iommu pages worked.

On P9, VM's 16MB IOMMU pages worked on top of 2MB host pages + 2MB hardware IOMMU pages.



Enabling bigger pages would be interesting for direct mapping systems
with a lot of RAM, while using less TCE entries.

Signed-off-by: Leonardo Bras <leobras.c@xxxxxxxxx>
---
arch/powerpc/platforms/pseries/iommu.c | 49 ++++++++++++++++++++++----
1 file changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 9fc5217f0c8e..6cda1c92597d 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -53,6 +53,20 @@ enum {
DDW_EXT_QUERY_OUT_SIZE = 2
};

A comment saying where the values come from would be good.

+#define QUERY_DDW_PGSIZE_4K 0x01
+#define QUERY_DDW_PGSIZE_64K 0x02
+#define QUERY_DDW_PGSIZE_16M 0x04
+#define QUERY_DDW_PGSIZE_32M 0x08
+#define QUERY_DDW_PGSIZE_64M 0x10
+#define QUERY_DDW_PGSIZE_128M 0x20
+#define QUERY_DDW_PGSIZE_256M 0x40
+#define QUERY_DDW_PGSIZE_16G 0x80

I'm not sure the #defines really gain us much vs just putting the
literal values in the array below?


Then someone says "uuuuu magic values" :) I do not mind either way. Thanks,



+struct iommu_ddw_pagesize {
+ u32 mask;
+ int shift;
+};
+
static struct iommu_table_group *iommu_pseries_alloc_group(int node)
{
struct iommu_table_group *table_group;
@@ -1099,6 +1113,31 @@ static void reset_dma_window(struct pci_dev *dev, struct device_node *par_dn)
ret);
}
+/* Returns page shift based on "IO Page Sizes" output at ibm,query-pe-dma-window. See LoPAR */
+static int iommu_get_page_shift(u32 query_page_size)
+{
+ const struct iommu_ddw_pagesize ddw_pagesize[] = {
+ { QUERY_DDW_PGSIZE_16G, __builtin_ctz(SZ_16G) },
+ { QUERY_DDW_PGSIZE_256M, __builtin_ctz(SZ_256M) },
+ { QUERY_DDW_PGSIZE_128M, __builtin_ctz(SZ_128M) },
+ { QUERY_DDW_PGSIZE_64M, __builtin_ctz(SZ_64M) },
+ { QUERY_DDW_PGSIZE_32M, __builtin_ctz(SZ_32M) },
+ { QUERY_DDW_PGSIZE_16M, __builtin_ctz(SZ_16M) },
+ { QUERY_DDW_PGSIZE_64K, __builtin_ctz(SZ_64K) },
+ { QUERY_DDW_PGSIZE_4K, __builtin_ctz(SZ_4K) }
+ };


cheers


--
Alexey