Re: [PATCH 2/3] powperc/mm: read TLB Block Invalidate Characteristics

From: Laurent Dufour
Date: Fri Sep 13 2019 - 09:55:51 EST

Le 12/09/2019 Ã 16:16, Aneesh Kumar K.V a ÃcritÂ:
On 8/30/19 5:37 PM, Laurent Dufour wrote:
The PAPR document specifies the TLB Block Invalidate Characteristics which
is telling which couple base page size / page size is supported by the

A new set of feature is added to the mmu_psize_def structure to record per
base page size which page size is supported by H_BLOCK_REMOVE.

A new init service is added to read the characteristics. The size of the
buffer is set to twice the number of known page size, plus 10 bytes to
ensure we have enough place.

So this is not really the base page size/actual page size combination. This is related to H_BLOCK_REMOVE hcall, block size supported by that HCALL and what page size combination is supported with that specific block size.

I agree

We should add that TLB block invalidate characteristics format in this patch.

Sure, will do that in a comment inside the code.

Signed-off-by: Laurent Dufour <ldufour@xxxxxxxxxxxxx>
 arch/powerpc/include/asm/book3s/64/mmu.h | 3 +
 arch/powerpc/platforms/pseries/lpar.c | 107 +++++++++++++++++++++++
 2 files changed, 110 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 23b83d3593e2..675895dfe39f 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -12,11 +12,14 @@
 * sllp : is a bit mask with the value of SLB L || LP to be or'ed
ÂÂ *ÂÂÂÂÂÂÂÂÂÂÂ directly to a slbmte "vsid" value
 * penc : is the HPTE encoding mask for the "LP" field:
+ * hblk : H_BLOCK_REMOVE supported block size for this page size in
+ *ÂÂÂÂÂÂÂÂÂÂÂ segment who's base page size is that page size.
ÂÂ *
ÂÂ */
 struct mmu_psize_def {
ÂÂÂÂÂ unsigned intÂÂÂ shift;ÂÂÂ /* number of bits */
ÂÂÂÂÂ unsigned intÂÂÂ tlbiel;ÂÂÂ /* tlbiel supported for that page size */
ÂÂÂÂÂ unsigned longÂÂÂ avpnm;ÂÂÂ /* bits to mask out in AVPN in the HPTE */
ÂÂÂÂÂ union {
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 4f76e5f30c97..375e19b3cf53 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -1311,6 +1311,113 @@ static void do_block_remove(unsigned long number, struct ppc64_tlb_batch *batch,
ÂÂÂÂÂÂÂÂÂ (void)call_block_remove(pix, param, true);
+static inline void __init set_hblk_bloc_size(int bpsize, int psize,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ unsigned int block_size)
+ÂÂÂ struct mmu_psize_def *def = &mmu_psize_defs[bpsize];
+ÂÂÂ if (block_size > def->hblk[psize])
+ÂÂÂÂÂÂÂ def->hblk[psize] = block_size;
+static inline void __init check_lp_set_hblk(unsigned int lp,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ unsigned int block_size)
+ÂÂÂ unsigned int bpsize, psize;
+ÂÂÂ /* First, check the L bit, if not set, this means 4K */
+ÂÂÂ if ((lp & 0x80) == 0) {

What is that 0x80? We should have #define for most of those.

I will make that more explicit through a define

+ÂÂÂÂÂÂÂ set_hblk_bloc_size(MMU_PAGE_4K, MMU_PAGE_4K, block_size);
+ÂÂÂÂÂÂÂ return;
+ÂÂÂ }
+ÂÂÂ /* PAPR says to look at bits 2-7 (0 = MSB) */
+ÂÂÂ lp &= 0x3f;

Also convert that to #define?

Really ? The comment above is explicitly saying that we are looking at bits 2-7. A define will obfuscate that.

+ÂÂÂ for (bpsize = 0; bpsize < MMU_PAGE_COUNT; bpsize++) {
+ÂÂÂÂÂÂÂ struct mmu_psize_def *def =Â &mmu_psize_defs[bpsize];
+ÂÂÂÂÂÂÂ for (psize = 0; psize < MMU_PAGE_COUNT; psize++) {
+ÂÂÂÂÂÂÂÂÂÂÂ if (def->penc[psize] == lp) {
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ set_hblk_bloc_size(bpsize, psize, block_size);
+ÂÂÂ }
+static int __init read_tlbbi_characteristics(void)
+ÂÂÂ int call_status;
+ÂÂÂ unsigned char local_buffer[SPLPAR_TLB_BIC_MAXLENGTH];
+ÂÂÂ int len, idx, bpsize;
+ÂÂÂ if (!firmware_has_feature(FW_FEATURE_BLOCK_REMOVE)) {
+ÂÂÂÂÂÂÂ pr_info("H_BLOCK_REMOVE is not supported");
+ÂÂÂÂÂÂÂ return 0;
+ÂÂÂ }
+ÂÂÂ memset(local_buffer, 0, SPLPAR_TLB_BIC_MAXLENGTH);
+ÂÂÂ spin_lock(&rtas_data_buf_lock);
+ÂÂÂ memset(rtas_data_buf, 0, RTAS_DATA_BUF_SIZE);
+ÂÂÂ call_status = rtas_call(rtas_token("ibm,get-system-parameter"), 3, 1,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ __pa(rtas_data_buf),
+ÂÂÂ memcpy(local_buffer, rtas_data_buf, SPLPAR_TLB_BIC_MAXLENGTH);
+ÂÂÂ local_buffer[SPLPAR_TLB_BIC_MAXLENGTH - 1] = '\0';
+ÂÂÂ spin_unlock(&rtas_data_buf_lock);
+ÂÂÂ if (call_status != 0) {
+ÂÂÂÂÂÂÂ pr_warn("%s %s Error calling get-system-parameter (0x%x)\n",
+ÂÂÂÂÂÂÂÂÂÂÂ __FILE__, __func__, call_status);
+ÂÂÂÂÂÂÂ return 0;
+ÂÂÂ }
+ÂÂÂ /*
+ÂÂÂÂ * The first two (2) bytes of the data in the buffer are the length of
+ÂÂÂÂ * the returned data, not counting these first two (2) bytes.
+ÂÂÂÂ */
+ÂÂÂ len = local_buffer[0] * 256 + local_buffer[1] + 2;
+ÂÂÂÂÂÂÂ pr_warn("%s too large returned buffer %d", __func__, len);
+ÂÂÂÂÂÂÂ return 0;
+ÂÂÂ }
+ÂÂÂ idx = 2;
+ÂÂÂ while (idx < len) {
+ÂÂÂÂÂÂÂ unsigned int block_size = local_buffer[idx++];
+ÂÂÂÂÂÂÂ unsigned int npsize;
+ÂÂÂÂÂÂÂ if (!block_size)
+ÂÂÂÂÂÂÂ block_size = 1 << block_size;
+ÂÂÂÂÂÂÂ if (block_size != 8)
+ÂÂÂÂÂÂÂÂÂÂÂ /* We only support 8 bytes size TLB invalidate buffer */
+ÂÂÂÂÂÂÂÂÂÂÂ pr_warn("Unsupported H_BLOCK_REMOVE block size : %d\n",
+ÂÂÂÂÂÂÂ for (npsize = local_buffer[idx++];Â npsize > 0; npsize--)
+ÂÂÂÂÂÂÂÂÂÂÂ check_lp_set_hblk((unsigned int) local_buffer[idx++],
+ÂÂÂ }
+ÂÂÂ for (bpsize = 0; bpsize < MMU_PAGE_COUNT; bpsize++)
+ÂÂÂÂÂÂÂ for (idx = 0; idx < MMU_PAGE_COUNT; idx++)
+ÂÂÂÂÂÂÂÂÂÂÂ if (mmu_psize_defs[bpsize].hblk[idx])
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ pr_info("H_BLOCK_REMOVE supports base psize:%d psize:%d block size:%d",
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ mmu_psize_defs[bpsize].hblk[idx]);
+ÂÂÂ return 0;
+machine_arch_initcall(pseries, read_tlbbi_characteristics);

Why a machine_arch_initcall() ? Can't we do this similar to how we do segment-page-size parsing from device tree? Also this should be hash translation mode specific.

Because that code is specific to the pseries architecture. the hash translation is not pseries specific.

Indeed the change in mmu_psize_defs is not too generic. The hblk characteristics should remain static to the lpar.c file where it is used.

ÂÂ * Take a spinlock around flushes to avoid bouncing the hypervisor tlbie
ÂÂ * lock.