Re: [PATCH v2 3/3] riscv: Fix crash when flushing executable ioremap regions

From: Alex Ghiti
Date: Thu Feb 20 2020 - 00:49:25 EST


Hi Jan,

On 2/16/20 2:56 PM, Alex Ghiti wrote:
On 2/16/20 11:05 AM, Jan Kiszka wrote:
On 16.02.20 15:41, Alex Ghiti wrote:
Hi Jan,

On 2/15/20 6:49 AM, Jan Kiszka wrote:
From: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>

Those are not backed by page structs, and pte_page is returning an
invalid pointer.

Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
---
  arch/riscv/mm/cacheflush.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 8930ab7278e6..9ee2c1a387cc 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -84,7 +84,8 @@ void flush_icache_pte(pte_t pte)
  {
      struct page *page = pte_page(pte);

-    if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+    if (!pfn_valid(pte_pfn(pte)) ||
+        !test_and_set_bit(PG_dcache_clean, &page->flags))
          flush_icache_all();
  }
  #endif /* CONFIG_MMU */
--
2.16.4



When did you encounter such a situation, i.e. executable code that is
not backed by a struct page?

RISC-V uses the generic implementation of ioremap, and the way
_PAGE_IOREMAP is defined does not allow mapping an executable memory
region using ioremap, so I'm interested to understand how we end up in
flush_icache_pte for an executable region not backed by any struct page.

You can create executable mappings of memory that Linux does not
initially consider as RAM via ioremap_prot or ioremap_page_range. We are
using that in Jailhouse to load the hypervisor code into reserved memory
that is ioremapped for the purpose. Works fine on x86, arm and arm64.

Jan

Ok thanks, I had missed this API.

Regarding your patch, I find it weird to do anything if the pfn is invalid: we could have garbage in the pte pointing to an invalid region, for example. (I admit that the effect of flushing the icache would not be catastrophic in that situation.)

I'm not saying I will come up with a better solution, but I'll take a deeper look tomorrow.

Alex


I took a look at the Jailhouse driver. After loading the hypervisor into the ioremapped region, it explicitly ensures icache/dcache consistency by calling flush_icache_range here:

https://github.com/siemens/jailhouse/blob/master/driver/main.c#L505
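As a rough userspace analogue of that "write the code, then flush the range yourself" pattern, here is a minimal sketch. It is not the Jailhouse code: load_code_image() is a hypothetical helper, and it assumes a GCC or Clang toolchain, whose __builtin___clear_cache() stands in for the kernel's flush_icache_range().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from Jailhouse: copy an instruction
 * image into a buffer, then make the instruction side coherent with
 * what was just written.
 */
static void load_code_image(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    /* The copy goes through the data side only. */
    memcpy(dst, src, len);

    /*
     * The writer flushes explicitly instead of relying on the page
     * table code to notice. GCC/Clang builtin; it may compile to
     * nothing on targets with coherent instruction caches.
     */
    __builtin___clear_cache(dst, dst + len);
}
```

The point is only that the party modifying the code owns the flush, which is what the driver already does for its ioremapped region.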

There seems to be an implicit (?) rule that in-kernel code modification must handle icache/dcache consistency itself:

In the arm64 set_pte_at definition, they do not sync icache/dcache when the pte is a kernel one:

https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/pgtable.h#L271

In mips, they do the same:

https://elixir.bootlin.com/linux/latest/source/arch/mips/mm/cache.c#L137

So funnily, I'd do the opposite of what you have done, the mips way:

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 8930ab7278e6..c90c8bb49109 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -84,6 +84,9 @@ void flush_icache_pte(pte_t pte)
 {
     struct page *page = pte_page(pte);

+    if (unlikely(!pfn_valid(pte_pfn(pte))))
+        return;
+
     if (!test_and_set_bit(PG_dcache_clean, &page->flags))
         flush_icache_all();
 }
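To make the difference between the two variants concrete, here is a small userspace model. It is a sketch under loud assumptions, not kernel code: every model_* name is hypothetical, "valid pfn" simply means an index into a tiny array, and the flush is only a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: pfns 0..3 are backed by a "struct page". */
#define MODEL_NR_PAGES 4UL

struct model_page {
    bool dcache_clean;          /* stands in for PG_dcache_clean */
};

static struct model_page model_mem_map[MODEL_NR_PAGES];
static unsigned int model_flush_count;  /* counts flush_icache_all() calls */

static bool model_pfn_valid(unsigned long pfn)
{
    return pfn < MODEL_NR_PAGES;
}

static void model_flush_icache_all(void)
{
    model_flush_count++;
}

/* Sets the clean flag and returns its previous value. */
static bool model_test_and_set_clean(struct model_page *page)
{
    bool old;

    assert(page != NULL);
    old = page->dcache_clean;
    page->dcache_clean = true;
    return old;
}

/* Variant from the v2 patch: an invalid pfn always triggers a flush. */
static void model_flush_icache_pte_flush_invalid(unsigned long pfn)
{
    if (!model_pfn_valid(pfn) ||
        !model_test_and_set_clean(&model_mem_map[pfn]))
        model_flush_icache_all();
}

/* Variant proposed in this mail: an invalid pfn is left to the caller. */
static void model_flush_icache_pte_skip_invalid(unsigned long pfn)
{
    if (!model_pfn_valid(pfn))
        return;

    if (!model_test_and_set_clean(&model_mem_map[pfn]))
        model_flush_icache_all();
}

/* Clears all flags and the counter between experiments. */
static void model_reset(void)
{
    size_t i;

    for (i = 0; i < MODEL_NR_PAGES; i++)
        model_mem_map[i].dcache_clean = false;
    model_flush_count = 0;
}
```

In this model the first variant flushes on every call for an unbacked pfn, since there is no page flag to remember that a flush already happened, while the second never flushes for it and relies on the caller's explicit flush. Both flush exactly once for a backed pfn.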

What do you think?

Alex