set_memory_rw on the kernel text

From: Michal Hocko
Date: Tue Aug 11 2015 - 10:23:25 EST


[reposted with the correct address and name]

Hi Suresh,
I was debugging an issue that the kernel text didn't get remapped RW
after set_memory_rw and generated a #PF even though set_memory_rw
returned with success (0). I am completely unfamiliar with the code
but it become clear from the code inspection that static_protections()
will drop _PAGE_RW from the protection flags with CONFIG_DEBUG_RODATA
for the large mappings. try_preserve_large_page will then interpret this
as no change is needed and return with 0 all the way up to the caller.

I can see the point that set_memory_rw doesn't allow remapping after
certain moment (kernel_set_to_readonly is non-zero) but the current
semantic with returning success even though the operation was ignored is
strange.

Shouldn't the function return -EPERM instead? So that the caller doesn't
try to write to the address and #PF? Something like a completely
untested.

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 89af288ec674..c1fcb02f9662 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -540,6 +540,10 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
cpa->pfn = pfn;

new_prot = static_protections(req_prot, address, pfn);
+ if (pgprot_val(new_prot) ^ pgprot_val(req_prot)) {
+ do_split = -EPERM;
+ goto out_unlock;
+ }

/*
* We need to check the full range, whether

I am also trying to understand why the semantic is different for 4k
pages. I can see that pmd prot change might influence different sections
in the same pmd range or something like that but why don't we simply
split the pmd then and make the 4k page RW?
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/