Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf

From: Linus Torvalds
Date: Thu Mar 09 2017 - 12:48:53 EST


On Thu, Mar 9, 2017 at 6:53 AM, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> Fwiw, I tried switching from using cr4
> (__native_flush_tlb_global_irq_disabled())
> to slower cr3 (__native_flush_tlb()) in "-cpu kvm64" mode, and it looks like
> it also lets all test cases pass (rodata_test, test_setmem, test_bpf), no
> corruption happening, etc.

Ok. I think this is conclusive: the qemu "-cpu kvm64" case is
definitely broken, since changing CR4.PGE is definitely
architecturally defined to flush all TLB entries.

This is not a guest kernel bug.

Of course, the bug may still be in the *host* kernel. Maybe the
emulation does something wrong. I see

        if (((cr4 ^ old_cr4) & pdptr_bits) ||
            (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
                kvm_mmu_reset_context(vcpu);

(where pdptr_bits includes the PGE bit), but I'm not sure if emulation
is supposed to do something else too.
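
As a rough illustration of the check quoted above, here is a minimal
user-space sketch of that condition applied to a PGE toggle (clear
CR4.PGE, then restore it). The helper names cr4_write_resets_mmu() and
pge_toggle_reset_count() are hypothetical, and pdptr_bits is reduced to
the PGE bit alone, since that is the only member of the mask named
here; the real mask presumably has more bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions: PGE is bit 7, PCIDE is bit 17. */
#define X86_CR4_PGE   (UINT64_C(1) << 7)
#define X86_CR4_PCIDE (UINT64_C(1) << 17)

/*
 * Assumption: the host kernel's mask contains more bits than this.
 * Only PGE is confirmed by the mail, so only PGE is modelled here.
 */
static const uint64_t pdptr_bits = X86_CR4_PGE;

/* Mirrors the quoted condition: would this CR4 write reset the MMU context? */
static bool cr4_write_resets_mmu(uint64_t old_cr4, uint64_t cr4)
{
	return ((cr4 ^ old_cr4) & pdptr_bits) ||
	       (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE));
}

/*
 * Hypothetical model of a guest global flush: clear PGE, then restore it.
 * Returns how many of the two writes satisfy the reset condition.
 */
static int pge_toggle_reset_count(uint64_t guest_cr4)
{
	uint64_t cleared = guest_cr4 & ~X86_CR4_PGE;
	int resets = 0;

	/* The toggle is only meaningful when PGE is currently set. */
	assert(guest_cr4 & X86_CR4_PGE);

	resets += cr4_write_resets_mmu(guest_cr4, cleared);
	resets += cr4_write_resets_mmu(cleared, guest_cr4);
	return resets;
}
```

Under these assumptions both writes of the toggle satisfy the
condition, so the emulation path does reach kvm_mmu_reset_context().
The sketch says nothing about whether that call is sufficient to drop
all stale translations, which is the open question.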

Linus