Re: [PATCH] KVM: selftests: Detect max PA width from cpuid
From: Peter Xu
Date: Mon Aug 26 2019 - 07:39:21 EST
On Mon, Aug 26, 2019 at 06:47:57PM +0800, Peter Xu wrote:
> On Mon, Aug 26, 2019 at 10:25:55AM +0200, Vitaly Kuznetsov wrote:
> > Peter Xu <peterx@xxxxxxxxxx> writes:
> >
> > > The dirty_log_test is failing on some old machines like Xeon E3-1220
> > > with triple faults when writting to the tracked memory region:
> >
> > s,writting,writing,
> >
> > >
> > > Test iterations: 32, interval: 10 (ms)
> > > Testing guest mode: PA-bits:52, VA-bits:48, 4K pages
> > > guest physical test memory offset: 0x7fbffef000
> > > ==== Test Assertion Failure ====
> > > dirty_log_test.c:138: false
> > > pid=6137 tid=6139 - Success
> > > 1 0x0000000000401ca1: vcpu_worker at dirty_log_test.c:138
> > > 2 0x00007f3dd9e392dd: ?? ??:0
> > > 3 0x00007f3dd9b6a132: ?? ??:0
> > > Invalid guest sync status: exit_reason=SHUTDOWN
> > >
> >
> > This patch breaks on my AMD machine with
> >
> > # cpuid -1 -l 0x80000008
> > CPU:
> > Physical Address and Linear Address Size (0x80000008/eax):
> > maximum physical address bits = 0x30 (48)
> > maximum linear (virtual) address bits = 0x30 (48)
> > maximum guest physical address bits = 0x0 (0)
> >
> >
> > Pre-patch:
> >
> > # ./dirty_log_test
> > Test iterations: 32, interval: 10 (ms)
> > Testing guest mode: PA-bits:52, VA-bits:48, 4K pages
> > guest physical test memory offset: 0x7fbffef000
> > Dirtied 139264 pages
> > Total bits checked: dirty (135251), clear (7991709), track_next (29789)
> >
> > Post-patch:
> >
> > # ./dirty_log_test
> > Test iterations: 32, interval: 10 (ms)
> > Testing guest mode: PA-bits:52, VA-bits:48, 4K pages
> > Supported guest physical address width: 48
> > guest physical test memory offset: 0xffffbffef000
> > ==== Test Assertion Failure ====
> > dirty_log_test.c:141: false
> > pid=77983 tid=77985 - Success
> > 1 0x0000000000401d12: vcpu_worker at dirty_log_test.c:138
> > 2 0x00007f636374358d: ?? ??:0
> > 3 0x00007f63636726a2: ?? ??:0
> > Invalid guest sync status: exit_reason=SHUTDOWN
>
> Vitaly,
>
> Are you using shadow paging? If so, could you try NPT=off?

Sorry, it should be s/shadow paging/NPT/...

[root@hp-dl385g10-10 peter]# ./dirty_log_test
Test iterations: 32, interval: 10 (ms)
Testing guest mode: PA-bits:52, VA-bits:48, 4K pages
Supported guest physical address width: 48
guest physical test memory offset: 0xffffbffef000
==== Test Assertion Failure ====
dirty_log_test.c:138: false
pid=5433 tid=5436 - Success
1 0x0000000000401cc1: vcpu_worker at dirty_log_test.c:138
2 0x00007f18977992dd: ?? ??:0
3 0x00007f18974ca132: ?? ??:0
Invalid guest sync status: exit_reason=SHUTDOWN
[root@hp-dl385g10-10 peter]# modprobe -r kvm_amd
[root@hp-dl385g10-10 peter]# modprobe kvm_amd npt=0
[root@hp-dl385g10-10 peter]# ./dirty_log_test
Test iterations: 32, interval: 10 (ms)
Testing guest mode: PA-bits:52, VA-bits:48, 4K pages
Supported guest physical address width: 48
guest physical test memory offset: 0xffffbffef000
Dirtied 102400 pages
Total bits checked: dirty (99021), clear (8027939), track_next (23425)
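
For reference, here is a rough sketch of the arithmetic behind the "Supported guest physical address width" and "guest physical test memory offset" lines above. It is not the selftest code: decode_leaf_80000008() and test_mem_offset() are hypothetical names, and the region size (1 GiB plus 16 pages of 4K) is an assumption I picked because it reproduces the offsets printed in the logs. The bit layout of CPUID 0x80000008 EAX matches the cpuid dump quoted earlier in the thread.

```c
#include <stdint.h>

/* Fields of CPUID leaf 0x80000008, register EAX. */
struct addr_widths {
	unsigned int pa_bits;       /* EAX[7:0]:   physical address bits */
	unsigned int va_bits;       /* EAX[15:8]:  linear (virtual) address bits */
	unsigned int guest_pa_bits; /* EAX[23:16]: guest physical address bits,
	                             * 0 means "same as pa_bits" */
};

/* Hypothetical helper: split a raw EAX value into its three width fields. */
static struct addr_widths decode_leaf_80000008(uint32_t eax)
{
	struct addr_widths w = {
		.pa_bits       = eax & 0xff,
		.va_bits       = (eax >> 8) & 0xff,
		.guest_pa_bits = (eax >> 16) & 0xff,
	};
	return w;
}

/*
 * Hypothetical helper: place a region of region_pages pages as high as
 * possible below the last guest frame addressable with pa_bits, and
 * return its guest physical start address.
 */
static uint64_t test_mem_offset(unsigned int pa_bits, unsigned int page_shift,
				uint64_t region_pages)
{
	uint64_t max_gfn = (1ULL << (pa_bits - page_shift)) - 1;

	return (max_gfn - region_pages) << page_shift;
}
```

With EAX = 0x00003030 (the values in Vitaly's dump) this gives 48/48/0. With pa_bits = 48 and the assumed region size, test_mem_offset() yields 0xffffbffef000, the post-patch offset. With pa_bits = 39 it yields 0x7fbffef000, which is the pre-patch offset, so the pre-patch runs look consistent with a 39-bit limit even though the mode line says PA-bits:52.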
>
> I finally found an AMD host and I also found that it's passing with
> shadow MMU mode which is strange. If so I would suspect it's a real
> bug in AMD NPT path but I'd like to see whether it's also happening on
> your side.
>
> Thanks,
--
Peter Xu