Re: DAX can not work on virtual nvdimm device

From: Ross Zwisler
Date: Wed Aug 31 2016 - 12:49:58 EST

On Wed, Aug 31, 2016 at 04:44:47PM +0800, Xiao Guangrong wrote:
> On 08/31/2016 01:09 AM, Dan Williams wrote:
> >
> > Can you post your exact reproduction steps? This test is not failing for me.
> >
> Sure.
> 1. make the guest kernel based on your tree, the top commit is
> 10d7902fa0e82b (dax: unmap/truncate on device shutdown) and
> the config file can be found in this thread.
> 2. add guest kernel command line: memmap=6G!10G
> 3: start the guest:
> x86_64-softmmu/qemu-system-x86_64 -machine pc,nvdimm --enable-kvm \
> -smp 16 -m 32G,maxmem=100G,slots=100 /other/VMs/centos6.img -monitor stdio
> 4: in guest:
> mkfs.ext4 /dev/pmem0
> mount -o dax /dev/pmem0 /mnt/pmem/
> echo > /mnt/pmem/xxx
> ./mmap /mnt/pmem/xxx
> ./read /mnt/pmem/xxx
> The source code of mmap and read has been attached in this mail.
> Hopefully, you can detect the error triggered by read test.
> Thanks!

I'm still unable to reproduce this issue.

I'm using a version of QEMU that I compiled at this commit:

bfc766d (HEAD, tag: v2.6.0) Update version for v2.6.0 release

Here are the options I used for the compile:

./configure --prefix=/home/rzwisler/qemu --target-list=x86_64-softmmu \
    --enable-kvm --enable-spice --enable-libusb --enable-usb-redir

I used the kernel commit and kernel config you provided. The memmap= kernel
command line setting is the same, as are the QEMU command line parameters.

With all this, the tests you provided give the following output:

# ./mmap /mnt/pmem/xxx
mmap test on /mnt/pmem/xxx.
Try to write 0x7f160072d000 for 1000 size.
Write Done.
Try to read 0x7f160072d000 for 1000 size.
Read Done.
End: 1000.
Try to fread fd=3 size 1000 sizeof(buf) 1.
Fread Done.

# ./read /mnt/pmem/xxx
test on /mnt/pmem/xxx.
<snip a bunch of garbage read output>
Good Read.

I'm not sure what else to look at. What do you see in /proc/cpuinfo? Perhaps
our virtual machine CPUs are advertising different features, and we are going
down different code paths?

Here are my cpuinfo flags in my guest:

flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl eagerfpu pni cx16
x2apic hypervisor lahf_lm
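
To compare the two flag lists without eyeballing them, a rough sketch along
these lines might help. The function names cpu_flags and diff_cpu_flags are
made up for illustration, and it assumes each guest's /proc/cpuinfo has been
saved to a file first:

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of any existing tool).

# Print the CPU feature flags from a cpuinfo-style file, one per line, sorted.
# Only the first "flags" line is read; this assumes all CPUs in the guest
# report the same set.
cpu_flags() {
    awk -F: '/^flags/ {
        n = split($2, f, " ")
        for (i = 1; i <= n; i++) print f[i]
        exit
    }' "$1" | sort -u
}

# Show the flags that appear in only one of two cpuinfo dumps.
# "< flag" is only in the first file, "> flag" is only in the second.
diff_cpu_flags() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    cpu_flags "$1" > "$a"
    cpu_flags "$2" > "$b"
    comm -23 "$a" "$b" | sed 's/^/< /'
    comm -13 "$a" "$b" | sed 's/^/> /'
    rm -f "$a" "$b"
}
```

For example, after copying /proc/cpuinfo out of each guest as mine.cpuinfo and
yours.cpuinfo (placeholder names), "diff_cpu_flags mine.cpuinfo yours.cpuinfo"
prints only the flags that differ. No output means the two flag sets match.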

Another thing to do would be to run your test on bare metal on the same
machine and see if you get different results.

- Ross