Regression seen when using MADV_FREE vs MADV_DONTNEED

From: Alexander Duyck
Date: Sat Dec 21 2019 - 13:26:15 EST


In v15 of my patch set, which can be found here
(https://lore.kernel.org/lkml/20191205161928.19548.41654.stgit@xxxxxxxxxxxxxxxxxxxxx/),
I introduced an RFC patch that used MADV_FREE in QEMU instead of
MADV_DONTNEED. When testing that I used a next-20191120 kernel on
the host. When preparing the numbers for my latest version I
updated the host to next-20191219, and that is where I encountered an
issue: MADV_FREE is significantly slower than MADV_DONTNEED when
QEMU uses it to report pages to the host kernel and the pages are
eventually faulted back into the guest. No regression was seen with
MADV_DONTNEED.

I just wanted to put it out there that something appears to have
added spinlock overhead as high as 60% on 16 cores when using MADV_FREE
to notify the system that a given transparent huge page isn't needed
and then eventually faulting the memory back in. I'll try to bisect this
as time permits, but thought I would mention it in case
somebody has already seen something similar and found the root cause.
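
For anyone who wants to poke at the pattern outside of QEMU, here is a
minimal userspace sketch of the sequence described above: populate an
anonymous region, hand it back with madvise(), then touch it again so
it is faulted back in. This is not the QEMU code path and it is
single-threaded, so it will not show the 16-core lock contention by
itself. The helper name advise_and_refault(), the REGION_SIZE value,
and the fallback definition of MADV_FREE are assumptions made for
illustration only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/* Assumption: older libc headers may lack MADV_FREE; 8 is the value
 * used by the generic Linux UAPI headers. */
#ifndef MADV_FREE
#define MADV_FREE 8
#endif

/* Hypothetical region size for a manual run (64 MiB). */
#define REGION_SIZE (64UL << 20)

/*
 * Populate an anonymous private mapping, release it with the given
 * madvise() advice, then write to it again so the pages are faulted
 * back in. Returns the seconds spent in the madvise + refault phase,
 * or -1.0 if mmap() or madvise() failed.
 */
static double advise_and_refault(int advice, size_t len)
{
    struct timespec start, end;
    double elapsed = -1.0;
    unsigned char *buf;

    assert(len > 0);

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1.0;

    /* Only a hint asking for transparent huge pages; failure is ignored. */
    (void)madvise(buf, len, MADV_HUGEPAGE);

    /* First touch: populate the region. */
    memset(buf, 0xa5, len);

    clock_gettime(CLOCK_MONOTONIC, &start);
    if (madvise(buf, len, advice) == 0) {
        /* Second touch: access the region again after the advice. */
        memset(buf, 0x5a, len);
        clock_gettime(CLOCK_MONOTONIC, &end);
        elapsed = (double)(end.tv_sec - start.tv_sec) +
                  (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    }

    munmap(buf, len);
    return elapsed;
}
```

Calling advise_and_refault(MADV_DONTNEED, REGION_SIZE) and
advise_and_refault(MADV_FREE, REGION_SIZE) from several threads at once
and comparing the returned times would be one way to approximate the
comparison. Note that MADV_FREE pages are only reclaimed lazily, so
without memory pressure the second touch may not fault at all.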

Thanks.

- Alex