Re: [pmem_attach_disk] WARNING: CPU: 46 PID: 518 at kernel/memremap.c:363 devm_memremap_pages+0x350/0x4b0

From: Fengguang Wu
Date: Tue Oct 31 2017 - 03:08:29 EST


CC Ying and Aaron for Dan's tips on nvdimm testing.

On Mon, Oct 30, 2017 at 05:24:42PM -0700, Dan Williams wrote:
On Mon, Oct 30, 2017 at 5:00 PM, Fengguang Wu <fengguang.wu@xxxxxxxxx> wrote:
Hi Dan,

On Mon, Oct 30, 2017 at 08:59:46AM -0700, Dan Williams wrote:

On Mon, Oct 30, 2017 at 12:40 AM, Fengguang Wu <fengguang.wu@xxxxxxxxx> wrote:


CC nvdimm maintainers.

On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:


Hi Linus,

So far we have seen the boot errors/warnings below when testing v4.14-rc6.

They reached the RC release mainly because of various imperfections in
0day's auto-bisection. So I am listing them here manually and will CC the
ones that look easy to debug to the corresponding maintainers in follow-up
emails.

boot_successes: 4700
boot_failures: 247



[...]

WARNING:at_kernel/memremap.c:#devm_memremap_pages: 1



Bisect failed, hope it's not hard to debug:

Start
[ 18.989316] devm_memremap_pages attempted on mixed region [mem 0x680000000-0x103dffffff flags 0x200]



This appears to be a problem in the test environment. "Persistent
Memory" can only be specified on a minimum of a 128MB boundary if it
intersects "System RAM". Assuming I did my math right this appears to
end on a 16MB boundary. Fixing this problem in the kernel would require
this patch set:

"mm: sub-section memory hotplug support":
https://lwn.net/Articles/707908/
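
The alignment reasoning above can be checked directly from the range printed in the warning. What follows is a minimal sketch in Python, not kernel code: the start and end addresses are copied from the dmesg line, the 128MB figure is taken from the message above, and the helper name `offset_into` is made up for illustration.

```python
# Range reported by the devm_memremap_pages warning (the end is inclusive).
START = 0x680000000
END = 0x103dffffff

MiB = 1 << 20
SECTION = 128 * MiB  # the "128MB boundary" mentioned in the thread


def offset_into(addr, boundary):
    """Bytes by which addr overshoots the previous multiple of boundary."""
    return addr % boundary


end_exclusive = END + 1  # first byte past the region

print("start offset into 128MB block:", offset_into(START, SECTION) // MiB, "MiB")
print("end offset into 128MB block:", offset_into(end_exclusive, SECTION) // MiB, "MiB")
print("end on a 16MB boundary:", offset_into(end_exclusive, 16 * MiB) == 0)
```

With these values the start sits exactly on a 128MB boundary, while the end falls 96 MiB past one and is a multiple of 16MB, which is consistent with the "mixed region" diagnosis.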


Good to know that!

...but I have abandoned / pushed that to the back of my queue since
the BIOS-induced version of this problem does not appear to trigger in
practice. I assume this test is using memmap=ss!nn?


Yes, we used

memmap=104G!26G memmap=104G!154G

The warning showed up only once -- attached is the full dmesg.
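
To see which physical ranges those two options request, the sizes and offsets can be expanded by hand. The sketch below is Python, not the kernel's parser. It assumes the `nn[KMG]!ss[KMG]` form from the kernel command-line documentation (size first, then start address), and the helper names `parse_size` and `parse_memmap` are hypothetical.

```python
import re

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a string such as '104G' into a byte count."""
    m = re.fullmatch(r"(\d+)([KMG])", text)
    if m is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(m.group(1)) * UNITS[m.group(2)]


def parse_memmap(option):
    """Return (start, end_inclusive) for a 'memmap=nn!ss' option."""
    value = option.split("=", 1)[1]
    size, start = (parse_size(part) for part in value.split("!"))
    return start, start + size - 1


for opt in ("memmap=104G!26G", "memmap=104G!154G"):
    start, end = parse_memmap(opt)
    # A range is 1G-aligned when both its start and the byte after its end are.
    aligned = start % UNITS["G"] == 0 and (end + 1) % UNITS["G"] == 0
    print(f"{opt}: {start:#x}-{end:#x} 1G-aligned={aligned}")
```

Under that assumption the first option requests 0x680000000-0x207fffffff. The start matches the range in the warning, but the reported end, 0x103dffffff, is much lower and not 1G-aligned, so the range seems to have been trimmed somewhere along the way. The thread does not say where.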


Something is going wrong with memmap= because you are not getting 1G
aligned address ranges. I think you would have better luck switching
to the official nvdimm emulation in qemu-kvm rather than relying on
memmap= which is just a fragile / unreliable interface. In fact we
should look to deprecate it and point everyone to use the standard
methods. We just have a problem of legacy pre-ACPI6 platforms that
have no other way than a kernel command line to identify persistent
memory ranges.

Got it. Thank you for the tips!

Thanks,
Fengguang