Re: Memory hotplug regression in 4.13

From: Michal Hocko
Date: Mon Sep 25 2017 - 08:58:32 EST


On Thu 21-09-17 00:40:34, Seth Forshee wrote:
> On Wed, Sep 20, 2017 at 11:29:31AM +0200, Michal Hocko wrote:
> > Hi,
> > I am currently at a conference so I will most probably get to this next
> > week but I will try to ASAP.
> >
> > On Tue 19-09-17 11:41:14, Seth Forshee wrote:
> > > Hi Michal,
> > >
> > > I'm seeing oopses in various locations when hotplugging memory in an x86
> > > vm while running a 32-bit kernel. The config I'm using is attached. To
> > > reproduce I'm using kvm with the memory options "-m
> > > size=512M,slots=3,maxmem=2G". Then in the qemu monitor I run:
> > >
> > > object_add memory-backend-ram,id=mem1,size=512M
> > > device_add pc-dimm,id=dimm1,memdev=mem1
> > >
> > > Not long after that I'll see an oops, not always in the same location
> > > but most often in wp_page_copy, like this one:
> >
> > This is rather surprising. How do you online the memory?
>
> The kernel has CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y.

OK, so the memory gets onlined automagically at the time it is
hot-added. Could you send the full dmesg?
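
If it helps to double check what actually got onlined, here is a rough
sketch (Python, untested on your setup) that dumps the per-block state
from sysfs. It assumes the usual layout under
/sys/devices/system/memory, i.e. memoryN/state files plus the
auto_online_blocks policy file. The function names are made up for
illustration only.

```python
from pathlib import Path

SYSFS_MEMORY = "/sys/devices/system/memory"  # standard sysfs location (assumed)


def memory_block_states(base=SYSFS_MEMORY):
    """Return {block_name: state} for every memoryN directory under base.

    Blocks are ordered by their numeric index, so memory2 sorts
    before memory10.
    """
    blocks = []
    for entry in Path(base).glob("memory[0-9]*"):
        index = entry.name[len("memory"):]
        state_file = entry / "state"
        # Skip anything that is not a memoryN block with a state file.
        if index.isdigit() and state_file.is_file():
            blocks.append((int(index), entry.name, state_file))
    return {name: f.read_text().strip() for _, name, f in sorted(blocks)}


def auto_online_policy(base=SYSFS_MEMORY):
    """Return the auto-online policy string, or None if the file is absent."""
    policy = Path(base) / "auto_online_blocks"
    return policy.read_text().strip() if policy.is_file() else None
```

Printing auto_online_policy() and memory_block_states() before and
after the device_add should show whether the new blocks went to
"online" on their own.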

> > > [ 24.673623] BUG: unable to handle kernel paging request at dffff000
> > > [ 24.675569] IP: wp_page_copy+0xa8/0x660
> >
> > could you resolve the IP into the source line?
>
> It seems I don't have that kernel anymore, but I've got a 4.14-rc1 build
> and the problem still occurs there. It's pointing to the call to
> __builtin_memcpy in memcpy (include/linux/string.h line 340), which we
> get to via wp_page_copy -> cow_user_page -> copy_user_highpage.
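
For reference, the "symbol+offset/size" form in the oops can be split
mechanically. The sketch below is only an illustration: the helper
names are hypothetical, and the symbol start address has to come from
your own build (e.g. from System.map), it is not something I know for
your kernel. The resulting absolute address is what a tool such as
addr2line would take together with the matching vmlinux.

```python
import re

# Matches strings such as "wp_page_copy+0xa8/0x660".
_IP_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)"
    r"/(?P<size>0x[0-9a-fA-F]+)$"
)


def parse_oops_ip(text):
    """Split 'symbol+0xOFF/0xSIZE' into (symbol, offset, size)."""
    match = _IP_RE.match(text.strip())
    if match is None:
        raise ValueError("not in symbol+offset/size form: %r" % text)
    return (
        match["symbol"],
        int(match["offset"], 16),
        int(match["size"], 16),
    )


def absolute_address(symbol_start, offset):
    """Add the offset to the symbol's start address (from System.map)."""
    return symbol_start + offset


symbol, offset, size = parse_oops_ip("wp_page_copy+0xa8/0x660")
# The offset must fall inside the function, otherwise the line is bogus.
assert offset < size
```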

Hmm, this is interesting. That would mean that we have successfully
mapped the destination page but its memory is still not accessible.

Right now I do not see how the patch you bisected to could make any
difference. It only postponed the onlining so that it is an independent
step, and your config simply onlines automatically, so there shouldn't
be any semantic change. Maybe there is an off-by-one somewhere.

I will try to investigate some more. Do you think it would be possible
to configure kdump on your system and provide me with the vmcore in some
way?
--
Michal Hocko
SUSE Labs