Re: [PATCH] mm/hotplug: Only respect mem= parameter during boot stage

From: Baoquan He
Date: Tue Dec 10 2019 - 05:43:15 EST


On 12/10/19 at 11:28am, Michal Hocko wrote:
> On Tue 10-12-19 15:24:53, Baoquan He wrote:
> > On 12/09/19 at 11:07am, Michal Hocko wrote:
> > > On Fri 06-12-19 23:05:24, Baoquan He wrote:
> > > > In commit 357b4da50a62 ("x86: respect memory size limiting via mem=
> > > > parameter") a global varialbe global max_mem_size is added to store
> > > > the value which is parsed from 'mem= '. This truly stops those
> > > > DIMM from being added into system memory during boot.
> > > >
> > > > However, it also limits the later memory hotplug functionality. Any
> > > > memory board can't be hot added any more if its region is beyond the
> > > > max_mem_size. System will print error like below:
> > > >
> > > > [ 216.387164] acpi PNP0C80:02: add_memory failed
> > > > [ 216.389301] acpi PNP0C80:02: acpi_memory_enable_device() error
> > > > [ 216.392187] acpi PNP0C80:02: Enumeration failure
> > > >
> > > > >From document of 'mem =' parameter, it should be a restriction during
> > > > boot, but not impact the system memory adding/removing after booting.
> > > >
> > > > mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
> > > >
> > > > So fix it by also checking if it's during SYSTEM_BOOTING stage when
> > > > restrict memory adding. Otherwise, skip the restriction.
> > >
> > > Could you be more specific about why the boot vs. later hotplug makes
> > > any difference? The documentation is explicit about the boot time but
> > > considering this seems to be like that since ever I strongly suspect
> > > that this is just an omission.
> >
> > I think the 'mem=' updating in commit 357b4da50a62 will only affect
> > those hotplugable memory regions. When I tested it, there are three
> > memmory boards, one is the normal memory region with 4G memory, and the
> > other two are hotpluggable memory boards and recorded in ACPI tables,
> > each is 1GB. When put all them three onsite before boot, they will be
> > recognized by firmware and written into e820 table and/or EFI table, then
> > kernel can read them from e820 and them as system memory, we get 6G
> > memory.
> >
> > However, if add 'mem=', like 'mme=3G', w/o commit 357b4da50a62, in e820,
> > we will only get 3G memory. Later in acpi_init(), acpi scanning will
> > search those two memory regions, and try to add them into system call
> > because the two hotpluggable memory boards are power on and ready.
> > Then we will get 3G + 1G + 1G, 5G memory. the 1st 3G is from the normal
> > memory board, its last 1G is trimmed. Jurgen's patch is trying to fix this
> > because the adding happens during boot time, and conflicts with 'mem='.
>
> Unless I misunderstand what you are saying this all is just expected.
> You have restricted the memory explicitly and the result is that not all
> the memory is visible.

Yes.

>
> > But after system bootup, we should be able to hot add/remove any memory
> > board. This should not be restricted by a boot-time kernel parameter
> > 'mme='. This is what I am trying to fix.
>
> This is a simple statement without any actual explanation on why. Why is
> hotplug memory special? What is the usecase? Who would want to use mem
> parameter and later expect a memory above the restrected area to be
> hotplugable?

The why is 'mem=' is used to restrict the amount of system ram during
boot. We have two ways to add system memory, one is installing DIMMs
before boot, the other is hot adding memory after boot. Without David's
use case, we may need redefine 'mem=' and change its documentation in
kernel-parameters.txt, if we don't want to fix it like this. Otherwise,
'mem=' will limit the system's upper system ram always, that is not
expected.

>
> David has provided an actual usecase [1] but this needs to be documented
> somewhere so that we do not break that accidentally in the future.
> Ideally both in code which adds the boot restriction and the kernel
> command line documentation to be explicit about BOOT restriction.
>
> [1] http://lkml.kernel.org/r/429622cf-f0f4-5d80-d39d-b0d8a6c6605f@xxxxxxxxxx

Yes, agree. As I replied in the v2 thread, I will add that into log, and
also will change text of 'mem=' in kernel-parameters.txt.

Thanks
Baoquan