-----Original Message-----
From: Baoquan He [mailto:bhe@xxxxxxxxxx]
Sent: Wednesday, May 16, 2018 11:18 PM
To: Travis, Mike <mike.travis@xxxxxxx>; Anderson, Russ
<russ.anderson@xxxxxxx>; Ramsay, Frank <frank.ramsay@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
x86@xxxxxxxxxx; mingo@xxxxxxxxxx; tglx@xxxxxxxxxxxxx; hpa@xxxxxxxxx;
thgarnie@xxxxxxxxxx; keescook@xxxxxxxxxxxx; akpm@linux-
foundation.org; yamada.masahiro@xxxxxxxxxxxxx; Sivanich, Dimitri
<dimitri.sivanich@xxxxxxx>; dyoung@xxxxxxxxxx
Subject: Re: [PATCH v2 RESEND 2/2] x86/mm/KASLR: Do not adapt the size of
the direct mapping section for SGI UV system
Hi Mike, Russ and Frank,
On 09/28/17 at 07:10am, Mike Travis wrote:
kernel_randomize_memory(void)
On 9/28/2017 2:01 AM, Ingo Molnar wrote:
* Baoquan He <bhe@xxxxxxxxxx> wrote:
@@ -123,7 +124,7 @@ void __init
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
memory *//* Adapt phyiscal memory region size based on available
architecture.- if (memory_tb < kaslr_regions[0].size_tb)This is really an ugly hack. Is kaslr_regions[] incorrect? If so
+ if (memory_tb < kaslr_regions[0].size_tb &&
+!is_early_uv_system())
kaslr_regions[0].size_tb = memory_tb;
then it should be corrected instead of uglifying the code that uses it...
Thanks for looking into this!
If on SGI UV system, the kaslr_regions[0].size_tb, namely the size
of the direct mapping section, is incorrect.
Its direct mapping size includes two parts:
#1 RAM size of system
#2 MMIOH region size which only SGI UV system has.
However, the #2 can only be got till uv_system_init() is called in
native_smp_prepare_cpus(). That is too late for mm KASLR calculation.
That's why I made this hack.
I checked uv_system_init() code, seems not easy to know the size
of MMIOH region before or inside kernel_randomize_memory(). I have
CCed UV devel experts, not sure if they have any idea about this.
Otherwise, this patch could be the only way I can think of.
Hi Mike and Russ,
Is there any chance we can get the size of MMIOH region before mm
KASLR code, namely before we call kernel_randomize_memory()?
The sizes of the MMIOL and MMIOH areas are tied into the HUB design
and how it is communicated to BIOS and the kernel. This is via some
of the config MMR's found in the HUB and it would be impossible to
provide any access to these registers as they change with each new UV
The kernel does reserve the memory in the EFI memmap. I can send you
a console log of the full startup that includes the MMIOH
reservations. Note that it is dependent on what I/O devices are
actually present as UV does not map empty slots unless forced (because
we'd quickly run out of resources.) Also, the EFI memmap entries do
not specify the exact usage of the contained areas.
This one is still a regression bug in our newer rhel since I just fixed them with
rhel-only patch. Now I still need the console log which includes the MMIOH
reservations.
Does the system need to have an external IO device for this? If not you should just be able to boot one of the SGI UV systems in the beaker lab (possibly also the HPE Superdome Flex that is in beaker; hpe-flex-01.rhts.eng.bos.redhat.com)
Could you help provide a console log with MMIOH info, or I need request
one from redhat's lab?
Or expert from HPE UV team can make a patch based on the finding and
analysis?
Thanks
Baoquan
on UV systems.
I don't mind system specific quirks to hardware enumeration details,
as long as they don't pollute generic code with such special hacks.
I.e. in this case it's wrong to allow kaslr_regions[0].size_tb to be
wrong. Any other code that relies on it in the future will be wrong as well
memory
Which may come into play on other arches with the new upcoming
technologies.
The right quirk would be to fix that up where it gets introduced, or
something like that.
Yes, does make sense.
Thanks,
Ingo