Re: [PATCH v6 14/14] riscv: Make mmap allocation top-down by default

From: Alex Ghiti
Date: Tue Oct 08 2019 - 07:58:30 EST


On 10/7/19 8:46 PM, Atish Patra wrote:
On Mon, 2019-10-07 at 05:11 -0400, Alex Ghiti wrote:
On 10/4/19 10:12 PM, Atish Patra wrote:
On Thu, 2019-08-08 at 02:17 -0400, Alexandre Ghiti wrote:
In order to avoid wasting user address space by using bottom-up mmap
allocation scheme, prefer top-down scheme when possible.

Before:
root@qemuriscv64:~# cat /proc/self/maps
00010000-00016000 r-xp 00000000 fe:00 6389 /bin/cat.coreutils
00016000-00017000 r--p 00005000 fe:00 6389 /bin/cat.coreutils
00017000-00018000 rw-p 00006000 fe:00 6389 /bin/cat.coreutils
00018000-00039000 rw-p 00000000 00:00 0 [heap]
1555556000-155556d000 r-xp 00000000 fe:00 7193 /lib/ld-2.28.so
155556d000-155556e000 r--p 00016000 fe:00 7193 /lib/ld-2.28.so
155556e000-155556f000 rw-p 00017000 fe:00 7193 /lib/ld-2.28.so
155556f000-1555570000 rw-p 00000000 00:00 0
1555570000-1555572000 r-xp 00000000 00:00 0 [vdso]
1555574000-1555576000 rw-p 00000000 00:00 0
1555576000-1555674000 r-xp 00000000 fe:00 7187 /lib/libc-2.28.so
1555674000-1555678000 r--p 000fd000 fe:00 7187 /lib/libc-2.28.so
1555678000-155567a000 rw-p 00101000 fe:00 7187 /lib/libc-2.28.so
155567a000-15556a0000 rw-p 00000000 00:00 0
3fffb90000-3fffbb1000 rw-p 00000000 00:00 0 [stack]

After:
root@qemuriscv64:~# cat /proc/self/maps
00010000-00016000 r-xp 00000000 fe:00 6389 /bin/cat.coreutils
00016000-00017000 r--p 00005000 fe:00 6389 /bin/cat.coreutils
00017000-00018000 rw-p 00006000 fe:00 6389 /bin/cat.coreutils
2de81000-2dea2000 rw-p 00000000 00:00 0 [heap]
3ff7eb6000-3ff7ed8000 rw-p 00000000 00:00 0
3ff7ed8000-3ff7fd6000 r-xp 00000000 fe:00 7187 /lib/libc-2.28.so
3ff7fd6000-3ff7fda000 r--p 000fd000 fe:00 7187 /lib/libc-2.28.so
3ff7fda000-3ff7fdc000 rw-p 00101000 fe:00 7187 /lib/libc-2.28.so
3ff7fdc000-3ff7fe2000 rw-p 00000000 00:00 0
3ff7fe4000-3ff7fe6000 r-xp 00000000 00:00 0 [vdso]
3ff7fe6000-3ff7ffd000 r-xp 00000000 fe:00 7193 /lib/ld-2.28.so
3ff7ffd000-3ff7ffe000 r--p 00016000 fe:00 7193 /lib/ld-2.28.so
3ff7ffe000-3ff7fff000 rw-p 00017000 fe:00 7193 /lib/ld-2.28.so
3ff7fff000-3ff8000000 rw-p 00000000 00:00 0
3fff888000-3fff8a9000 rw-p 00000000 00:00 0 [stack]

Signed-off-by: Alexandre Ghiti <alex@xxxxxxxx>
Acked-by: Paul Walmsley <paul.walmsley@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---
arch/riscv/Kconfig | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 59a4727ecd6c..87dc5370becb 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -54,6 +54,18 @@ config RISCV
select EDAC_SUPPORT
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
+ select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
+ select HAVE_ARCH_MMAP_RND_BITS
+
+config ARCH_MMAP_RND_BITS_MIN
+ default 18 if 64BIT
+ default 8
+
+# max bits determined by the following formula:
+# VA_BITS - PAGE_SHIFT - 3
+config ARCH_MMAP_RND_BITS_MAX
+ default 24 if 64BIT # SV39 based
+ default 17
config MMU
def_bool y
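
As a quick check of the "VA_BITS - PAGE_SHIFT - 3" formula quoted in the Kconfig comment above, here is a minimal sketch of the arithmetic. It assumes 4 KiB pages (PAGE_SHIFT = 12), 39 virtual address bits for Sv39 and 32 for the 32-bit case; the function names are made up for illustration and are not kernel symbols. The span calculation additionally assumes the random offset is counted in pages.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def mmap_rnd_bits_max(va_bits: int, page_shift: int = PAGE_SHIFT) -> int:
    """Upper bound on mmap randomization bits: VA_BITS - PAGE_SHIFT - 3."""
    return va_bits - page_shift - 3


def rnd_span_bytes(rnd_bits: int, page_shift: int = PAGE_SHIFT) -> int:
    """Size of the range covered by rnd_bits of randomness.

    Assumes the random value is a page count, so it is shifted
    left by page_shift to obtain a byte offset.
    """
    return 1 << (rnd_bits + page_shift)


for label, va_bits in (("Sv39 (64-bit)", 39), ("32-bit", 32)):
    bits = mmap_rnd_bits_max(va_bits)
    span_mib = rnd_span_bytes(bits) // (1 << 20)
    print(f"{label}: max {bits} bits, span {span_mib} MiB")
```

Under those assumptions the results agree with the defaults in the patch: 24 for Sv39 and 17 for 32-bit.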
With this patch, I am not able to boot Fedora Linux (a GNOME desktop
image) on RISC-V hardware (Unleashed + Microsemi Expansion board). The
boot gets stuck right after systemd starts.

https://paste.fedoraproject.org/paste/TOrUMqqKH-pGFX7CnfajDg

Reverting just this patch allows Fedora to boot successfully on the
specific RISC-V hardware. I have not root-caused the issue, but it looks
like it might have messed up the userspace mapping.
It might have messed up the userspace mapping, but not enough to break
userspace completely, since systemd does get some things done. I would
try to boot in legacy layout: if you can set the legacy_va_layout sysctl
at boot time, it will map userspace as it was before (bottom-up). If that
does not work, the problem could be the randomization that is now
activated by default.
Randomization may not be the issue. I just removed
ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT from the config and that seems to
work. Here is the bottom-up layout with randomization on.

Oops, sorry for my previous answer, I missed yours, which landed in another folder.

Removing ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT also removes randomization,
as this config selects ARCH_HAS_ELF_RANDOMIZE.
You could remove ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT and select
ARCH_HAS_ELF_RANDOMIZE by hand, but you would then have to implement
arch_mmap_rnd and arch_randomize_brk (elf-randomize.h).

The simplest would be to boot in legacy layout: I did not find a way to set this on the
kernel command line, but you can do so by modifying it directly in the code:

https://elixir.bootlin.com/linux/v5.4-rc2/source/kernel/sysctl.c#L269

[root@fedora-riscv ~]# cat /proc/self/maps
1555556000-1555570000 r-xp 00000000 103:01 280098 /usr/lib64/ld-2.28.so
1555570000-1555571000 r--p 00019000 103:01 280098 /usr/lib64/ld-2.28.so
1555571000-1555572000 rw-p 0001a000 103:01 280098 /usr/lib64/ld-2.28.so
1555572000-1555573000 rw-p 00000000 00:00 0
1555573000-1555575000 r-xp 00000000 00:00 0 [vdso]
1555575000-1555576000 r--p 00000000 103:01 50936 /usr/lib/locale/en_US.utf8/LC_IDENTIFICATION
1555576000-155557d000 r--s 00000000 103:01 280826 /usr/lib64/gconv/gconv-modules.cache
155557d000-155557e000 r--p 00000000 103:01 50937 /usr/lib/locale/en_US.utf8/LC_MEASUREMENT
155557e000-155557f000 r--p 00000000 103:01 50939 /usr/lib/locale/en_US.utf8/LC_TELEPHONE
155557f000-1555580000 r--p 00000000 103:01 3706 /usr/lib/locale/en_US.utf8/LC_ADDRESS
1555580000-1555581000 r--p 00000000 103:01 50944 /usr/lib/locale/en_US.utf8/LC_NAME
1555581000-1555582000 r--p 00000000 103:01 3775 /usr/lib/locale/en_US.utf8/LC_PAPER
1555582000-1555583000 r--p 00000000 103:01 3758 /usr/lib/locale/en_US.utf8/LC_MESSAGES/SYS_LC_MESSAGES
1555583000-1555584000 r--p 00000000 103:01 50938 /usr/lib/locale/en_US.utf8/LC_MONETARY
1555584000-1555585000 r--p 00000000 103:01 50940 /usr/lib/locale/en_US.utf8/LC_TIME
1555585000-1555586000 r--p 00000000 103:01 50945 /usr/lib/locale/en_US.utf8/LC_NUMERIC
1555590000-1555592000 rw-p 00000000 00:00 0
1555592000-15556b1000 r-xp 00000000 103:01 280105 /usr/lib64/libc-2.28.so
15556b1000-15556b5000 r--p 0011e000 103:01 280105 /usr/lib64/libc-2.28.so
15556b5000-15556b7000 rw-p 00122000 103:01 280105 /usr/lib64/libc-2.28.so
15556b7000-15556bb000 rw-p 00000000 00:00 0
15556bb000-1555933000 r--p 00000000 103:01 3755 /usr/lib/locale/en_US.utf8/LC_COLLATE
1555933000-1555986000 r--p 00000000 103:01 50942 /usr/lib/locale/en_US.utf8/LC_CTYPE
1555986000-15559a8000 rw-p 00000000 00:00 0
2aaaaaa000-2aaaab1000 r-xp 00000000 103:01 283975 /usr/bin/cat
2aaaab1000-2aaaab2000 r--p 00006000 103:01 283975 /usr/bin/cat
2aaaab2000-2aaaab3000 rw-p 00007000 103:01 283975 /usr/bin/cat
2aaaab3000-2aaaad4000 rw-p 00000000 00:00 0 [heap]
3fffc97000-3fffcb8000 rw-p 00000000 00:00 0 [stack]
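
The layouts quoted in this thread can be told apart mechanically. The sketch below is only a rough heuristic and not something the kernel provides: it parses /proc/<pid>/maps text and looks at the gap between the [stack] mapping and the nearest mapping below it. A small gap suggests the top-down layout (mmap region just under the stack); a large one suggests bottom-up. The function names and the 1/8 threshold are arbitrary assumptions chosen for illustration, and heavy randomization could defeat the threshold.

```python
def parse_maps(text):
    """Return (start, end, name) tuples parsed from /proc/<pid>/maps text."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            start_s, end_s = fields[0].split("-")
            start, end = int(start_s, 16), int(end_s, 16)
        except ValueError:
            continue  # not a mapping line (e.g. a shell prompt)
        name = fields[5] if len(fields) > 5 else ""
        regions.append((start, end, name))
    return regions


def guess_layout(text):
    """Heuristically classify a maps dump as 'top-down' or 'bottom-up'."""
    regions = parse_maps(text)
    stacks = [r for r in regions if r[2] == "[stack]"]
    if not stacks:
        return "unknown"
    stack_start = stacks[0][0]
    ends_below = [end for (_, end, _) in regions if end <= stack_start]
    if not ends_below:
        return "unknown"
    gap = stack_start - max(ends_below)
    # Assumption: threshold of 1/8 of the stack address is arbitrary.
    return "top-down" if gap < stack_start // 8 else "bottom-up"


# Last two lines of the "Before" and "After" dumps from the patch description.
BEFORE = """\
155567a000-15556a0000 rw-p 00000000 00:00 0
3fffb90000-3fffbb1000 rw-p 00000000 00:00 0 [stack]
"""
AFTER = """\
3ff7fff000-3ff8000000 rw-p 00000000 00:00 0
3fff888000-3fff8a9000 rw-p 00000000 00:00 0 [stack]
"""
print("before:", guess_layout(BEFORE))
print("after:", guess_layout(AFTER))
```

On a live system the same function could be fed the contents of /proc/self/maps instead of the quoted samples.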


Anyway, it's weird, since userspace should not depend on how its address
space is laid out.

If you can identify the program that stalls, that would be fantastic :)

It gets stuck while booting, so I am not sure how to figure out which
program stalls. It is difficult to tell from the boot log, as it gets
stuck at different places, but soon after systemd starts.

If you can attach to the running kernel, I would use the vmlinux-gdb.py commands
to figure out which processes are running (the lx-ps command in particular could
give us a hint). You can also add traces directly in the kernel and either use
the lx-dmesg command to print them from gdb or use your standard serial output:
I would then print task_struct->comm at context switch to see which process
is stuck.
To use the Python script, you need to recompile with DEBUG_INFO and
GDB_SCRIPTS enabled.

FYI, I have just booted a custom buildroot image based on kernel 5.4-rc2.

Let me know if I can do anything.

Alex

As the code is common to mips and arm now and I did not hear from them,
I imagine the problem comes from us.

Alex