Re: [PATCH REBASE v2 0/2] Hugetlbfs support for riscv

From: Alex Ghiti
Date: Sun Jun 30 2019 - 11:31:52 EST



On 6/13/19 1:18 AM, Alex Ghiti wrote:
Hi Paul, Palmer,

Now Ingo and Catalin have acked their part, do you consider this patch
for inclusion ?

Thanks,

Alex


Hi Paul, Palmer,

Any thought about this patch ?

Thanks,

Alex


On 5/26/19 8:50 AM, Alexandre Ghiti wrote:
This series is simply rebased on v5.2rc1 and I added the Reviewed-By
from Palmer for the first patch, thanks for that.

This series introduces hugetlbfs support for both riscv 32/64. Riscv32
is architecturally limited to huge pages of size 4MB whereas riscv64 has
2MB/1G huge pages support. Transparent huge page support is not
implemented here, I will submit another series later.
As stated in "The RISC-V Instruction Set Manual, Volume II: Privileged
Architecture", riscv page table entries are marked as non-leaf entries
as soon as at least one of the R/W/X bit set:
- pmd_huge/pud_huge check if one of those bits are set,
- pte_mkhuge simply returns the same pte value and does not set any of
   the R/W/X bits
This series was validated using libhugetlbfs testsuite ported to riscv64
without linker script support.
(https://github.com/AlexGhiti/libhugetlbfs.git, branch dev/alex/riscv).
- libhugetlbfs testsuite on riscv64/2M:
   - brk_near_huge triggers an assert in malloc.c, does not on x86.
- libhugetlbfs testsuite on riscv64/1G:
   - brk_near_huge triggers an assert in malloc.c, does not on x86.
   - mmap-gettest, mmap-cow: testsuite passes the number of default free
     pages as parameters and then fails for 1G which is not the default.
     Otherwise succeeds when given the right number of pages.
   - map_high_truncate_2 fails on x86 too: 0x60000000 is not 1G aligned
     and fails at line 694 of fs/hugetlbfs/inode.c.
   - heapshrink on 1G fails on x86 too, not investigated.
   - counters.sh on 1G fails on x86 too: alloc_surplus_huge_page returns
     NULL in case of gigantic pages.
   - icache-hygiene succeeds after patch #3 of this series which lowers
     the base address of mmap.
   - fallocate_stress.sh on 1G never ends, on x86 too, not investigated.
- libhugetlbfs testsuite on riscv32/4M: kernel build passes, lacks
   libhugetlbfs support for 32bits.
* Output for riscv64 2M and 1G libhugetbfs testsuite:
zero_filesize_segment (2M: 64):
zero_filesize_segment (1024M: 64):
test_root (2M: 64):     PASS
test_root (1024M: 64):  PASS
meminfo_nohuge (2M: 64):        PASS
meminfo_nohuge (1024M: 64):     PASS
gethugepagesize (2M: 64):       PASS
gethugepagesize (1024M: 64):    PASS
gethugepagesizes (2M: 64):      PASS
gethugepagesizes (1024M: 64):   PASS
HUGETLB_VERBOSE=1 empty_mounts (2M: 64):        PASS
HUGETLB_VERBOSE=1 empty_mounts (1024M: 64):     PASS
HUGETLB_VERBOSE=1 large_mounts (2M: 64):        PASS
HUGETLB_VERBOSE=1 large_mounts (1024M: 64):     PASS
find_path (2M: 64):     PASS
find_path (1024M: 64):  PASS
unlinked_fd (2M: 64):   PASS
unlinked_fd (1024M: 64):        PASS
readback (2M: 64):      PASS
readback (1024M: 64):   PASS
truncate (2M: 64):      PASS
truncate (1024M: 64):   PASS
shared (2M: 64):        PASS
shared (1024M: 64):     PASS
mprotect (2M: 64):      PASS
mprotect (1024M: 64):   PASS
mlock (2M: 64): PASS
mlock (1024M: 64):      PASS
misalign (2M: 64):      PASS
misalign (1024M: 64):   PASS
fallocate_basic.sh (2M: 64):    PASS
fallocate_basic.sh (1024M: 64): PASS
fallocate_align.sh (2M: 64):    PASS
fallocate_align.sh (1024M: 64): PASS
ptrace-write-hugepage (2M: 64): PASS
ptrace-write-hugepage (1024M: 64):      PASS
icache-hygiene (2M: 64):        PASS
icache-hygiene (1024M: 64):     PASS
slbpacaflush (2M: 64):  PASS (inconclusive)
slbpacaflush (1024M: 64):       PASS (inconclusive)
straddle_4GB_static (2M: 64):   PASS
straddle_4GB_static (1024M: 64):        PASS
huge_at_4GB_normal_below_static (2M: 64):       PASS
huge_at_4GB_normal_below_static (1024M: 64):    PASS
huge_below_4GB_normal_above_static (2M: 64):    PASS
huge_below_4GB_normal_above_static (1024M: 64): PASS
map_high_truncate_2 (2M: 64):   PASS
map_high_truncate_2 (1024M: 64):        FAIL    ftruncate(): Invalid
argument
misaligned_offset (2M: 64):     PASS (inconclusive)
misaligned_offset (1024M: 64):  PASS (inconclusive)
truncate_above_4GB (2M: 64):    PASS
truncate_above_4GB (1024M: 64): PASS
brk_near_huge (2M: 64): brk_near_huge: malloc.c:2385: sysmalloc:
Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned
long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long)
old_end & (pagesize - 1)) == 0)' failed.
brk_near_huge (1024M: 64):      brk_near_huge: malloc.c:2385: sysmalloc:
Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned
long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long)
old_end & (pagesize - 1)) == 0)' failed.
task-size-overrun (2M: 64):     PASS
task-size-overrun (1024M: 64):  PASS
stack_grow_into_huge (2M: 64):   PASS
stack_grow_into_huge (1024M: 64): PASS
corrupt-by-cow-opt (2M: 64):    PASS
corrupt-by-cow-opt (1024M: 64): PASS
noresv-preserve-resv-page (2M: 64):     PASS
noresv-preserve-resv-page (1024M: 64):  PASS
noresv-regarded-as-resv (2M: 64):       PASS
noresv-regarded-as-resv (1024M: 64):    PASS
readahead_reserve.sh (2M: 64):  PASS
readahead_reserve.sh (1024M: 64):       PASS
madvise_reserve.sh (2M: 64):    PASS
madvise_reserve.sh (1024M: 64): PASS
fadvise_reserve.sh (2M: 64):    PASS
fadvise_reserve.sh (1024M: 64): PASS
mremap-expand-slice-collision.sh (2M: 64):      PASS
mremap-expand-slice-collision.sh (1024M: 64):   PASS
mremap-fixed-normal-near-huge.sh (2M: 64):      PASS
mremap-fixed-normal-near-huge.sh (1024M: 64):   PASS
mremap-fixed-huge-near-normal.sh (2M: 64):      PASS
mremap-fixed-huge-near-normal.sh (1024M: 64):   PASS
set shmmax limit to 67108864
shm-perms (2M: 64):     PASS
private (2M: 64):       PASS
private (1024M: 64):    PASS
fork-cow (2M: 64):      PASS
fork-cow (1024M: 64):   PASS
direct (2M: 64):        Bad configuration: Failed to open direct-IO
file: Invalid argument
direct (1024M: 64):     Bad configuration: Failed to open direct-IO
file: File exists
malloc (2M: 64):        PASS
malloc (1024M: 64):     PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc (2M: 64):
PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc (1024M: 64):
PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:none
HUGETLB_MORECORE=yes malloc (2M: 64):      PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:none
HUGETLB_MORECORE=yes malloc (1024M: 64):PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:malloc
HUGETLB_MORECORE=yes malloc (2M: 64):    PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:malloc
HUGETLB_MORECORE=yes malloc (1024M: 64): PASS
malloc_manysmall (2M: 64):      PASS
malloc_manysmall (1024M: 64):   PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc_manysmall (2M:
64):      PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc_manysmall (1024M:
64):   PASS
heapshrink (2M: 64):    PASS
heapshrink (1024M: 64): PASS
LD_PRELOAD=libheapshrink.so heapshrink (2M: 64):        PASS
LD_PRELOAD=libheapshrink.so heapshrink (1024M: 64):     PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes heapshrink (2M: 64):
PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes heapshrink (1024M: 64):
PASS
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE=yes
heapshrink (2M: 64):   PASS
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE=yes
heapshrink (1024M: 64):        PASS
LD_PRELOAD=libheapshrink.so HUGETLB_MORECORE_SHRINK=yes
HUGETLB_MORECORE=yes heapshrink (2M: 64):       PASS (inconclusive)
LD_PRELOAD=libheapshrink.so HUGETLB_MORECORE_SHRINK=yes
HUGETLB_MORECORE=yes heapshrink (1024M: 64):    PASS (inconclusive)
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE_SHRINK=yes
HUGETLB_MORECORE=yes heapshrink (2M: 64):       PASS
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE_SHRINK=yes
HUGETLB_MORECORE=yes heapshrink (1024M: 64):    FAIL    Heap did not
shrink
HUGETLB_VERBOSE=1 HUGETLB_MORECORE=yes heap-overflow (2M: 64): PASS
HUGETLB_VERBOSE=1 HUGETLB_MORECORE=yes heap-overflow (1024M: 64):
PASS
HUGETLB_VERBOSE=0 linkhuge_nofd (2M: 64):
HUGETLB_VERBOSE=0 linkhuge_nofd (1024M: 64):
LD_PRELOAD=libhugetlbfs.so HUGETLB_VERBOSE=0 linkhuge_nofd (2M: 64):
LD_PRELOAD=libhugetlbfs.so HUGETLB_VERBOSE=0 linkhuge_nofd (1024M: 64):
linkhuge (2M: 64):
linkhuge (1024M: 64):
LD_PRELOAD=libhugetlbfs.so linkhuge (2M: 64):
LD_PRELOAD=libhugetlbfs.so linkhuge (1024M: 64):
linkhuge_rw (2M: 64):
linkhuge_rw (1024M: 64):
HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):
HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):
HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):
HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):
HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):
HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):
HUGETLB_ELFMAP=no linkhuge_rw (2M: 64):
HUGETLB_ELFMAP=no linkhuge_rw (1024M: 64):
HUGETLB_ELFMAP=R HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):
HUGETLB_ELFMAP=R HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):
HUGETLB_ELFMAP=W HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):
HUGETLB_ELFMAP=W HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):
HUGETLB_ELFMAP=RW HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):
HUGETLB_ELFMAP=RW HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):
HUGETLB_SHARE=0 HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):
HUGETLB_SHARE=0 HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):
HUGETLB_SHARE=1 HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):
HUGETLB_SHARE=1 HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):
HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):
HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):
HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):
HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):
HUGETLB_SHARE=0 HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):
HUGETLB_SHARE=0 HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):
HUGETLB_SHARE=1 HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):
HUGETLB_SHARE=1 HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):
chunk-overcommit (2M: 64):      PASS
chunk-overcommit (1024M: 64):   PASS
alloc-instantiate-race shared (2M: 64): PASS
alloc-instantiate-race shared (1024M: 64):      PASS
alloc-instantiate-race private (2M: 64):        PASS
alloc-instantiate-race private (1024M: 64):     PASS
truncate_reserve_wraparound (2M: 64):   PASS
truncate_reserve_wraparound (1024M: 64):        PASS
truncate_sigbus_versus_oom (2M: 64):    PASS
truncate_sigbus_versus_oom (1024M: 64): PASS
get_huge_pages (2M: 64):        PASS
get_huge_pages (1024M: 64):     PASS
shmoverride_linked (2M: 64):    PASS
HUGETLB_SHM=yes shmoverride_linked (2M: 64):    PASS
shmoverride_linked_static (2M: 64):
HUGETLB_SHM=yes shmoverride_linked_static (2M: 64):
LD_PRELOAD=libhugetlbfs.so shmoverride_unlinked (2M: 64): PASS
LD_PRELOAD=libhugetlbfs.so HUGETLB_SHM=yes shmoverride_unlinked (2M:
64):       PASS
quota.sh (2M: 64):      PASS
quota.sh (1024M: 64):   PASS
counters.sh (2M: 64):   PASS
counters.sh (1024M: 64):        FAIL mmap failed: Invalid argument
mmap-gettest 10 35 (2M: 64):    PASS
mmap-gettest 10 35 (1024M: 64): FAIL    Failed to mmap the hugetlb file:
Cannot allocate memory
mmap-cow 34 35 (2M: 64):        PASS
mmap-cow 34 35 (1024M: 64):     FAIL    Thread 15 (pid=514) failed
set shmmax limit to 73400320
shm-fork 10 17 (2M: 64):        PASS
set shmmax limit to 73400320
shm-fork 10 35 (2M: 64):        PASS
set shmmax limit to 73400320
shm-getraw 35 /dev/full (2M: 64):       PASS
fallocate_stress.sh (2M: 64):   libgcc_s.so.1 must be installed for
pthread_cancel to work
fallocate_stress.sh (1024M: 64):
********** TEST SUMMARY
*                      2M             1024M
*                      32-bit 64-bit  32-bit 64-bit
*     Total testcases:     0     93       0     83
*             Skipped:     0      0       0      0
*                PASS:     0     69       0     56
*                FAIL:     0      0       0      5
*    Killed by signal:     0      1       0      2
*   Bad configuration:     0      1       0      1
*       Expected FAIL:     0      0       0      0
*     Unexpected PASS:     0      0       0      0
*    Test not present:     0     21       0     19
* Strange test result:     0      1       0      0
**********
Changes in v2:
   - Merge comment fix about task size.
   - Patch about mmap base address was merged separately.
   - Rebased on top of linux-next where series about the capability
     to free gigantic pages regardless of the configuration was merged.
   - Add huge pmd sharing as suggested by Mike Kravetz.
   - Gigantic page hstate is automatically created if CONTIG_ALLOC is
     set, even if not explicitly asked for in command line, as suggested
     by Mike.
   - Replace #ifdef CONFIG_64BIT into IS_ENABLED(CONFIG_64BIT), as suggested
     by Christoph Hellwig.

Alexandre Ghiti (2):
   x86, arm64: Move ARCH_WANT_HUGE_PMD_SHARE config in arch/Kconfig
   riscv: Introduce huge page support for 32/64bit kernel

  arch/Kconfig                     |  3 +++
  arch/arm64/Kconfig               |  2 +-
  arch/riscv/Kconfig               |  8 ++++++
  arch/riscv/include/asm/hugetlb.h | 18 +++++++++++++
  arch/riscv/include/asm/page.h    | 10 ++++++++
  arch/riscv/include/asm/pgtable.h |  8 ++++--
  arch/riscv/mm/Makefile           |  2 ++
  arch/riscv/mm/hugetlbpage.c      | 44 ++++++++++++++++++++++++++++++++
  arch/x86/Kconfig                 |  4 +--
  9 files changed, 93 insertions(+), 6 deletions(-)
  create mode 100644 arch/riscv/include/asm/hugetlb.h
  create mode 100644 arch/riscv/mm/hugetlbpage.c