Re: [RFC][PATCH] mm/page_isolation: tracing: trace all test_pages_isolated failures

From: George G. Davis
Date: Thu Sep 02 2021 - 18:21:59 EST


On Tue, Aug 31, 2021 at 04:53:31PM +0200, David Hildenbrand wrote:
> On 23.08.21 22:28, George G. Davis wrote:
> > From: "George G. Davis" <davis.george@xxxxxxxxxxx>
> >
> > Some test_pages_isolated failure conditions don't include trace points.
> > For debugging issues caused by "pinned" pages, make sure to trace all
> > calls whether they succeed or fail. In this case, a failure case did not
> > result in a trace point. So add the missing failure case in
> > test_pages_isolated traces.
>
> In which setups did you actually run into these cases?

Good question!

Although I'm not 100% certain that this specific failure condition has
occurred in my recent testing, I'm able to reproduce cma_alloc -EBUSY
failure conditions when testing recent master on the arm64-based
Renesas R-Car Starter Kit [1] using defconfig with
CONFIG_CMA_SIZE_MBYTES=384 while running the following test case:

trace-cmd record -N 192.168.1.87:12345 -b 4096 -e cma -e page_isolation -e compaction -e migrate &
sleep 10
while true; do a=$(( ( RANDOM % 10000 ) + 1 )); echo $a > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $a; echo $a > /sys/kernel/debug/cma/cma-reserved/free); done &
while true; do b=$(( ( RANDOM % 10000 ) + 1 )); echo $b > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $b; echo $b > /sys/kernel/debug/cma/cma-reserved/free); done &
while true; do c=$(( ( RANDOM % 10000 ) + 1 )); echo $c > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $c; echo $c > /sys/kernel/debug/cma/cma-reserved/free); done &
while true; do d=$(( ( RANDOM % 10000 ) + 1 )); echo $d > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $d; echo $d > /sys/kernel/debug/cma/cma-reserved/free); done &
while true; do e=$(( ( RANDOM % 10000 ) + 1 )); echo $e > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $e; echo $e > /sys/kernel/debug/cma/cma-reserved/free); done &
/selftests/vm/transhuge-stress &
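
For readability, the five identical alloc/free loops above can be written
once and started N times. This is only a sketch of the same pattern, not a
replacement for the commands actually used: CMA_DIR, WORKERS and the function
names are illustrative, and usleep is assumed to be available as in the
original loops.

```shell
#!/bin/bash
# Sketch: same alloc/free stress pattern as the five loops, written once.
# CMA_DIR and WORKERS are illustrative names, not part of the original test.
CMA_DIR=${CMA_DIR:-/sys/kernel/debug/cma/cma-reserved}
WORKERS=${WORKERS:-5}

# Pick a random count in 1..10000, as in the original loops.
random_count() {
    echo $(( (RANDOM % 10000) + 1 ))
}

# One worker: allocate n, hold it for n microseconds, then free n.
cma_worker() {
    local n
    while true; do
        n=$(random_count)
        echo "$n" > "$CMA_DIR/alloc" && { usleep "$n"; echo "$n" > "$CMA_DIR/free"; }
    done
}

# Start the workers only when the debugfs files are present and writable.
start_workers() {
    [ -w "$CMA_DIR/alloc" ] || { echo "no CMA debugfs at $CMA_DIR" >&2; return 1; }
    local i
    for i in $(seq "$WORKERS"); do
        cma_worker &
    done
}
```

Calling start_workers after the trace-cmd line would then stand in for the
five "while true" lines; the tracing and transhuge-stress commands stay as
they are.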

The cma_alloc -EBUSY failures are caused by THP compound pages allocated
from the CMA region, for which migration does not seem to work. The
workaround is to disable CONFIG_TRANSPARENT_HUGEPAGE, since it seems
incompatible with the intended use of the CMA region.
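
To confirm whether THP is in play on a running system before rebuilding, the
active mode can be read from sysfs. A minimal sketch, assuming the usual
sysfs location; thp_mode is an illustrative helper name, and this was not
part of the test run described above.

```shell
# Sketch: print the active THP mode (the bracketed word) from the sysfs file.
# thp_mode is an illustrative helper; the default path is the usual location.
thp_mode() {
    local f=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    # The file lists all modes and marks the active one in square brackets.
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$f"
}
```

The helper prints one of always, madvise or never. On a kernel built without
CONFIG_TRANSPARENT_HUGEPAGE the file is expected to be absent, so sed reports
an error instead.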

>
>
> --
> Thanks,
>
> David / dhildenb
>

--
Regards,
George

[1] https://elinux.org/R-Car/Boards/H3SK