Re: [regression -next0117] What is kcompactd and why is he eating 100% of my cpu?

From: Tibor Bana
Date: Tue Jan 26 2021 - 05:07:00 EST


Greetings!

I don't know if this is still relevant, but I am struggling with this problem right now and have been searching the internet for solutions.
I read the thread and saw that you were struggling to reproduce the problem. I can reproduce it almost every day:

- Install VMware Player and a Linux guest.
- Configure the virtual machine with a good amount of memory and CPU.
- Run resource-intensive tasks on the guest.
- When the host has used up almost all of its memory and starts to reuse caches, kcompactd kicks in.
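
To see whether compaction activity really climbs at that point, the compact_* counters in /proc/vmstat could be sampled before and after the stall. This is only a rough sketch: the helper name parse_compaction_counters is made up for illustration, and which counters exist (for example compact_stall or compact_daemon_wake) depends on the kernel version.

```python
from pathlib import Path


def parse_compaction_counters(vmstat_text):
    """Return the compact_* counters from /proc/vmstat-style text as a dict.

    Hypothetical helper for illustration; it only assumes the usual
    "name value" layout, one counter per line.
    """
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("compact_"):
            counters[parts[0]] = int(parts[1])
    return counters


if __name__ == "__main__":
    vmstat = Path("/proc/vmstat")
    if vmstat.exists():
        counters = parse_compaction_counters(vmstat.read_text())
        for name, value in sorted(counters.items()):
            print(f"{name} {value}")
```

Running it twice a few seconds apart and comparing the numbers should show which counters are moving while kcompactd is busy.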

As far as I know, the problem is related to transparent huge pages, but I have already tried disabling them.
Today I saw the problem again, and kcompactd showed an interesting status in top: it had not used any memory (all zeroes), but it used up one core completely.
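
To double-check that transparent huge pages are really off, the sysfs files can be read directly; the active choice is the one shown in square brackets (for example "always madvise [never]"). A minimal sketch, assuming the usual /sys/kernel/mm/transparent_hugepage layout; the helper name active_option is made up for illustration.

```python
import re
from pathlib import Path

# Standard sysfs location of the THP settings.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_option(text):
    """Return the bracketed (active) choice, or the stripped text if none."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else text.strip()


if __name__ == "__main__":
    for name in ("enabled", "defrag"):
        setting = THP_DIR / name
        if setting.exists():
            print(name, active_option(setting.read_text()))
```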

My machine is a Core i7 with 4 physical cores and Hyper-Threading, and 24 GB of memory:
5.9.11-arch2-1 #1 SMP PREEMPT Sat, 28 Nov 2020 02:07:22 +0000 x86_64 GNU/Linux

I hope this helps to pinpoint the problem.

Tibor Bana

On Wed, 30 Jan 2019 10:40:20 +0000
Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, Jan 29, 2019 at 11:29:37PM -0500, valdis.kletnieks@xxxxxx wrote:
> > On Tue, 29 Jan 2019 20:06:39 -0500, valdis.kletnieks@xxxxxx said:
> > > On Mon, 28 Jan 2019 10:16:27 +0100, Jan Kara said:
> > >
> > > > So my buffer_migrate_page_norefs() is certainly buggy in its current
> > > > incarnation (as a result block device page cache is not migratable at all).
> > > > I've sent Andrew a patch over week ago but so far it got ignored. The patch
> > > > is attached, can you give it a try whether it changes something for you?
> > > > Thanks!
> > >
> > > Been running with the patch for about 24 hours, haven't seen kcompactd
> > > misbehave. I even fired up a Chrome with a lot of tabs open, a Firefox, and a
> > > kernel build, intentionally drove the system into swapping, and kcompactd
> > > didn't make it into the top 10 on 'top'.
> > >
> > > I'm willing to say put a "tested-by:" on that one, it looks fixed from here.
> > > If there's any remaining bugs, they're ones I can't seem to trigger...
> >
> > Spoke too soon. Sitting here not stressing the laptop at all, plenty of free
> > memory, and ka-blam.
> >
> > Will keep my eyes open and do the data gathering Mel Gorman wanted - I discovered
> > too late that trace-cmd wasn't installed, and things broke free by themselves (probably
> > not coincidence that I launched a terminal window and then it cleared....)
> >
>
> That's unfortunate. I also note that linux-next still has not been
> updated with the latest version of the compaction series. Nevertheless,
> it might be helpful to get the output of
>
> grep -r . /sys/kernel/mm/transparent_hugepage/*
>
> and the trace when the system is in normal use but kcompactd has not
> pegged at 100%. At minimum, I'd like to see what the sources of high-order
> allocations are and the likely causes of wakeups of kcompactd in case
> there are any hints there. Your Kconfig is also potentially useful.
>
> Thanks.
>
> --
> Mel Gorman
> SUSE Labs


--
Tibor Bana <bana.tibor@xxxxxxxxx>