Re: [regression -next0117] What is kcompactd and why is he eating 100% of my cpu?

From: Jan Kara
Date: Mon Jan 28 2019 - 04:16:33 EST


On Sun 27-01-19 16:36:34, valdis.kletnieks@xxxxxx wrote:
> On Sun, 27 Jan 2019 17:00:27 +0100, Pavel Machek said:
> > > > I've noticed this as well on earlier kernels (next-20181224 to 20190115)
> > > > Some more info:
> > > > 1) echo 3 > /proc/sys/vm/drop_caches unwedges kcompactd in 1-3 seconds.
> > > This aspect is curious as it indicates that kcompactd could potentially
> > > be infinite looping but it's not something I've experienced myself. By
> > > any chance is there a predictable reproduction case for this?
> >
> > I've seen it exactly once, so not sure how reproducible this is. x86-32
> > machine, running chromium browser, so yes, there was some swapping
> > involved.
>
> I don't have a surefire replicator, but my laptop (x86_64, so it's not a 32-bit
> only issue) triggers it fairly often, up to multiple times a day. Doesn't seem to
> be just the Chrome browser that triggers it - usually I'm doing other stuff as
> well, like a compile or similar. The fact that 'drop_caches' clears it makes me
> wonder if we're hitting a corner case where cache data isn't being automatically
> cleared and clogging something up.

So my buffer_migrate_page_norefs() is certainly buggy in its current
incarnation (as a result, block device page cache is not migratable at all).
I sent Andrew a patch over a week ago, but so far it has been ignored. The
patch is attached; can you try it and see whether it changes anything for you?
Thanks!

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR