Re: [RFC PATCH 1/1] vmscan: Support multiple kswapd threads per node

From: Michal Hocko
Date: Fri Oct 02 2020 - 10:29:37 EST


On Fri 02-10-20 09:53:05, Rik van Riel wrote:
> On Fri, 2020-10-02 at 09:03 +0200, Michal Hocko wrote:
> > On Thu 01-10-20 18:18:10, Sebastiaan Meijer wrote:
> > > (Apologies for messing up the mailing list thread, Gmail had fooled
> > > me into
> > > believing that it properly picked up the thread)
> > >
> > > On Thu, 1 Oct 2020 at 14:30, Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > On Wed 30-09-20 21:27:12, Sebastiaan Meijer wrote:
> > > > > > yes it shows the bottleneck but it is quite artificial. Read
> > > > > > data is
> > > > > > usually processed and/or written back and that changes the
> > > > > > picture a
> > > > > > lot.
> > > > > Apologies for reviving an ancient thread (and apologies in
> > > > > advance for my lack
> > > > > of knowledge on how mailing lists work), but I'd like to offer
> > > > > up another
> > > > > reason why merging this might be a good idea.
> > > > >
> > > > > From what I understand, zswap runs its compression on the same
> > > > > kswapd thread,
> > > > > limiting it to a single thread for compression. Given enough
> > > > > processing power,
> > > > > zswap can get great throughput using heavier compression
> > > > > algorithms like zstd,
> > > > > but this is currently greatly limited by the lack of threading.
> > > >
> > > > Isn't this a problem of the zswap implementation rather than
> > > > general
> > > > kswapd reclaim? Why zswap doesn't do the same as normal swap out
> > > > in a
> > > > context outside of the reclaim?
>
> On systems with lots of very fast IO devices, we have
> also seen kswapd take 100% CPU time without any zswap
> in use.

Do you have more details? Does the saturated kswapd lead to pre-mature
direct reclaim? What is the saturated number of reclaimed pages per unit
of time? Have you tried to play with this to see whether an additional
worker would help?

--
Michal Hocko
SUSE Labs