Re: [PATCH RFC 04/14] mm: vmstat: convert slab vmstat counter to bytes
From: Roman Gushchin
Date: Mon Sep 16 2019 - 22:09:36 EST
On Mon, Sep 16, 2019 at 02:38:40PM +0200, Johannes Weiner wrote:
> On Thu, Sep 05, 2019 at 02:45:48PM -0700, Roman Gushchin wrote:
> > In order to prepare for per-object slab memory accounting,
> > convert NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat
> > items to bytes.
> >
> > To make sure that these vmstats are in bytes, rename them
> > to NR_SLAB_RECLAIMABLE_B and NR_SLAB_UNRECLAIMABLE_B (similar to
> > NR_KERNEL_STACK_KB).
> >
> > The size of slab memory shouldn't exceed 4Gb on 32-bit machines,
> > so it will fit into atomic_long_t we use for vmstats.
> >
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
Hello Johannes!
Thank you for looking into the patchset!
>
> Maybe a crazy idea, but instead of mixing bytes and pages, would it be
> difficult to account all vmstat items in bytes internally? And provide
> two general apis, byte and page based, to update and query the counts,
> instead of tying the unit it to individual items?
>
> The vmstat_item_in_bytes() conditional shifting is pretty awkward in
> code that has a recent history littered with subtle breakages.
>
> The translation helper node_page_state_pages() will yield garbage if
> used with the page-based counters, which is another easy to misuse
> interface.
>
> We already have many places that multiply with PAGE_SIZE to get the
> stats in bytes or kb units.
>
> And _B/_KB suffixes are kinda clunky.
>
> The stats use atomic_long_t, so switching to atomic64_t doesn't make a
> difference on 64-bit and is backward compatible with 32-bit.
I fully agree here, that having different stats in different units
adds a lot of mess to the code. But I always thought that 64-bit
atomics are slow on a 32-bit machine, so it might be a noticeable
performance regression. Don't you think so?
I'm happy to prepare such a patch(set), only I'd prefer to keep it
separately from this one. It can precede or follow the slab controller
rework, either way will work. Slab controller rework is already not so
small, so adding more code (and potential issues) here will only make
the review more complex.
>
> The per-cpu batch size you have to raise from s8 either way.
Yeah, tbh I don't know why those are just not unsigned long by default.
Space savings are miserable here, and I don't see any other reasons.
It could be even slightly faster to use a larger type.
I kinda tried to keep the patchset as small as possible (at least for
the RFC version), so tried to avoid any non-necessary changes.
But overall using s8 or s16 here doesn't make much sense to me.
>
> It seems to me that would make the code and API a lot simpler and
> easier to use / harder to misuse.