Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

From: Ying Han
Date: Fri Aug 27 2010 - 12:36:05 EST


On Thu, Aug 26, 2010 at 10:00 PM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
>
> On Fri, Aug 27, 2010 at 12:31 PM, Ying Han <yinghan@xxxxxxxxxx> wrote:
> > On Thu, Aug 26, 2010 at 6:03 PM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> >>
> >> Hello.
> >>
> >> On Fri, Aug 27, 2010 at 9:11 AM, Ying Han <yinghan@xxxxxxxxxx> wrote:
> >> > Fix a missed place where checks nr_swap_pages to do shrink_active_list. Make the
> >> > change that moves the check to common function inactive_anon_is_low.
> >> >
> >>
> >> Hmm.. AFAIR, we discussed it at that time but we concluded it's not good.
> >> That's because nr_swap_pages < 0 means both "NO SWAP" and "NOT enough
> >> swap space now". If we have a swap device or file but not enough space
> >> now, we need to aging anon pages to make inactive list enough size.
> >> Otherwise, working set pages would be swapped out more fast before
> >> promotion.
> >
> > We found the problem on one of our workloads where more TLB flush
> > happens without the change. Kswapd seems to be calling
> > shrink_active_list() which eventually clears access bit of those ptes
> > and does TLB flush
> > with ptep_clear_flush_young(). This system does not have swap
> > configured, and why aging the anon lru in that
> > case?
>
> True. I also wanted it but we have to care swap configured but
> non-enabling still yet system as well as non-swap configured system at
> that time.

Agree.  In our case, we cares about the case where swap is not enabled
but is configured .
>
> If your system is no swap configured, how about this?
> (It's a not formal proper patch but just quick patch to show the concept).

In our system, we do have swap configured. In vmscan.c, there are
couple of places where we skip scanning
and shrinking anon lru while the condition if(nr_swap_pages <= 0)  is
true. It still make sense to me to add it
to the shrink_active() condition as the initial patch.

Also, we found it is quite often to hit the condition
inactive_anon_is_low on machine with small numa node size, since the
zone->inactive_ratio is set based on the zone->present_pages.

--Ying

>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 3109ff7..641c6a6 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1580,6 +1580,11 @@ static void shrink_active_list(unsigned long
> nr_pages, struct zone *zone,
>        spin_unlock_irq(&zone->lru_lock);
>  }
>
> +/*
> + * If system doesn't have a swap configuration,
> + * it doesn't need to age anon pages in kswapd.
> + */
> +#ifdef CONFIG_SWAP
>  static int inactive_anon_is_low_global(struct zone *zone)
>  {
>        unsigned long active, inactive;
> @@ -1611,6 +1616,12 @@ static int inactive_anon_is_low(struct zone
> *zone, struct scan_control *sc)
>                low = mem_cgroup_inactive_anon_is_low(sc->mem_cgroup);
>        return low;
>  }
> +#else
> +static inline int inactive_anon_is_low(struct zone *zone, struct
> scan_control *sc)
> +{
> +       return 0;
> +}
> +#endif
>
>  static int inactive_file_is_low_global(struct zone *zone)
>  {
>
>
> --
> Kind regards,
> Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/