Re: [PATCH 02/34] mm, vmscan: move lru_lock to the node

From: Balbir Singh
Date: Wed Jul 13 2016 - 01:50:31 EST


On Tue, Jul 12, 2016 at 12:18:05PM +0100, Mel Gorman wrote:
> On Tue, Jul 12, 2016 at 09:06:04PM +1000, Balbir Singh wrote:
> > > diff --git a/Documentation/cgroup-v1/memory.txt b/Documentation/cgroup-v1/memory.txt
> > > index b14abf217239..946e69103cdd 100644
> > > --- a/Documentation/cgroup-v1/memory.txt
> > > +++ b/Documentation/cgroup-v1/memory.txt
> > > @@ -267,11 +267,11 @@ When oom event notifier is registered, event will be delivered.
> > > Other lock order is following:
> > > PG_locked.
> > > mm->page_table_lock
> > > - zone->lru_lock
> > > + zone_lru_lock
> >
> > zone_lru_lock is a little confusing, can't we just call it
> > node_lru_lock?
> >
>
> It's a matter of perspective. People familiar with the VM already expect
> a zone lock so will be looking for it. I can do a rename if you insist
> but it may not actually help.

I don't want to insist, but zone_ in the name can be confusing, as to
leading us to think that the lru_lock is still in the zone

If the rest of the reviewers are fine with, we don't need to rename

>
> > > @@ -496,7 +496,6 @@ struct zone {
> > > /* Write-intensive fields used by page reclaim */
> > >
> > > /* Fields commonly accessed by the page reclaim scanner */
> > > - spinlock_t lru_lock;
> > > struct lruvec lruvec;
> > >
> > > /*
> > > @@ -690,6 +689,9 @@ typedef struct pglist_data {
> > > /* Number of pages migrated during the rate limiting time interval */
> > > unsigned long numabalancing_migrate_nr_pages;
> > > #endif
> > > + /* Write-intensive fields used by page reclaim */
> > > + ZONE_PADDING(_pad1_)a
> >
> > I thought this was to have zone->lock and zone->lru_lock in different
> > cachelines, do we still need the padding here?
> >
>
> The zone padding current keeps the page lock wait tables, page allocator
> lists, compaction and vmstats on separate cache lines. They're still
> fine.
>
> The node padding may not be necessary. It currently ensures that zonelists
> and numa balancing are separate from the LRU lock but there is no guarantee
> the current arrangement is optimal. It would depend on both the kernel
> config and the workload but it may be necessary in the future to split
> node into read-mostly sections and then different write-intensive sections
> similar to what has happened to struct zone in the past.
>

Fair enough

Balbir Singh.