Re: [Bug #10961] 2.6.26-rc: nfsd hangs for a few sec

From: Mel Gorman
Date: Thu Jul 03 2008 - 13:39:21 EST


On (01/07/08 18:04), Alexander Beregalov didst pronounce:
> 2008/7/1 Mel Gorman <mel@xxxxxxxxx>:
> > I still have no useful reaction to this. According to Christoph Hellwig,
> > this lockup has been appearing since lockdep was introduced but for some
> > reason is easier to trigger now. It bisected to the two-zonelist changes
> > but it still looks like a red herring as I cannot see how reclaim has
> > changed significantly as a result of that patch.
>
> Do you wait reaction from me? Can I help?
> As I mentioned, the lockup does not happen when lockdep is disabled.
>

Sorry for the slow response, Alexander.

This bug is likely fixed by commit 494de90098784b8e2797598cefdd34188884ec2e,
which will become publicly visible once maintenance on master.kernel.org
finishes. I have included it below for convenience.

The lockdep warning still exists but it is a false positive and should be
relatively hard to trigger again. It would be nice to have confirmation
of this.

commit 494de90098784b8e2797598cefdd34188884ec2e
Author: Mel Gorman <mel@xxxxxxxxx>
Date: Thu Jul 3 05:27:51 2008 +0100

Do not overwrite nr_zones on !NUMA when initialising zlcache_ptr

The non-NUMA case of build_zonelist_cache() would initialize the
zlcache_ptr for both node_zonelists[] to NULL.

This is problematic because non-NUMA has only a single node_zonelists[]
entry, so trying to clear the non-existent second one overwrote the
nr_zones field instead.

As kswapd uses this value to determine what reclaim work is necessary,
the result is that kswapd never reclaims. This causes processes to
stall frequently in low-memory situations as they always direct reclaim.
This patch initialises zlcache_ptr correctly.

Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
Tested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
[ Simplified patch a bit ]
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
mm/page_alloc.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2f55295..f32fae3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2328,7 +2328,6 @@ static void build_zonelists(pg_data_t *pgdat)
 static void build_zonelist_cache(pg_data_t *pgdat)
 {
 	pgdat->node_zonelists[0].zlcache_ptr = NULL;
-	pgdat->node_zonelists[1].zlcache_ptr = NULL;
 }

#endif /* CONFIG_NUMA */
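
For anyone wondering why the stray store hit nr_zones in particular, here is
a minimal userspace sketch of the layout. It is not the kernel code: the
struct names, the zonerefs placeholder and both helper functions are made up
for illustration, and it assumes that zlcache_ptr is the first member of the
zonelist and that nr_zones directly follows node_zonelists[] in the node
structure, which is what the overwrite described above implies. The offsets
are computed arithmetically, so nothing is actually written out of bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: !NUMA builds have exactly one zonelist per node. */
#define MAX_ZONELISTS 1

/* Hypothetical, heavily simplified stand-in for struct zonelist. */
struct zonelist_sketch {
	void *zlcache_ptr;	/* assumed to be the first member */
	int zonerefs[4];	/* placeholder for the rest of the zonelist */
};

/* Hypothetical, heavily simplified stand-in for pg_data_t. */
struct pgdat_sketch {
	struct zonelist_sketch node_zonelists[MAX_ZONELISTS];
	int nr_zones;		/* assumed to follow the array directly */
};

/*
 * Byte offset that node_zonelists[idx].zlcache_ptr would have inside the
 * node structure. Pure arithmetic: no memory is read or written.
 */
size_t zlcache_ptr_offset(size_t idx)
{
	return offsetof(struct pgdat_sketch, node_zonelists)
		+ idx * sizeof(struct zonelist_sketch)
		+ offsetof(struct zonelist_sketch, zlcache_ptr);
}

/* Non-zero if a store to node_zonelists[idx].zlcache_ptr starts on nr_zones. */
int store_lands_on_nr_zones(size_t idx)
{
	return zlcache_ptr_offset(idx) == offsetof(struct pgdat_sketch, nr_zones);
}
```

With this layout, store_lands_on_nr_zones(1) is true and
store_lands_on_nr_zones(0) is false, so writing NULL through index 1 would
zero nr_zones while index 0 is the only valid entry.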
