Re: [PATCH] ARM: sparsemem: allow pfn_valid to be overridden whenusing SPARSEMEM

From: Will Deacon
Date: Thu May 19 2011 - 04:55:30 EST


Hi Mel,

Thanks for looking at this.

On Wed, 2011-05-18 at 17:59 +0100, Mel Gorman wrote:
> On Wed, May 18, 2011 at 05:03:59PM +0100, Will Deacon wrote:
> > In commit eb33575c ("[ARM] Double check memmap is actually valid with a
> > memmap has unexpected holes V2"), a new function, memmap_valid_within,
> > was introduced to mmzone.h so that holes in the memmap which pass
> > pfn_valid in SPARSEMEM configurations can be detected and avoided.
> >
> > The fix to this problem checks that the pfn <-> page linkages are
> > correct by calculating the page for the pfn and then checking that
> > page_to_pfn on that page returns the original pfn. Unfortunately, in
> > SPARSEMEM configurations, this results in reading from the page flags to
> > determine the correct section. Since the memmap here has been freed,
> > junk is read from memory and the check is no longer robust.
> >
> > In the best case, reading from /proc/pagetypeinfo will give you the
> > wrong answer. In the worst case, you get SEGVs, Kernel OOPses and hung
> > CPUs.
> >
> > This patch allows architectures to provide their own pfn_valid function
> > instead of using the default implementation used by sparsemem. The
> > architecture-specific version is aware of the memmap state and will
> > return false when passed a pfn for a freed page within a valid section.
> >
> > Cc: Russell King <linux@xxxxxxxxxxxxxxxx>
> > Cc: Mel Gorman <mgorman@xxxxxxx>
> > Acked-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Signed-off-by: Will Deacon <will.deacon@xxxxxxx>
>
> I don't have an ARM machine to test on and I'm not particularly
> sensitive to the requirements of ARM so I'm not the best reviewer. If
> this passes tests, I see little problem with it other than the
> architecture-specific pfn_valid is slower than the sparsemem equivalent
> and the cache footprint is probably higher as memblock_is_memory
> is searching a list of blocks.

Yes, it is slower than just checking to see if the sparsemem section is
valid but that is the price you pay for partially populated sections. At
the end of the day, we're just falling back to the pfn_valid definition
that is used when !CONFIG_SPARSEMEM.

> If this problem is exclusive to
> reading /proc/pagetypeinfo, you might want to consider only using
> memblock_is_memory in that case. Otherwise, functionally it looks like
> it should work.

I initially thought it was exclusive to that operation, but it turns out
the problem is more far-reaching as pfn_valid is used by things like the
ioremap code to ensure that we don't remap normal memory.

> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index e56f835..72225dd 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -1053,12 +1053,14 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
> > return __nr_to_section(pfn_to_section_nr(pfn));
> > }
> >
> > +#ifndef CONFIG_ARCH_PROVIDES_PFN_VALID
> > static inline int pfn_valid(unsigned long pfn)
> > {
> > if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> > return 0;
> > return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
> > }
> > +#endif
> >
> > static inline int pfn_present(unsigned long pfn)
> > {

Can I add your Ack for the changes to mmzone.h please?

Thanks,

Will

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/