Re: early fixmap causes kmap breakage

From: Nick Piggin
Date: Tue Dec 30 2008 - 03:14:26 EST


On Tue, Dec 30, 2008 at 08:54:28AM +0100, Nick Piggin wrote:
> On Tue, Dec 30, 2008 at 07:13:44AM +0100, Ingo Molnar wrote:
> >
> > * Nick Piggin <npiggin@xxxxxxx> wrote:
> >
> > > On Mon, Dec 29, 2008 at 03:17:31PM -0800, Andrew Morton wrote:
> > > > On Thu, 18 Dec 2008 22:15:43 +0100
> > > > Nick Piggin <npiggin@xxxxxxx> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I've debugged a problem where i386+pae systems with more than a few CPUs
> > > > > blow up at boot in the kmap_atomic code.
> > > >
> > > > ping?
> > >
> > > No further progress here, I'm waiting on input for how to fix this
> > > "nicely". Meantime, clearing the early fixmap pte I guess works, but you
> > > lose a page... is it possible to put it into .initdata or is there some
> > > issue with that? (I guess on a PAE kernel, 4K isn't a big deal).
> >
> > yeah, 4K shouldn't be a big deal. Mind sending a patch for this?
>
> I don't know if we should put a warning when we encounter this condition,
> to help catch other unintended uses...
>
> But this at least clears out the early fixmap pte and provides a contiguous
> allocation...
>
> ---
> arch/x86/mm/init_32.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/arch/x86/mm/init_32.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mm/init_32.c
> +++ linux-2.6/arch/x86/mm/init_32.c
> @@ -155,6 +155,7 @@ page_table_range_init(unsigned long star
> unsigned long vaddr;
> pgd_t *pgd;
> pmd_t *pmd;
> + pte_t *lastpte = NULL;
>
> vaddr = start;
> pgd_idx = pgd_index(vaddr);
> @@ -166,7 +167,19 @@ page_table_range_init(unsigned long star
> pmd = pmd + pmd_index(vaddr);
> for (; (pmd_idx < PTRS_PER_PMD) && (vaddr != end);
> pmd++, pmd_idx++) {
> - one_page_table_init(pmd);
> + pte_t *pte;
> +
> + pte = one_page_table_init(pmd);
> + if (lastpte && lastpte + PTRS_PER_PTE != pte) {
> + pte_t *newpte;
> +
> + pmd_clear(pmd);

Hmm, no, actually we do need a TLB flush here, after the pmd_clear().

> + newpte = one_page_table_init(pmd);
> + BUG_ON(lastpte + PTRS_PER_PTE != newpte);
> + memcpy(newpte, pte, PAGE_SIZE);

And here we probably need a loop doing set_pte, just in case the
CPU speculatively loads the TLB from a partially copied table...

> + pte = newpte;
> + }
> + lastpte = pte;
>
> vaddr += PMD_SIZE;
> }
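
To make the two remarks above a bit more concrete, here is a rough
userspace model of the contiguity check and an entry-by-entry copy. It is
only a sketch and not kernel code: model_set_pte(), pte_tables_contiguous()
and relocate_pte_table() are made-up names, pte_t is just a 64-bit integer
standing in for a PAE pte, and assert() stands in for BUG_ON(). The TLB
flush after pmd_clear() cannot be modelled here and is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_PTE 512	/* entries per page table with PAE */

typedef uint64_t pte_t;		/* stand-in: PAE ptes are 64 bits wide */

/* Hypothetical stand-in for set_pte(): one store per entry. */
static void model_set_pte(volatile pte_t *dst, pte_t val)
{
	*dst = val;
}

/* Does 'next' start exactly one page table after 'last'? */
static int pte_tables_contiguous(const pte_t *last, const pte_t *next)
{
	return last == NULL || last + PTRS_PER_PTE == next;
}

/*
 * Move the entries of a misplaced table 'old' into 'newpte', which must
 * directly follow 'last'. Copies one entry at a time instead of using
 * memcpy(). Returns the table that is now in use.
 */
static pte_t *relocate_pte_table(const pte_t *last, const pte_t *old,
				 pte_t *newpte)
{
	size_t i;

	assert(pte_tables_contiguous(last, newpte));	/* BUG_ON() in the patch */
	for (i = 0; i < PTRS_PER_PTE; i++)
		model_set_pte(&newpte[i], old[i]);
	return newpte;
}
```

In the real function the returned table would then become lastpte for the
next iteration of the pmd loop.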