Re: [PATCH] x86_64: Make NR_IRQS configurable in Kconfig

From: Dave Hansen
Date: Wed Aug 09 2006 - 14:23:44 EST

On Wed, 2006-08-09 at 10:58 -0700, Luck, Tony wrote:
> On Tue, Aug 08, 2006 at 10:17:53AM +0200, Martin Schwidefsky wrote:
> > "vmalloc reserve first; allocate pages later" would be a really nice
> > feature. We could use this on s390 to implement the virtual mem_map
> > array spanning the whole 64 bit address range (with holes in it). To
> > make it perfect a "deallocate pages; keep vmalloc reserve" should be
> > added, then we could free parts of the mem_map array again on hot memory
> > remove.


We can already do this partial freeing today with sparsemem and memory
hot-remove. It would be a shame to go have to do another implementation
for each an every architecture that wants to do it.

For the very sparse 64-bit address spaces, I would be really interested
to see an alternate pfn_to_section_nr() that relies on something other
than a direct correlation between physical address and section number.

Instead of:

#define pfn_to_section_nr(pfn) ((pfn) >> PFN_SECTION_SHIFT)

We could do:

static inline unsigned long pfn_to_section_nr(unsigned long pfn)
return some_hash(pfn) % NR_OF_SECTION_SLOTS;

This would, of course, still have limits on how _many_ sections can be
populated. But, it would remove the relationship on what the actual
physical address ranges can be from the number of populated sections.

Of course, it isn't quite that simple. You need to make sure that the
sparse code is clean from all connections between section number and
physical address, as well as handling things like hash collisions. We'd
probably also need to store the _actual_ physical address somewhere
because we can't get it from the section number any more.

But, Andy and I have talked about this kind of thing from the beginning
of sparsemem, so I hope the code is amenable to change like this.

-- Dave

P.S. With sparsemem extreme, I think you can cover an entire 64-bits of
address space with a 4GB top-level table. If one more level of tables
was added, we'd be down to (I think) an 8MB table. So, that might be an
option, too.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at