RE: [PATCH] mm: sparse: pass section_nr to section_mark_present

From: 권오훈
Date: Thu Jul 01 2021 - 11:41:56 EST


On Thu, Jul 01, 2021 at 04:34:13PM +0200, David Hildenbrand wrote:
> On 01.07.21 15:55, 권오훈 wrote:
> > With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
> > mem_section to section_nr could be costly since it iterates all
> > sections to check if the given mem_section is in its range.
>
> It actually iterates all section roots.
>
> >
> > On the other hand, __nr_to_section which converts section_nr to
> > mem_section can be done in O(1).
> >
> > Let's pass section_nr instead of mem_section ptr to section_mark_present
> > in order to reduce needless iterations.
>
> I'd expect this to be mostly noise, especially as we iterate section
> roots and for most (smallish) machines we might just work on the lowest
> section roots only.
>
> Can you actually observe an improvement regarding boot times?
>
> Anyhow, looks straight forward to me, although we might just reintroduce
> similar patterns again easily if it's really just noise (see
> find_memory_block() as used by). And it might allow for a nice cleanup
> (see below).
>
> Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
>
>
> Can you send 1) a patch to convert find_memory_block() as well and 2) a
> patch to rip out __section_nr() completely?
>
> >
> > Signed-off-by: Ohhoon Kwon <ohoono.kwon@xxxxxxxxxxx>
> > ---
> > mm/sparse.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index 55c18aff3e42..4a2700e9a65f 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
> > * those loops early.
> > */
> > unsigned long __highest_present_section_nr;
> > -static void section_mark_present(struct mem_section *ms)
> > +static void section_mark_present(unsigned long section_nr)
> > {
> > - unsigned long section_nr = __section_nr(ms);
> > + struct mem_section *ms;
> >
> > if (section_nr > __highest_present_section_nr)
> > __highest_present_section_nr = section_nr;
> >
> > + ms = __nr_to_section(section_nr);
> > ms->section_mem_map |= SECTION_MARKED_PRESENT;
> > }
> >
> > @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en
> > if (!ms->section_mem_map) {
> > ms->section_mem_map = sparse_encode_early_nid(nid) |
> > SECTION_IS_ONLINE;
> > - section_mark_present(ms);
> > + section_mark_present(section);
> > }
> > }
> > }
> > @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
> >
> > ms = __nr_to_section(section_nr);
> > set_section_nid(section_nr, nid);
> > - section_mark_present(ms);
> > + section_mark_present(section_nr);
> >
> > /* Align memmap to section boundary in the subsection case */
> > if (section_nr_to_pfn(section_nr) != start_pfn)
> >
>
>
> --
> Thanks,
>
> David / dhildenb
>
Dear David.

I tried to check on time for memblocks_present, but when I tested with mobile
phones with 8GB ram, the original binary took 0us either as well as the
patched binary.
I'm not sure how the results would differ on huge systems with bigger ram.
I agree that it could turn out to be just a noise, as you expected.

However as you also mentioned, the patches will be straight forward when all
codes using __section_nr() are cleaned up nicely.

Below are the two patches that you asked for.
Please tell me if you need me to send the patches in separate e-mails.

Thanks,
Ohhoon Kwon.

> Can you send 1) a patch to convert find_memory_block() as well and 2) a
> patch to rip out __section_nr() completely?


-----------------------------------------------------------------------------------------


[PATCH] mm: sparse: pass section_nr to find_memory_block

With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
mem_section to section_nr could be costly since it iterates all
sections to check if the given mem_section is in its range.

On the other hand, __nr_to_section which converts section_nr to
mem_section can be done in O(1).

Let's pass section_nr instead of mem_section ptr to find_memory_block
in order to reduce needless iterations.

Signed-off-by: Ohhoon Kwon <ohoono.kwon@xxxxxxxxxxx>
---
arch/powerpc/platforms/pseries/hotplug-memory.c | 4 +---
drivers/base/memory.c | 4 ++--
include/linux/memory.h | 2 +-
3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 8377f1f7c78e..905790092e0e 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -211,13 +211,11 @@ static int update_lmb_associativity_index(struct drmem_lmb *lmb)
static struct memory_block *lmb_to_memblock(struct drmem_lmb *lmb)
{
unsigned long section_nr;
- struct mem_section *mem_sect;
struct memory_block *mem_block;

section_nr = pfn_to_section_nr(PFN_DOWN(lmb->base_addr));
- mem_sect = __nr_to_section(section_nr);

- mem_block = find_memory_block(mem_sect);
+ mem_block = find_memory_block(section_nr);
return mem_block;
}

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index d5ffaab3cb61..e31598955cc4 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -578,9 +578,9 @@ static struct memory_block *find_memory_block_by_id(unsigned long block_id)
/*
* Called under device_hotplug_lock.
*/
-struct memory_block *find_memory_block(struct mem_section *section)
+struct memory_block *find_memory_block(unsigned long section_nr)
{
- unsigned long block_id = memory_block_id(__section_nr(section));
+ unsigned long block_id = memory_block_id(section_nr);

return find_memory_block_by_id(block_id);
}
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 97e92e8b556a..d9a0b61cd432 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -90,7 +90,7 @@ int create_memory_block_devices(unsigned long start, unsigned long size,
void remove_memory_block_devices(unsigned long start, unsigned long size);
extern void memory_dev_init(void);
extern int memory_notify(unsigned long val, void *v);
-extern struct memory_block *find_memory_block(struct mem_section *);
+extern struct memory_block *find_memory_block(unsigned long section_nr);
typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *);
extern int walk_memory_blocks(unsigned long start, unsigned long size,
void *arg, walk_memory_blocks_func_t func);
--
2.17.1


-----------------------------------------------------------------------------------------


[PATCH] mm: sparse: remove __section_nr() function

__section_nr() was used to convert struct mem_section * to section_nr.

With CONFIG_SPARSEMEM_EXTREME enabled, however, __section_nr() can be
costly since it iterates all section roots to check if the given
mem_section is in its range.

On the other hand, __nr_to_section which converts section_nr to
mem_section can be done in O(1).

The only users of __section_nr() was section_mark_present and
find_memory_block. Now since I changed both functions to use section_nr
directly, let's remove __section_nr() since it has no users.

Signed-off-by: Ohhoon Kwon <ohoono.kwon@xxxxxxxxxxx>
---
include/linux/mmzone.h | 1 -
mm/sparse.c | 26 --------------------------
2 files changed, 27 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0d53eba1c383..8931f95cf885 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1321,7 +1321,6 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
return NULL;
return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
}
-extern unsigned long __section_nr(struct mem_section *ms);
extern size_t mem_section_usage_size(void);

/*
diff --git a/mm/sparse.c b/mm/sparse.c
index 4a2700e9a65f..1b32d15593e4 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -108,32 +108,6 @@ static inline int sparse_index_init(unsigned long section_nr, int nid)
}
#endif

-#ifdef CONFIG_SPARSEMEM_EXTREME
-unsigned long __section_nr(struct mem_section *ms)
-{
- unsigned long root_nr;
- struct mem_section *root = NULL;
-
- for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
- root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
- if (!root)
- continue;
-
- if ((ms >= root) && (ms < (root + SECTIONS_PER_ROOT)))
- break;
- }
-
- VM_BUG_ON(!root);
-
- return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
-}
-#else
-unsigned long __section_nr(struct mem_section *ms)
-{
- return (unsigned long)(ms - mem_section[0]);
-}
-#endif
-
/*
* During early boot, before section_mem_map is used for an actual
* mem_map, we use section_mem_map to store the section's NUMA
--
2.17.1