Re: [PATCH v2] of: use hash based search in of_find_node_by_phandle

From: Chintan Pandya
Date: Mon Jan 29 2018 - 02:35:03 EST



I was curious, so I implemented it. It ends up being similar to Rasmus's
1st suggestion. The difference is we don't try to store all entries, but
rather implement a hash table that doesn't handle collisions. Relying on
the fact that phandles are just linearly allocated from 0, we just mask
the high bits of the phandle to get the index.
I think this is most resourceful way.
Can you try out on your setup and try different
array sizes.
Here are my test results. However, I simply considered overall boot time to
compare different scenarios because profiling of_find_node_by_phandle() in
early boot fails.

Scenarios:
[1] Cache size 1024 + early cache build up [Small change in your cache patch,
ÂÂÂ see the patch below]
[2] Hash 64 approach[my original v2 patch]
[3] Cache size 64
[4] Cache size 128
[5] Cache size 256
[6] Base build

Result (boot to shell in sec):
[1] 14.292498 14.370994 14.313537 --> 850ms avg gain
[2] 14.340981 14.395900 14.398149 --> 800ms avg gain
[3] 14.546429 14.488783 14.468694 --> 680ms avg gain
[4] 14.506007 14.497487 14.523062 --> 670ms avg gain
[5] 14.671100 14.643344 14.731853 --> 500ms avg gain
[6] 15.263386 15.246155 15.041847 --> 0

Previously reported 400ms gain for [2] was from different set up. These tests
and new data is from my own debug set up. When we take any of these patch to
production, result might deviate accordingly.

Chintan Pandya

Patch for the case [1]
<--------------------------------------------------------------------------->

diff --git a/drivers/of/base.c b/drivers/of/base.c
index a0bccb5..671e7cf 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1084,6 +1084,8 @@ int of_modalias_node(struct device_node *node, char *modalias, int len)
Â}
ÂEXPORT_SYMBOL_GPL(of_modalias_node);

+struct device_node *phandle_cache[PHANDLE_CACHE_SZ];
+
Â/**
 * of_find_node_by_phandle - Find a node given a phandle
 * @handle: phandle of the node to find
@@ -1100,9 +1102,15 @@ struct device_node *of_find_node_by_phandle(phandle handle)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return NULL;

ÂÂÂÂÂÂÂ raw_spin_lock_irqsave(&devtree_lock, flags);
-ÂÂÂÂÂÂ for_each_of_allnodes(np)
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if (np->phandle == handle)
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ break;
+ÂÂÂÂÂÂ np = phandle_cache[PHANDLE_CACHE_HASH(handle)];
+ÂÂÂÂÂÂ if (!np || np->phandle != handle) {
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ for_each_of_allnodes(np)
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if (np->phandle == handle) {
+ phandle_cache[PHANDLE_CACHE_HASH(handle)] = np;
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ break;
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ }
+
+ÂÂÂÂÂÂ }
ÂÂÂÂÂÂÂ of_node_get(np);
ÂÂÂÂÂÂÂ raw_spin_unlock_irqrestore(&devtree_lock, flags);
ÂÂÂÂÂÂÂ return np;
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index c0914fb..e90a458 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -227,6 +227,9 @@ static void populate_properties(const void *blob,
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ pprevÂÂÂÂÂ = &pp->next;
ÂÂÂÂÂÂÂ }

+ÂÂÂÂÂÂ if (!dryrun && np->phandle)
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ phandle_cache[PHANDLE_CACHE_HASH(np->phandle)] = np;
+
ÂÂÂÂÂÂÂ /* With version 0x10 we may not have the name property,
ÂÂÂÂÂÂÂÂ * recreate it here from the unit name if absent
ÂÂÂÂÂÂÂÂ */
diff --git a/include/linux/of.h b/include/linux/of.h
index 299aeb1..b1200be 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -92,6 +92,10 @@ struct of_phandle_iterator {
ÂÂÂÂÂÂÂ struct device_node *node;
Â};

+#define PHANDLE_CACHE_SZÂÂÂÂÂÂ 1024
+#define PHANDLE_CACHE_HASH(x)Â ((x) & (PHANDLE_CACHE_SZ - 1))
+extern struct device_node *phandle_cache[PHANDLE_CACHE_SZ];
+
Âstruct of_reconfig_data {
ÂÂÂÂÂÂÂ struct device_nodeÂÂÂÂÂ *dn;
ÂÂÂÂÂÂÂ struct propertyÂÂÂÂÂÂÂÂ *prop;

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project