Re: [PATCH 4/4] locking/osq_lock: The numa-aware lock memory prepare, assign and cleanup.

From: Waiman Long
Date: Sat Sep 14 2024 - 13:22:18 EST



On 9/14/24 04:53, yongli-oc wrote:
Prepare the kernel memory caches for the numa-aware lock, and add a
workqueue to turn a numa-aware lock back into an osq lock.
Add the /proc interface. Enable dynamic switching with
echo 1 > /proc/zx_numa_lock/dynamic_enable

Signed-off-by: yongli-oc <yongli-oc@xxxxxxxxxxx>
---
kernel/locking/zx_numa.c | 537 +++++++++++++++++++++++++++++++++++++++
1 file changed, 537 insertions(+)
create mode 100644 kernel/locking/zx_numa.c

diff --git a/kernel/locking/zx_numa.c b/kernel/locking/zx_numa.c
new file mode 100644
index 000000000000..89df6670a024
--- /dev/null
+++ b/kernel/locking/zx_numa.c
@@ -0,0 +1,537 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Dynamic numa-aware osq lock
+ * Crossing from numa-aware lock to osq_lock
+ * Numa lock memory initialize and /proc interface
+ * Author: LiYong <yongli-oc@xxxxxxxxxxx>
+ *
+ */
+#include <linux/cpumask.h>
+#include <asm/byteorder.h>
+#include <asm/kvm_para.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/osq_lock.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/uaccess.h>
+#include <linux/reboot.h>
+
+#include "numa.h"
+#include "numa_osq.h"
+
+int enable_zx_numa_osq_lock;
+struct delayed_work zx_numa_start_work;
+struct delayed_work zx_numa_cleanup_work;
+
+atomic_t numa_count;
+struct _numa_buf *zx_numa_entry;
+int zx_numa_lock_total = 256;
+LIST_HEAD(_zx_numa_head);
+LIST_HEAD(_zx_numa_lock_head);
+
+struct kmem_cache *zx_numa_entry_cachep;
+struct kmem_cache *zx_numa_lock_cachep;
+int NUMASHIFT;
+int NUMACLUSTERS;
+static atomic_t lockindex;
+int dynamic_enable;
+
+static const struct numa_cpu_info numa_cpu_list[] = {
+ /*feature1=1, a numa node includes two clusters*/
+ //{1, 23, X86_VENDOR_AMD, 0, 1},
+ {0x5b, 7, X86_VENDOR_CENTAUR, 0, 1},
+ {0x5b, 7, X86_VENDOR_ZHAOXIN, 0, 1}
+};

Why is this zx_*() code specific to the Zhaoxin and Centaur families of CPUs? Are there special hardware features that only these CPUs have?

BTW, your patch series lacks performance data to justify adding quite a lot of complexity to the core locking code. We are unlikely to take this without sufficient justification.

Another question that I have concerns the fact that the base osq_lock() can coexist with your zx_osq_lock(). A CPU can dynamically switch from using osq_lock() to zx_osq_lock() and vice versa. What happens if some CPUs use osq_lock() while others use zx_osq_lock()? Will that cause a problem? Have you fully tested this scenario to make sure that nothing breaks?

Cheers,
Longman