Re: [PATCH 06/11] vt: introduce gen_ucs_recompose.py to create ucs_recompose.c

From: Jiri Slaby
Date: Mon Apr 14 2025 - 03:08:14 EST


On 10. 04. 25, 3:13, Nicolas Pitre wrote:
From: Nicolas Pitre <npitre@xxxxxxxxxxxx>

The generated code includes a table that maps base character + combining
mark pairs to their precomposed equivalents using Python's unicodedata
module. It also provides the ucs_recompose() function to query that
table.

By default, the script creates a table with only the most commonly used
Latin, Greek, and Cyrillic recomposition pairs. It is much smaller than
the table with all possible recomposition pairs (71 entries vs 1000
entries). If the full table is needed, running the script with the
--full argument will generate it.
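
For illustration only, here is a minimal sketch of how such pairs might be derived from Python's unicodedata module. It is not the code from this patch: the function name recomposition_pairs and the 16-bit cut-off are assumptions, and the selection of the smaller default subset is not reproduced.

```python
import unicodedata


def recomposition_pairs(max_cp=0xFFFF):
    """Return sorted (base, combining, composed) triples.

    Hypothetical helper, not taken from gen_ucs_recompose.py.
    """
    pairs = []
    for cp in range(max_cp + 1):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        # Skip characters without a decomposition, and compatibility
        # decompositions, which start with a "<tag>".
        if not decomp or decomp.startswith("<"):
            continue
        parts = [int(p, 16) for p in decomp.split()]
        # Only two-element canonical decompositions form a pair.
        if len(parts) != 2:
            continue
        base, combining = parts
        # Assumption: keep code points that fit in 16 bits only.
        if base > 0xFFFF or combining > 0xFFFF:
            continue
        # Keep the pair only if NFC really recomposes it to cp; this
        # drops characters excluded from canonical composition.
        if unicodedata.normalize("NFC", chr(base) + chr(combining)) != ch:
            continue
        pairs.append((base, combining, cp))
    pairs.sort()
    return pairs
```

Sorting by (base, combining) matters here because the generated C code looks entries up with a binary search using that same ordering.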

Signed-off-by: Nicolas Pitre <npitre@xxxxxxxxxxxx>
---
drivers/tty/vt/gen_ucs_recompose.py | 321 ++++++++++++++++++++++++++++
1 file changed, 321 insertions(+)
create mode 100755 drivers/tty/vt/gen_ucs_recompose.py

diff --git a/drivers/tty/vt/gen_ucs_recompose.py b/drivers/tty/vt/gen_ucs_recompose.py
new file mode 100755
index 0000000000..64418803e4
--- /dev/null
+++ b/drivers/tty/vt/gen_ucs_recompose.py
...
+struct compare_key {{
+ uint16_t base;
+ uint16_t combining;
+}};
+
+static int recomposition_compare(const void *key, const void *element)
+{{
+ const struct compare_key *search_key = key;
+ const struct recomposition *table_entry = element;
+
+ /* Compare base character first */
+ if (search_key->base < table_entry->base)
+ return -1;
+ if (search_key->base > table_entry->base)
+ return 1;
+
+ /* Base characters match, now compare combining character */
+ if (search_key->combining < table_entry->combining)
+ return -1;
+ if (search_key->combining > table_entry->combining)
+ return 1;
+
+ /* Both match */
+ return 0;
+}}
+
+/**
+ * Attempt to recompose two Unicode characters into a single character.
+ *
+ * @param base: Base Unicode code point (UCS-4)
+ * @param combining: Combining mark Unicode code point (UCS-4)
+ * Return: Recomposed Unicode code point, or 0 if no recomposition is possible
+ */
+uint32_t ucs_recompose(uint32_t base, uint32_t combining)
+{{
+ /* Check if characters are within the range of our table */
+ if (base < MIN_BASE_CHAR || base > MAX_BASE_CHAR ||
+ combining < MIN_COMBINING_CHAR || combining > MAX_COMBINING_CHAR)
+ return 0;
+
+ struct compare_key key = {{ base, combining }};
+
+ struct recomposition *result =
+ __inline_bsearch(&key, recomposition_table,
+ ARRAY_SIZE(recomposition_table),
+ sizeof(*recomposition_table),
+ recomposition_compare);
+
+ return result ? result->recomposed : 0;
+}}

Again, I see no reason to maintain C functions in Python.
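
A rough sketch of what that split could look like, assuming the script emits only the data table and the lookup function lives in a hand-written C file that includes the generated output. The helper name emit_table is hypothetical; the field names base, combining and recomposed are taken from the quoted code above.

```python
def emit_table(pairs, name="recomposition_table"):
    """Format (base, combining, recomposed) triples as a C array.

    Hypothetical helper: it emits data only, no C functions.
    """
    lines = [f"static const struct recomposition {name}[] = {{"]
    # Sort by (base, combining) so a binary search over the table works.
    for base, combining, recomposed in sorted(pairs):
        lines.append(
            f"\t{{ .base = 0x{base:04X}, .combining = 0x{combining:04X}, "
            f".recomposed = 0x{recomposed:04X} }},"
        )
    lines.append("};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two sample pairs: e + acute accent, A + grave accent.
    print(emit_table([(0x0065, 0x0301, 0x00E9), (0x0041, 0x0300, 0x00C0)]), end="")
```

With this shape, struct compare_key, recomposition_compare() and ucs_recompose() would be ordinary C that can be reviewed and edited directly, and only the table contents would be regenerated.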

thanks,
--
js
suse labs