[PATCH v2 10/13] vt: pad double-width code points with a zero-width space

From: Nicolas Pitre
Date: Tue Apr 15 2025 - 15:22:56 EST


From: Nicolas Pitre <npitre@xxxxxxxxxxxx>

In the Unicode screen buffer, we follow double-width code points with a
space to maintain proper column alignment. This, however, creates
semantic problems, e.g. when using cut and paste.

Let's use a better code point for the column padding, i.e. a zero-width
space rather than a full space. This way the combination retains a
width of 2.

Signed-off-by: Nicolas Pitre <npitre@xxxxxxxxxxxx>
---
drivers/tty/vt/vt.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
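
The width argument in the changelog can be illustrated with a minimal
user-space sketch. It is not kernel code: toy_width(), toy_text_width()
and toy_count_spaces() are hypothetical helpers, the width table is
deliberately tiny (one CJK range plus U+200B), and counting spaces is
only an assumed stand-in for what cut and paste would pick up.

```c
#include <stddef.h>
#include <stdint.h>

#define UCS_ZWS 0x200b /* Zero Width Space, same value as in the patch */

/* Hypothetical toy width table, for illustration only. */
static int toy_width(uint32_t cp)
{
	if (cp == UCS_ZWS)
		return 0;
	if (cp >= 0x4e00 && cp <= 0x9fff) /* CJK Unified Ideographs */
		return 2;
	return 1;
}

/* Sum the widths of all code points stored in a run of cells. */
static int toy_text_width(const uint32_t *cells, size_t n)
{
	int width = 0;

	for (size_t i = 0; i < n; i++)
		width += toy_width(cells[i]);
	return width;
}

/* Count literal spaces, i.e. padding that would leak into copied text. */
static size_t toy_count_spaces(const uint32_t *cells, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (cells[i] == ' ')
			count++;
	return count;
}
```

With the two cells { 0x4e2d, ' ' } the stored text is 3 columns wide
and contains one space that was never written; with
{ 0x4e2d, UCS_ZWS } it is 2 columns wide and contains none, while
still filling both cells.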

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 76554c2040..1bd1878094 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2923,6 +2923,7 @@ static void vc_con_rewind(struct vc_data *vc)
vc->vc_need_wrap = 0;
}

+#define UCS_ZWS 0x200b /* Zero Width Space */
#define UCS_VS16 0xfe0f /* Variation Selector 16 */

static int vc_process_ucs(struct vc_data *vc, int *c, int *tc)
@@ -2941,8 +2942,8 @@ static int vc_process_ucs(struct vc_data *vc, int *c, int *tc)
/*
* Let's merge this zero-width code point with the preceding
* double-width code point by replacing the existing
- * whitespace padding. To do so we rewind one column and
- * pretend this has a width of 1.
+ * zero-width space padding. To do so we rewind one column
+ * and pretend this has a width of 1.
* We give the legacy display the same initial space padding.
*/
vc_con_rewind(vc);
@@ -3065,7 +3066,11 @@ static int vc_con_write_normal(struct vc_data *vc, int tc, int c,
tc = conv_uni_to_pc(vc, ' ');
if (tc < 0)
tc = ' ';
- next_c = ' ';
+ /*
+ * Store a zero-width space in the Unicode screen given that
+ * the previous code point is semantically double width.
+ */
+ next_c = UCS_ZWS;
}

out:
--
2.49.0