[PATCH 11/11] vt: pad double-width code points with a zero-width space

From: Nicolas Pitre
Date: Wed Apr 09 2025 - 21:20:25 EST


From: Nicolas Pitre <npitre@xxxxxxxxxxxx>

In the Unicode screen buffer, we follow each double-width code point with
a space to maintain proper column alignment. This, however, creates
semantic problems with e.g. selection and cut and paste, where the
padding cannot be told apart from a real space.

Let's use a better code point for column padding, i.e. a zero-width
space (U+200B) rather than a full space.

Signed-off-by: Nicolas Pitre <npitre@xxxxxxxxxxxx>
---
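
To illustrate the padding scheme described above, here is a minimal,
self-contained sketch in userspace C. It is not the vt.c code: struct
toy_row, the toy_*() helpers and the PAD_* macros are hypothetical names,
the width tests cover only one Unicode block each, and dropping an
unmergeable zero-width code point is a simplification made for this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROW_CELLS 16
#define PAD_SPACE 0x0020u	/* old padding: U+0020 SPACE */
#define PAD_ZWSP  0x200Bu	/* new padding: U+200B ZERO WIDTH SPACE */

/* Hypothetical stand-in for one row of the Unicode screen buffer. */
struct toy_row {
	uint32_t cell[ROW_CELLS];
	size_t pos;	/* index of the next cell to fill */
};

/* Toy width tests; the kernel's ucs_is_*() helpers cover far more ranges. */
static int toy_is_double_width(uint32_t c)
{
	return c >= 0x4E00 && c <= 0x9FFF;	/* CJK Unified Ideographs only */
}

static int toy_is_zero_width(uint32_t c)
{
	return c >= 0x0300 && c <= 0x036F;	/* combining diacritical marks only */
}

static void toy_row_init(struct toy_row *r)
{
	for (size_t i = 0; i < ROW_CELLS; i++)
		r->cell[i] = 0;
	r->pos = 0;
}

/* Store one code point, padding double-width ones with 'pad'. */
static void toy_row_put(struct toy_row *r, uint32_t c, uint32_t pad)
{
	if (toy_is_zero_width(c)) {
		/* Merge only when the previous cell is padding after a wide code point. */
		if (r->pos < 2 || r->cell[r->pos - 1] != pad ||
		    !toy_is_double_width(r->cell[r->pos - 2]))
			return;		/* simplification: drop it otherwise */
		r->pos--;		/* rewind onto the padding cell */
	}
	assert(r->pos < ROW_CELLS);
	r->cell[r->pos++] = c;
	if (toy_is_double_width(c)) {
		assert(r->pos < ROW_CELLS);
		r->cell[r->pos++] = pad;
	}
}

/* Count the U+0020 cells a naive copy of the row would pick up. */
static size_t toy_row_count_spaces(const struct toy_row *r)
{
	size_t n = 0;

	for (size_t i = 0; i < r->pos; i++)
		if (r->cell[i] == PAD_SPACE)
			n++;
	return n;
}
```

In this model, storing U+4E2D U+6587 with PAD_SPACE leaves two U+0020
cells in the row, while PAD_ZWSP leaves none and still occupies four
cells. A combining mark written right after a wide code point overwrites
the padding cell in both cases.
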
drivers/tty/vt/vt.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index e3d35c4f92..dc84f9c6b7 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2937,12 +2937,13 @@ static int vc_con_write_normal(struct vc_data *vc, int tc, int c,
width = 2;
} else if (ucs_is_zero_width(c)) {
prev_c = vc_uniscr_getc(vc, -1);
- if (prev_c == ' ' &&
+ if (prev_c == 0x200B &&
ucs_is_double_width(vc_uniscr_getc(vc, -2))) {
/*
* Let's merge this zero-width code point with
* the preceding double-width code point by
- * replacing the existing whitespace padding.
+ * replacing the existing zero-white-space
+ * padding.
*/
vc_con_rewind(vc);
} else if (c == 0xfe0f && prev_c != 0) {
@@ -3040,7 +3041,11 @@ static int vc_con_write_normal(struct vc_data *vc, int tc, int c,
tc = conv_uni_to_pc(vc, ' ');
if (tc < 0)
tc = ' ';
- next_c = ' ';
+ /*
+ * Store a zero-white-space in the Unicode screen given that
+ * the previous code point is semantically double-width.
+ */
+ next_c = 0x200B;
}

out:
--
2.49.0