Re: [PATCH v3 02/33] lib/string_helpers: Convert string_escape_mem() to printbuf

From: Petr Mladek
Date: Thu Jun 09 2022 - 10:25:56 EST


On Sat 2022-06-04 15:30:11, Kent Overstreet wrote:
> Like the upcoming vsprintf.c conversion, this converts string_escape_mem
> to prt_escaped_string(), which uses and outputs to a printbuf, and makes
> string_escape_mem() a smaller wrapper to support existing users.
>
> The new printbuf helpers greatly simplify the code.
>
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -15,6 +15,7 @@
> #include <linux/fs.h>
> #include <linux/limits.h>
> #include <linux/mm.h>
> +#include <linux/printbuf.h>
> #include <linux/slab.h>
> #include <linux/string.h>
> #include <linux/string_helpers.h>
> @@ -301,19 +302,14 @@ int string_unescape(char *src, char *dst, size_t size, unsigned int flags)
> }
> EXPORT_SYMBOL(string_unescape);
>
> -static bool escape_passthrough(unsigned char c, char **dst, char *end)
> +static bool escape_passthrough(struct printbuf *out, unsigned char c)
> {
> - char *out = *dst;
> -
> - if (out < end)
> - *out = c;
> - *dst = out + 1;
> + prt_char(out, c);

This modifies the behavior. The original code did not add
the trailing '\0'.

I agree that the original behavior is ugly but it is documented
see the comment:

* Return:
* The total size of the escaped output that would be generated for
* the given input and flags. To check whether the output was
* truncated, compare the return value to osz. There is room left in
* dst for a '\0' terminator if and only if ret < osz.
^^^^^^^^^^^^^^


I am all for changing the behavior but it would require checking
all callers.

Anyway, adding the trailing '\0' all is not much effective.
I suggest to use __prt_char() and add the trailing '\0' when
the string is complete.

We must make sure that __prt_char() is able to add the last
character even when there will not longer be space for
the trailing '\0'.


> return true;
> }
>

[...]

>
> /**
> - * string_escape_mem - quote characters in the given memory buffer
> + * prt_escaped_string - quote characters in the given memory buffer
> + * @out: printbuf to output to (escaped)
> * @src: source buffer (unescaped)
> * @isz: source buffer size
> - * @dst: destination buffer (escaped)
> - * @osz: destination buffer size
> * @flags: combination of the flags
> * @only: NULL-terminated string containing characters used to limit
> * the selected escape class. If characters are included in @only
> @@ -517,11 +466,10 @@ static bool escape_hex(unsigned char c, char **dst, char *end)
> * truncated, compare the return value to osz. There is room left in
> * dst for a '\0' terminator if and only if ret < osz.
> */
> -int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz,
> - unsigned int flags, const char *only)
> +void prt_escaped_string(struct printbuf *out,
> + const char *src, size_t isz,
> + unsigned int flags, const char *only)
> {
> - char *p = dst;
> - char *end = p + osz;
> bool is_dict = only && *only;
> bool is_append = flags & ESCAPE_APPEND;
>
> @@ -549,41 +497,49 @@ int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz,
> * %ESCAPE_NA cases.
> */
> if (!(is_append || in_dict) && is_dict &&
> - escape_passthrough(c, &p, end))
> + escape_passthrough(out, c))
> continue;
>
> if (!(is_append && in_dict) && isascii(c) && isprint(c) &&
> - flags & ESCAPE_NAP && escape_passthrough(c, &p, end))
> + flags & ESCAPE_NAP && escape_passthrough(out, c))
> continue;
>
> if (!(is_append && in_dict) && isprint(c) &&
> - flags & ESCAPE_NP && escape_passthrough(c, &p, end))
> + flags & ESCAPE_NP && escape_passthrough(out, c))
> continue;
>
> if (!(is_append && in_dict) && isascii(c) &&
> - flags & ESCAPE_NA && escape_passthrough(c, &p, end))
> + flags & ESCAPE_NA && escape_passthrough(out, c))
> continue;
>
> - if (flags & ESCAPE_SPACE && escape_space(c, &p, end))
> + if (flags & ESCAPE_SPACE && escape_space(out, c))
> continue;
>
> - if (flags & ESCAPE_SPECIAL && escape_special(c, &p, end))
> + if (flags & ESCAPE_SPECIAL && escape_special(out, c))
> continue;
>
> - if (flags & ESCAPE_NULL && escape_null(c, &p, end))
> + if (flags & ESCAPE_NULL && escape_null(out, c))
> continue;
>
> /* ESCAPE_OCTAL and ESCAPE_HEX always go last */
> - if (flags & ESCAPE_OCTAL && escape_octal(c, &p, end))
> + if (flags & ESCAPE_OCTAL && escape_octal(out, c))
> continue;
>
> - if (flags & ESCAPE_HEX && escape_hex(c, &p, end))
> + if (flags & ESCAPE_HEX && escape_hex(out, c))
> continue;
>
> - escape_passthrough(c, &p, end);
> + escape_passthrough(out, c);
> }
> +}
> +EXPORT_SYMBOL(prt_escaped_string);
> +
> +int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz,
> + unsigned int flags, const char *only)

We need keep the comment above this API as long as it is public.

Note that it uses the docbook comment format, starging with /** ...
And the symbol is exported.

It is even more important because of the crazy behavior where
it does not add the trailing '\0'.

> +{
> + struct printbuf out = PRINTBUF_EXTERN(dst, osz);
>
> - return p - dst;
> + prt_escaped_string(&out, src, isz, flags, only);
> + return out.pos;
> }
> EXPORT_SYMBOL(string_escape_mem);

Note: If you decided to change/fix the behavior then please do so
in a separate patch(set). I was able to catch the problem
because the patch was straightforward and "easy" to review.

Best Regards,
Petr