[RFC PATCH 2/7] lib/vsprintf.c: add fmtcheck utility

From: Rasmus Villemoes
Date: Fri Oct 26 2018 - 19:24:45 EST

We have a few places in the kernel where a *printf function is used with
a non-constant format string, making the ordinary static type checking
done by gcc et al. impossible. With extra instrumentation, some things
can still be caught at build time, but that still leaves a number of
places unchecked. So this patch adds a function for doing run-time
verification of a given format string against a template.

The fmtcheck() function takes two format string arguments and checks
whether they contain the same printf specifiers. If they do, the
first (the string-to-be-checked) string is returned. If not, the
second (the template) is returned - the resulting formatted string is
likely garbage, but this should still be better than using arguments of
the wrong type.
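
As a rough userspace illustration of these return semantics (not part of this patch), the sketch below walks both strings in lockstep and compares the argument-consuming items. Everything in it is hypothetical: `struct scan`, `next_token()` and `fmtcheck_sketch()` are made-up names, and the tiny scanner only stands in for the kernel's format_decode(); it knows nothing about %p extensions or flags.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner state; not the kernel's struct printf_spec. */
struct scan {
	const char *p;
	int in_spec;	/* nonzero while inside a '%' specifier */
};

/*
 * Return a token for the next argument-consuming item, or 0 at the end
 * of the string. A '*' width or precision is a token of its own, so
 * "%*.*s" yields three tokens. Length modifiers are folded into the
 * token, so "%d" and "%ld" differ.
 */
static int next_token(struct scan *sc)
{
	int mod = 0;

	for (;;) {
		if (!sc->in_spec) {
			/* Skip literal text up to the next '%'. */
			while (*sc->p && *sc->p != '%')
				sc->p++;
			if (!*sc->p)
				return 0;
			sc->p++;
			if (*sc->p == '%') {	/* "%%" consumes no argument */
				sc->p++;
				continue;
			}
			sc->in_spec = 1;
		}
		/* Skip flags, literal width/precision digits and '.'. */
		while (*sc->p && strchr("-+ #0123456789.", *sc->p))
			sc->p++;
		if (*sc->p == '*') {
			sc->p++;
			return '*';	/* stay inside the specifier */
		}
		while (*sc->p && strchr("hlLqjzt", *sc->p))
			mod = mod * 31 + *sc->p++;
		sc->in_spec = 0;
		if (!*sc->p)
			return 0;	/* truncated specifier: treat as end */
		return (mod << 8) | (unsigned char)*sc->p++;
	}
}

/* Return fmt if it is compatible with tmpl, otherwise fall back to tmpl. */
const char *fmtcheck_sketch(const char *fmt, const char *tmpl)
{
	struct scan f = { fmt, 0 }, t = { tmpl, 0 };

	for (;;) {
		int ftok = next_token(&f);
		int ttok = next_token(&t);

		if (!ftok)
			return fmt;	/* fmt needs no further arguments */
		if (!ttok || ftok != ttok)
			return tmpl;	/* too many or mismatching items */
	}
}
```

With this sketch, a string with fewer specifiers than the template is accepted, while a string with extra or different specifiers makes the function return the template instead.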

Regardless of which string is returned at run-time, the __format_arg
attribute allows the compiler to do type-checking if the fmtcheck()
function is used inside a *printf call, e.g.

  sprintf(buf, fmtcheck(what->ever, "%d %lx", 0), i, m)

This also serves as documentation for whoever creates the string found
at what->ever that it should contain these two specifiers.
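
To illustrate the compile-time side in userspace, here is a minimal sketch that assumes __format_arg() maps to the format_arg function attribute of gcc and clang. The names `checked()` and `format_demo()` are hypothetical; `checked()` simply returns its first argument, as the stub for the disabled config option does.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for _fmtcheck(). format_arg(2) tells the compiler
 * that the return value is a format string taking the same arguments as
 * parameter 2, so -Wformat checks callers against the template.
 */
__attribute__((format_arg(2)))
static const char *checked(const char *fmt, const char *tmpl)
{
	(void)tmpl;
	return fmt;
}

int format_demo(char *buf, size_t len, const char *runtime_fmt,
		int i, unsigned long m)
{
	/*
	 * The arguments match the template "%d %lx". Swapping them to
	 * (m, i) should draw a -Wformat warning even though runtime_fmt
	 * is not a constant.
	 */
	return snprintf(buf, len, checked(runtime_fmt, "%d %lx"), i, m);
}
```

The compiler only ever looks at the literal template here; whether runtime_fmt really matches it is what the run-time check is for.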

We actually make fmtcheck() a macro that tries very hard to ensure the
template argument is a string literal - partly to help avoid mixing up
the two "const char*" arguments, partly because much of the point of
this sanity checking vanishes if the template is not a literal (e.g.,
the __format_arg annotation becomes useless).
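
The literal enforcement relies on adjacent string literals being concatenated. A small userspace sketch of the same trick, with the hypothetical names `LITERAL()` and `tmpl_demo()`:

```c
/*
 * Hypothetical copy of the trick used by the fmtcheck() macro: pasting
 * empty string literals on both sides only compiles if s is itself a
 * string literal.
 */
#define LITERAL(s) ("" s "")

static const char *tmpl_demo(void)
{
	/* Expands to ("" "%d %lx" ""), i.e. the literal "%d %lx". */
	return LITERAL("%d %lx");
}

/*
 * By contrast, with
 *
 *	const char *p = "%d";
 *
 * the expression LITERAL(p) expands to ("" p ""), which is a syntax
 * error, so a non-literal template is rejected at build time.
 */
```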

We don't treat "%*.*s" and "%d %d %s" as equivalent, despite them
taking the same vararg types, since they're morally very distinct. In
fact, at least for now, we don't even treat "%d" and "%u" as
equivalent. We can relax that, possibly via FMTCHECK_* flags, but let's
first see which users there might be and what they'd want.

If either string contains a %p, we really should check the following
alphanumerics to see which (if any) extension is used and check that
they match as well. For now, just complain loudly, partly because I'm
lazy, partly because I don't know any in-tree code that might use
fmtcheck() with a %p in the template, and I can't really imagine
anyone would use a %pXX extension in a non-constant format string.

I'm making this optional, but default y, since I don't suppose
fmtcheck() will ever appear in a hot path.

The BSDs (and libbsd on Linux) contain a fmtcheck() function; I took the
name and return semantics from that.

Signed-off-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
---
 include/linux/kernel.h | 18 ++++++++++++
 lib/Kconfig.debug      |  9 ++++++
 lib/vsprintf.c         | 65 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 92 insertions(+)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d6aac75b51ba..8e9154e100c3 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -495,6 +495,24 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
extern __printf(2, 0)
const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);

+#define FMTCHECK_SILENT 0x01
+
+#ifdef CONFIG_FMTCHECK
+const char *_fmtcheck(const char *fmt, const char *tmpl, unsigned flags);
+#else
+static inline __format_arg(2) const char *
+_fmtcheck(const char *fmt, const char *tmpl, unsigned flags)
+{
+	return fmt;
+}
+#endif
+
+/*
+ * Use of fmtcheck is pointless if the template is not a string
+ * literal, so try to enforce that.
+ */
+#define fmtcheck(fmt, tmpl, flags) _fmtcheck(fmt, "" tmpl "", flags)
+
extern __scanf(2, 3)
int sscanf(const char *, const char *, ...);
extern __scanf(2, 0)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 4966c4fbe7f7..adfd431c6876 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1037,6 +1037,15 @@ config DEBUG_PREEMPT
if kernel code uses it in a preemption-unsafe way. Also, the kernel
will detect preemption count underflows.

+config FMTCHECK
+	bool "Runtime format string checking"
+	default y
+	help
+	  If you say Y here, the kernel performs runtime sanity checks
+	  of non-constant format strings against builtin templates,
+	  issuing a warning and using the template as a fallback in
+	  case of mismatch.
+
menu "Lock Debugging (spinlocks, mutexes, etc...)"

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index d5b3a3f95c01..81b7cda71158 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3201,3 +3201,68 @@ int sscanf(const char *buf, const char *fmt, ...)
 	return i;
 }
 EXPORT_SYMBOL(sscanf);
+
+#ifdef CONFIG_FMTCHECK
+static int
+next_interesting_spec(const char **s, struct printf_spec *spec)
+{
+	int len;
+
+	while (1) {
+		len = format_decode(*s, spec);
+		if (!len)
+			return 0;
+		*s += len;
+		if (spec->type == FORMAT_TYPE_NONE ||
+		    spec->type == FORMAT_TYPE_PERCENT_CHAR)
+			continue;
+		return len;
+	}
+}
+
+const char *
+_fmtcheck(const char *fmt, const char *tmpl, unsigned flags)
+{
+	const char *f = fmt;
+	const char *t = tmpl;
+	struct printf_spec fspec = {0}, tspec = {0};
+	int flen, tlen;
+	int warn = !(flags & FMTCHECK_SILENT);
+
+	while (1) {
+		flen = next_interesting_spec(&f, &fspec);
+		tlen = next_interesting_spec(&t, &tspec);
+		if (!flen) {
+			/*
+			 * The given format string doesn't have any
+			 * more specifiers. It's ok from a type-safety
+			 * POV for the template to have extra, but
+			 * optionally warn about it (e.g., a single %d
+			 * may be required).
+			 */
+			if (tlen && (flags & FMTCHECK_NO_EXTRA_ARGS) && warn)
+				WARN_ONCE(warn, "template '%s' expects more arguments than '%s'\n",
+					  tmpl, fmt);
+			return fmt;
+		}
+		if (!tlen) {
+			WARN_ONCE(warn, "format string '%s' expects more arguments than template '%s'",
+				  fmt, tmpl);
+			return tmpl;
+		}
+		WARN_ONCE(warn && (fspec.type == FORMAT_TYPE_PTR || tspec.type == FORMAT_TYPE_PTR),
+			  "don't use %%p in non-constant format strings");
+		/*
+		 * Should we also care about flags, field width,
+		 * precision? Should we even care about base?
+		 */
+		if (fspec.type != tspec.type ||
+		    fspec.base != tspec.base) {
+			WARN_ONCE(warn, "format string '%s' incompatible with template '%s'",
+				  fmt, tmpl);
+			return tmpl;
+		}
+	}
+}
+EXPORT_SYMBOL(_fmtcheck);
+#endif