[PATCH v3] sscanf: implement basic character sets

From: Jessica Yu
Date: Tue Feb 23 2016 - 15:38:41 EST


Implement basic character sets for the '%[]' conversion specifier.

The '%[]' conversion specifier matches a nonempty sequence of characters
drawn from the set listed between the brackets or, if the set begins with
'^', a nonempty sequence of characters that are not in that set. This
implementation differs from its glibc counterpart in three ways: character
ranges (e.g., 'a-z' or '0-9') are not supported, the hyphen '-' is *not* a
special character, and the brackets themselves cannot be matched.
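
For illustration only, the matching rule described above can be sketched
in userspace C. The helper name match_set() and its parameters are
hypothetical and are not part of this patch; the sketch assumes the set is
treated as a plain list of characters, so "a-z" means 'a', '-' and 'z'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not part of the patch): copy the leading
 * run of str that matches the set into out and return its length.
 * out must have room for max + 1 bytes.
 */
static size_t match_set(const char *str, const char *set, bool negate,
			char *out, size_t max)
{
	size_t len, i;

	assert(str && set && out);

	/* strspn() counts leading chars in set; strcspn() those not in set */
	len = negate ? strcspn(str, set) : strspn(str, set);
	if (len > max)
		len = max;

	for (i = 0; i < len; i++)
		out[i] = str[i];
	out[len] = '\0';

	return len;
}
```

With the set "a-z", the input "abc-z" yields only "a", because 'b' is not
one of the three listed characters. A return value of 0 corresponds to the
"no matches" case, where the conversion fails.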

Signed-off-by: Jessica Yu <jeyu@xxxxxxxxxx>
---

This patch adds support for the '%[' conversion specifier to sscanf().
This is useful when we'd like to match substrings delimited by something
other than whitespace. The original motivation came from a livepatch
discussion (see https://lkml.org/lkml/2016/2/8/790), where we were looking
for a clean way to parse symbol names whose substrings are delimited by
periods and commas.
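
As a rough sketch of that use case, here is a userspace example that splits
a string on a period and a comma. The "object.symbol,position" layout, the
parse_sym() name and the buffer sizes are assumptions made for illustration
and are not taken from the livepatch discussion. The format string uses
only negated sets without ranges, which is the subset this patch handles.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical parser: split "object.symbol,position" into its parts.
 * Returns 0 on success, -1 if the input does not match.
 * obj must hold at least 32 bytes and sym at least 64.
 */
static int parse_sym(const char *in, char *obj, char *sym,
		     unsigned long *pos)
{
	assert(in && obj && sym && pos);

	/* %31[^.] reads up to 31 chars that are not '.'; %63[^,] likewise */
	if (sscanf(in, "%31[^.].%63[^,],%lu", obj, sym, pos) != 3)
		return -1;

	return 0;
}
```

Giving an explicit field width for each '%[' conversion keeps the copy
within the destination buffer.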

Patch based on linux-next-20160223.

v3:
- Fix memory leak in error path (kfree() before returning)
- Remove redundant condition in while loop
- Style fix (*op)() -> op()

v2:
- Use kstrndup() to copy the character set from fmt instead of using a
statically allocated array

lib/vsprintf.c | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 525c8e1..983358a 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2714,6 +2714,47 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
 			num++;
 		}
 		continue;
+		case '[':
+		{
+			char *s = (char *)va_arg(args, char *);
+			char *set;
+			size_t (*op)(const char *str, const char *set);
+			size_t len = 0;
+			bool negate = (*(fmt) == '^');
+
+			if (field_width == -1)
+				field_width = SHRT_MAX;
+
+			op = negate ? &strcspn : &strspn;
+			if (negate)
+				fmt++;
+
+			len = strcspn(fmt, "]");
+			/* invalid format; stop here */
+			if (!len)
+				return num;
+
+			set = kstrndup(fmt, len, GFP_KERNEL);
+			if (!set)
+				return num;
+
+			/* advance fmt past ']' */
+			fmt += len + 1;
+
+			len = op(str, set);
+			/* no matches */
+			if (!len) {
+				kfree(set);
+				return num;
+			}
+
+			while (len-- && field_width--)
+				*s++ = *str++;
+			*s = '\0';
+			kfree(set);
+			num++;
+		}
+		continue;
 		case 'o':
 			base = 8;
 			break;
--
2.4.3