Re: [PATCH] lib/vsprintf: add %pT[C012] format specifier

From: Joe Perches
Date: Mon Dec 30 2013 - 11:56:24 EST


On Sun, 2013-12-29 at 21:13 +0900, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Joe Perches wrote:
> > > On Sat, 2013-12-28 at 12:08 -0800, Andrew Morton wrote:
> > > > On Sat, 28 Dec 2013 11:53:25 -0800 Joe Perches <joe@xxxxxxxxxxx> wrote:
> > > >
> > > > > > > #define PRINTK_PID "\002"
> > > > > > > #define PRINTK_TASK_ID "\003" /* "comm:pid" */
> > >
> > > > > > >
> > > > > > > printk(PRINTK_TASK_ID ": hair on fire\n");
> > > > > > >
> > > > > > > It's certainly compact. I doubt if there's any existing code which
> > > > > > > deliberately prints control chars?
> >
> > What about using bytes from \x7F to \xFF ?
>
> Here is a draft patch. With this approach, we can embed whatever variables
> vsnprintf() can access (not only current thread's attributes seen from init and
> current namespace but also other variables like current time) into the format
> string without passing corresponding argument.
>
> Is this approach acceptable? (Of course, I'll add lines like
>
> #define PRINT_FORMAT "\x7F"
> #define PRINT_TASK_COMM "\x80"
> #define PRINT_TASK_PID "\x81"
>
> for final version.) What variables (e.g. current->comm, current->pid,
> task_tgid_vnr(current)) do we want to pass using builtin tokens?
[]
> The vast majority of ->comm accesses are accessing current->comm, for
> debug reasons. I'm counting 350-odd sites. At all these sites it's
> pointless passing `current' to the printk function at all!

I get:

$ grep-2.5.4 -rP --include=*.[ch] -oh \
"\b(?:printk|[a-z_]+_(?:printk|emerg|alert|crit|err|warn|notice|info|debug|dbg))[^;]*\bcurrent->[\w_]+" * | \
grep -P -oh "\bcurrent->[\w_]+"| sort | uniq -c | sort -rn
380 current->pid
267 current->comm
34 current->mm
25 current->thread
17 current->journal_info
8 current->flags
7 current->tgid
5 current->active_mm
2 current->sas_ss_sp
2 current->addr_limit
1 current->uid
1 current->signal
1 current->sas_ss_size
1 current->ret_stack
1 current->memcg_batch
1 current->kernel_stack_page
1 current->group_leader
1 current->exit_code
1 current->cred
1 current->blocked

> This patch introduces a set of vsprintf tokens for embedding globally visible
> arguments into the format string, so that we can reduce text size a bit (and
> in the future make reading of task_struct->comm consistent) by stop generating
> the code to pass an argument which the callee already has access to, with an
> assumption that we can avoid using bytes >= 0x7F within the format string.
>
> Some examples for converting direct ->comm users are shown below.
>
> pr_info("comm=%s\n", current->comm); => pr_info("comm=\x80\n");
> pr_info("pid=%d\n", current->pid); => pr_info("pid=\x81\n");
> pr_info("%s(%d)\n", current->comm, current->pid); => pr_info("\x80(\x81)\n");
> pr_info("[%-6.6s]\n", current->comm); => pr_info("\x7F-6.6\x80\n");

I too prefer using the #define CURRENT_<FOO> string approach
rather than direct embedding of string.

pr_info("comm=" CURRENT_COMM "\n");

Also I prefer using ASCII SUB (26 \x1a \032 ^z) or maybe
PU1 - 145 or PU2 - 146, as an initiator byte as it takes
up much less of the control word space instead of using
multiple values like \x80, \x81, \x82, etc. Using an
initiator byte seems more extensible too.

http://en.wikipedia.org/wiki/C0_and_C1_control_codes


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/