Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

From: Tobias Klauser
Date: Fri Sep 17 2021 - 04:45:50 EST


On 2021-09-17 at 01:19:04 +0200, Yury Norov <yury.norov@xxxxxxxxx> wrote:
> [CC Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>]
>
> On Thu, Sep 16, 2021 at 10:53:39PM +0000, Song Bao Hua (Barry Song) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Tobias Klauser [mailto:tklauser@xxxxxxxxxx]
> > > Sent: Friday, September 17, 2021 10:27 AM
> > > To: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>; Jonathan Cameron
> > > <jonathan.cameron@xxxxxxxxxx>; tiantao (H) <tiantao6@xxxxxxxxxxxxx>; Song Bao
> > > Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Andy Shevchenko
> > > <andriy.shevchenko@xxxxxxxxxxxxxxx>; Yury Norov <yury.norov@xxxxxxxxx>; Peter
> > > Zijlstra <peterz@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> > > Subject: [PATCH] cpumask: Omit terminating null byte in
> > > cpumap_print_{list,bitmask}_to_buf
> > >
> > > The changes in the patch series [1] introduced a terminating null byte
> > > when reading from cpulist or cpumap sysfs files, for example:
> > >
> > > $ xxd /sys/devices/system/node/node0/cpulist
> > > 00000000: 302d 310a 00 0-1..
> > >
> > > Before this change, the output looked as follows:
> > >
> > > $ xxd /sys/devices/system/node/node0/cpulist
> > > 00000000: 302d 310a 0-1.
> >
> > If we don't use xxd, I don't see any actual harm of this NULL byte
> > by cat, lscpu, numactl etc. this doesn't break them at all.
>
> Barry, Tobias' script that uses xxd is userspace. Linux kernel never breaks
> userspace.

FWIW, the xxd example was only meant to illustrate the issue concisely
in the commit message. The trailing null byte breaks other userspace
programs as well. I originally discovered this because Kubernetes'
kubelet was crashing on a bpf-next kernel. See [1] and the following
comments for more information:

[1] https://github.com/cilium/cilium/pull/17394#issuecomment-920902042
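
To make the failure mode concrete, here is a minimal Python sketch of how
a typical cpulist parser can trip over the extra byte. This is not
kubelet's actual code (kubelet is written in Go); parse_cpulist is a
hypothetical helper written only for illustration. The point is that
trimming whitespace removes the trailing \n but not a \0, so the last
field no longer parses as an integer:

```python
def parse_cpulist(text: str) -> list[int]:
    """Expand a cpulist string such as "0-1,4" into [0, 1, 4].

    Hypothetical helper for illustration; not taken from any real tool.
    """
    cpus: list[int] = []
    text = text.strip()  # removes whitespace such as "\n", but not "\0"
    if not text:
        return cpus
    for chunk in text.split(","):
        first, _, last = chunk.partition("-")
        start = int(first)
        end = int(last) if last else start  # int() rejects a field containing "\0"
        cpus.extend(range(start, end + 1))
    return cpus


# Contents as read before the regression.
print(parse_cpulist("0-1\n"))

# Contents with the trailing null byte, as in the xxd output.
try:
    parse_cpulist("0-1\n\0")
except ValueError as err:
    print("parse error:", err)
```

Tools that read the file as a C string (cat piped to a terminal, lscpu,
numactl) stop at the \0 and so never notice it, which would explain why
only some programs are affected.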

> > if we only want to make sure the output is exactly same with before
> > for every single character, this patch is right.
>
> We don't want to make the output exactly the same. The "0,1" would
> also work for the example above. But garbage characters following \0
> is a bug that should be fixed.

I think we also want to avoid the \0 itself, which is what this patch
does and which matches the previous behavior. It also looks like all
other sysfs files in that subtree expose the same content format (i.e.
\n is the last character, not \0).
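
As a rough sketch of the format I mean, the following Python check
accepts content whose last character is \n and which contains no \0.
The function name sysfs_line_ok is made up for this example, and the
byte strings are the ones from the xxd output in the commit message:

```python
def sysfs_line_ok(data: bytes) -> bool:
    """Return True if data ends in a newline and contains no null byte."""
    return data.endswith(b"\n") and b"\0" not in data


# Bytes from the two xxd dumps in the commit message.
print(sysfs_line_ok(b"0-1\n"))    # previous behavior
print(sysfs_line_ok(b"0-1\n\0"))  # with the terminating null byte
```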

Thanks,
Tobias