RE: [PATCH v9 1/5] cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list

From: Song Bao Hua (Barry Song)
Date: Tue Aug 10 2021 - 09:25:08 EST




> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Saturday, August 7, 2021 7:39 AM
> To: 'Greg KH' <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: andriy.shevchenko@xxxxxxxxxxxxxxx; yury.norov@xxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> dave.hansen@xxxxxxxxx; linux@xxxxxxxxxxxxxxxxxx; rafael@xxxxxxxxxx;
> rdunlap@xxxxxxxxxxxxx; agordeev@xxxxxxxxxxxxx; sbrivio@xxxxxxxxxx;
> jianpeng.ma@xxxxxxxxx; valentin.schneider@xxxxxxx; peterz@xxxxxxxxxxxxx;
> bristot@xxxxxxxxxx; guodong.xu@xxxxxxxxxx; tangchengchang
> <tangchengchang@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> yangyicong <yangyicong@xxxxxxxxxx>; tim.c.chen@xxxxxxxxxxxxxxx; Linuxarm
> <linuxarm@xxxxxxxxxx>; tiantao (H) <tiantao6@xxxxxxxxxxxxx>; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>
> Subject: RE: [PATCH v9 1/5] cpumask: introduce
> cpumap_print_list/bitmask_to_buf to support large bitmask and list
>
>
>
> > -----Original Message-----
> > From: Greg KH [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
> > Sent: Saturday, August 7, 2021 1:22 AM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> > Cc: andriy.shevchenko@xxxxxxxxxxxxxxx; yury.norov@xxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> > dave.hansen@xxxxxxxxx; linux@xxxxxxxxxxxxxxxxxx; rafael@xxxxxxxxxx;
> > rdunlap@xxxxxxxxxxxxx; agordeev@xxxxxxxxxxxxx; sbrivio@xxxxxxxxxx;
> > jianpeng.ma@xxxxxxxxx; valentin.schneider@xxxxxxx; peterz@xxxxxxxxxxxxx;
> > bristot@xxxxxxxxxx; guodong.xu@xxxxxxxxxx; tangchengchang
> > <tangchengchang@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> > yangyicong <yangyicong@xxxxxxxxxx>; tim.c.chen@xxxxxxxxxxxxxxx; Linuxarm
> > <linuxarm@xxxxxxxxxx>; tiantao (H) <tiantao6@xxxxxxxxxxxxx>; Jonathan Cameron
> > <jonathan.cameron@xxxxxxxxxx>
> > Subject: Re: [PATCH v9 1/5] cpumask: introduce
> > cpumap_print_list/bitmask_to_buf to support large bitmask and list
> >
> > On Fri, Aug 06, 2021 at 11:02:47PM +1200, Barry Song wrote:
> > > From: Tian Tao <tiantao6@xxxxxxxxxxxxx>
> > >
> > > The existing cpumap_print_to_pagebuf() is used by cpu topology and other
> > > drivers to export hexadecimal bitmask and decimal list to userspace by
> > > sysfs ABI.
> > >
> > > Right now, those drivers are using a normal attribute for this kind of
> > > ABIs. A normal attribute typically has show entry as below:
> > >
> > > static ssize_t example_dev_show(struct device *dev,
> > > struct device_attribute *attr, char *buf)
> > > {
> > > ...
> > > return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
> > > }
> > > The show entry of a normal attribute has no offset and count parameters,
> > > which means the file is limited to one page.
> > >
> > > The cpumap_print_to_pagebuf() API works well for this kind of normal
> > > attribute, which has a buf parameter but no offset or count:
> > >
> > > static inline ssize_t
> > > cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
> > > {
> > > return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
> > > nr_cpu_ids);
> > > }
> > >
> > > The problem is that once we have many CPUs, the bitmask or list may
> > > exceed one page. The list in particular can be as complex as
> > > 0,3,5,7,9,... and we have no simple way to know its exact size.
> > >
> > > It turns out bin_attribute is a way to break this limit. bin_attribute
> > > has show entry as below:
> > > static ssize_t
> > > example_bin_attribute_show(struct file *filp, struct kobject *kobj,
> > > struct bin_attribute *attr, char *buf,
> > > loff_t offset, size_t count)
> > > {
> > > ...
> > > }
> > >
> > > The new offset and count parameters let the sysfs ABI support files
> > > larger than one page; for example, offset could be >= 4096.
> > >
> > > This patch introduces cpumap_print_bitmask/list_to_buf() and their bitmap
> > > infrastructure bitmap_print_bitmask/list_to_buf() so that those drivers
> > > can move to bin_attribute to support large bitmask and list. At the same
> > > time, we have to pass those corresponding parameters such as offset, count
> > > from bin_attribute to this new API.
> > >
> > > Signed-off-by: Tian Tao <tiantao6@xxxxxxxxxxxxx>
> > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > Cc: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> > > Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> > > Cc: Stefano Brivio <sbrivio@xxxxxxxxxx>
> > > Cc: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
> > > Cc: "Ma, Jianpeng" <jianpeng.ma@xxxxxxxxx>
> > > Cc: Yury Norov <yury.norov@xxxxxxxxx>
> > > Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
> > > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Cc: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > Signed-off-by: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
> > > ---
> > > include/linux/bitmap.h | 6 +++
> > > include/linux/cpumask.h | 38 +++++++++++++++
> > > lib/bitmap.c | 103 ++++++++++++++++++++++++++++++++++++++++
> > > 3 files changed, 147 insertions(+)
> > >
> > > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> > > index a36cfcec4e77..37f36dad18bd 100644
> > > --- a/include/linux/bitmap.h
> > > +++ b/include/linux/bitmap.h
> > > @@ -227,6 +227,12 @@ unsigned int bitmap_ord_to_pos(const unsigned long *bitmap, unsigned int ord, un
> > > int bitmap_print_to_pagebuf(bool list, char *buf,
> > > const unsigned long *maskp, int nmaskbits);
> > >
> > > +extern int bitmap_print_bitmask_to_buf(char *buf, const unsigned long *maskp,
> > > + int nmaskbits, loff_t off, size_t count);
> > > +
> > > +extern int bitmap_print_list_to_buf(char *buf, const unsigned long *maskp,
> > > + int nmaskbits, loff_t off, size_t count);
> > > +
> >
> > Why are you adding bitmap_print_list_to_buf() when no one uses it in
> > this patch series?
> >
> > Did I miss it somewhere?
>
> Yes. It is used in every patch except the last one from Yury,
> which only extends a comment.
>
> drivers/base/topology.c:
> +static ssize_t name##_list_read(struct file *file, struct kobject *kobj, \
> + struct bin_attribute *attr, char *buf, \
> + loff_t off, size_t count) \
> +{ \
> + struct device *dev = kobj_to_dev(kobj); \
> + \
> + return cpumap_print_list_to_buf(buf, topology_##mask(dev->id), \
> + off, count); \
> }
>
>
> drivers/base/node.c:
> +static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
> + struct bin_attribute *attr, char *buf,
> + loff_t off, size_t count)
> {
> - ...
> + cpumask_and(mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
> + n = cpumap_print_list_to_buf(buf, mask, off, count);
> + free_cpumask_var(mask);
>
> >

Correspondingly, Linux has both bitmask and list ABIs for CPUs, e.g.:

root@ubuntu:/sys/devices/system/cpu/cpu0/topology# cat core_siblings
ff
root@ubuntu:/sys/devices/system/cpu/cpu0/topology# cat core_siblings_list
0-7

and for nodes, e.g.:
root@ubuntu:/sys/devices/system/node/node0# cat cpumap
ff
root@ubuntu:/sys/devices/system/node/node0# cat cpulist
0-7
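
To illustrate why the list variant needs offset and count just like the
bitmask one, here is a minimal userspace sketch of the read semantics. It is
not the kernel implementation: format_list() and print_list_to_buf() are
hypothetical stand-ins, the mask is a single 64-bit word rather than a real
cpumask, and the fixed 512-byte temporary buffer is an assumption that only
holds for such a small mask. The idea is to format the whole list once and
then hand back only the slice [off, off + count) that the reader asked for.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for the kernel's list formatting: writes the set
 * bits of a 64-bit mask as ranges ("0-7", "0,3,5,7") plus a trailing
 * newline. The caller must pass a buffer large enough for the result.
 */
static size_t format_list(char *out, size_t size, unsigned long long mask)
{
	size_t len = 0;
	int bit = 0;

	while (bit < 64) {
		int start, end;

		if (!((mask >> bit) & 1)) {
			bit++;
			continue;
		}
		/* Extend the run while consecutive bits are set. */
		start = bit;
		while (bit < 64 && ((mask >> bit) & 1))
			bit++;
		end = bit - 1;

		len += (size_t)snprintf(out + len, size - len, "%s%d",
					len ? "," : "", start);
		assert(len < size);
		if (end > start) {
			len += (size_t)snprintf(out + len, size - len,
						"-%d", end);
			assert(len < size);
		}
	}
	len += (size_t)snprintf(out + len, size - len, "\n");
	assert(len < size);
	return len;
}

/*
 * Copies at most count bytes of the formatted list, starting at byte
 * offset off, into buf. Returns the number of bytes copied, 0 when off
 * is at or past the end of the text, or -1 for a negative offset.
 */
static ssize_t print_list_to_buf(char *buf, unsigned long long mask,
				 off_t off, size_t count)
{
	char tmp[512];	/* assumption: ample for a 64-bit mask */
	size_t len = format_list(tmp, sizeof(tmp), mask);

	if (off < 0)
		return -1;
	if ((size_t)off >= len)
		return 0;
	if (count > len - (size_t)off)
		count = len - (size_t)off;
	memcpy(buf, tmp + off, count);
	return (ssize_t)count;
}
```

With mask 0xff the full text is "0-7" followed by a newline, so a reader
that calls print_list_to_buf() repeatedly with a growing off gets the file
in pieces and sees 0 once off reaches 4. The same loop works unchanged when
the text is longer than one page, which is the case the new API is for.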


> > thanks,
> >
> > greg k-h
>

Thanks
Barry