RE: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list

From: Song Bao Hua (Barry Song)
Date: Thu Jul 15 2021 - 17:08:52 EST




> -----Original Message-----
> From: Yury Norov [mailto:yury.norov@xxxxxxxxx]
> Sent: Friday, July 16, 2021 3:29 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> andriy.shevchenko@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> dave.hansen@xxxxxxxxx; linux@xxxxxxxxxxxxxxxxxx; rafael@xxxxxxxxxx;
> rdunlap@xxxxxxxxxxxxx; agordeev@xxxxxxxxxxxxx; sbrivio@xxxxxxxxxx;
> jianpeng.ma@xxxxxxxxx; valentin.schneider@xxxxxxx; peterz@xxxxxxxxxxxxx;
> bristot@xxxxxxxxxx; guodong.xu@xxxxxxxxxx; tangchengchang
> <tangchengchang@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> yangyicong <yangyicong@xxxxxxxxxx>; tim.c.chen@xxxxxxxxxxxxxxx; Linuxarm
> <linuxarm@xxxxxxxxxx>; tiantao (H) <tiantao6@xxxxxxxxxxxxx>
> Subject: Re: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support
> large bitmask and list
>
> On Thu, Jul 15, 2021 at 11:58:53PM +1200, Barry Song wrote:
> >
> > From: Tian Tao <tiantao6@xxxxxxxxxxxxx>
>
> [...]
>
> > +int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
> > + int nmaskbits, loff_t off, size_t count)
> > +{
> > + const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> > + ssize_t size;
> > + void *data;
> > +
> > + data = kasprintf(GFP_KERNEL, fmt, nmaskbits, maskp);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + size = memory_read_from_buffer(buf, count, &off, data, strlen(data) + 1);
> > + kfree(data);
> > +
> > + return size;
> > +}
> > +EXPORT_SYMBOL(bitmap_print_to_buf);
>
> In discussion to v4 of this series I've pointed out inefficiency of
> this approach. Now it's v7, but the problem is still there.
>
> 1. You make user of your API guess aboout proper @count without any
> hint. This is worse than how it works now with pure vsnprintf().

This isn't true. The count comes from the sysfs bin_attribute read path,
and sysfs guarantees that the count passed to the show() entry is valid
and within the bounds of the buffer. Otherwise, sysfs bin_attribute would
be totally broken for all of its users.
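
To make the offset/count handling concrete, here is a minimal userspace
sketch of the windowed copy that memory_read_from_buffer() performs in
the patch above. It is only a stand-in under stated assumptions:
window_read() is a hypothetical name, it returns -1 where the kernel
returns -EINVAL, and the caller is assumed to have already ensured that
count fits the destination buffer, which is the part sysfs is
responsible for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for memory_read_from_buffer():
 * copy at most `count` bytes of `from` (holding `available` bytes),
 * starting at *ppos, into `to`, then advance *ppos.
 * Assumption: the caller guarantees `to` has room for `count` bytes.
 */
static long window_read(void *to, size_t count, long long *ppos,
                        const void *from, size_t available)
{
    long long pos = *ppos;

    assert(to != NULL && from != NULL);

    if (pos < 0)
        return -1;      /* the kernel helper returns -EINVAL here */
    if ((size_t)pos >= available)
        return 0;       /* offset is past the end: nothing to read */
    if (count > available - (size_t)pos)
        count = available - (size_t)pos;   /* clamp to remaining data */

    memcpy(to, (const char *)from + pos, count);
    *ppos = pos + (long long)count;
    return (long)count;
}
```

With a 9-byte string such as "0-3,8-11\n" and an 8-byte buffer, the
first call copies 8 bytes, the second copies the remaining 1 byte, and
the third returns 0. The patch passes strlen(data) + 1 as the available
size, so there the trailing NUL is part of the readable data as well.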

> 2. For big bitmaps and small @count, your code will make enormous
> amount of unneeded work. For example, if the full length of string
> representation of bitmap is 1M, and length of the output buffer is
> 1k, one will have to call bitmap_print_to_buf() 1000 times. With
> current design it assumes that you allocate the full amount of memory
> 1000 times, free it 1000 times and print huge bitmap 1000 times to
> just return small part of it.

This isn't true either. Nobody actually has a cpumap anywhere near 1MB.
The current kernel has used a 4KB buffer for a long time, and no machine
has really complained that it is not enough. So I would expect the real
case to be a single call for the majority of machines, and perhaps two
calls for some machines which we haven't seen yet.
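
As a rough back-of-the-envelope check of the "once or twice" estimate,
the sketch below computes the length of the hex mask string and the
number of page-sized reads it would take. The assumptions are mine and
are marked in the code: "%*pb" is taken to print hex digits in 32-bit
groups separated by commas, the page size is taken as 4096 bytes, and
the trailing NUL is ignored. hex_mask_len() and reads_needed() are
hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4KB pages */

/*
 * Length of the "%*pb\n" output for nbits bits, assuming hex digits
 * printed in 32-bit groups separated by commas, plus the newline.
 */
static size_t hex_mask_len(unsigned int nbits)
{
    size_t digits, groups;

    assert(nbits > 0);
    digits = ((size_t)nbits + 3) / 4;    /* one hex digit per 4 bits */
    groups = ((size_t)nbits + 31) / 32;  /* comma between 32-bit groups */
    return digits + (groups - 1) + 1;    /* digits + commas + '\n' */
}

/* Number of show() calls needed if each call returns at most `chunk` bytes. */
static size_t reads_needed(size_t len, size_t chunk)
{
    assert(chunk > 0);
    return (len + chunk - 1) / chunk;
}
```

Under these assumptions a mask of 8192 bits is 2304 bytes long and fits
in one page-sized read, while 16384 bits gives 4608 bytes and needs two.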

>
> NAK for this, and please stop submitting wrong approach again and
> again.

I have always answered your emails and explained at length, but you
totally ignored my explanation and didn't even reply to it in v5 and
v6. That seems quite unfair.

Consider a driver which has M cpus and N different topology entries,
with a show entry like this:

example_bin_attribute_show(struct file *filp, struct kobject *kobj,
struct bin_attribute *attr, char *buf,
loff_t offset, size_t count)
{
...
}

Suppose what you say is true: the buffer is as big as 1MB and this
show() is called 1000 times with different offsets. How would the code
work by reusing a buffer allocated in advance? Like below?

example_bin_attribute_show(struct file *filp, struct kobject *kobj,
struct bin_attribute *attr, char *buf,
loff_t offset, size_t count)
{
//1st call:
char *bitmap_buf = bitmap_buffer_allocate(....);
save bitmap_buf to somewhere?

//2nd to 999th call:
reuse the bitmap_buf?

//1000th call:
free the bitmap_buf?
}

Or like below?

char *global_bitmap_buf = bitmap_buffer_allocate(....)?

example_bin_attribute_show(struct file *filp, struct kobject *kobj,
struct bin_attribute *attr, char *buf,
loff_t offset, size_t count)
{
//1st to 1000th call?
reuse the global_bitmap_buf?
}

Neither of the above ways looks good to me. The 1st one is not doable
at all. The 2nd one is not doable either: the M*N entries will have
different sizes, and a buffer kept around for re-use seems to waste
memory.

I'd appreciate it if you could post some pseudo code so that v8 can
make an actual difference and go the way you prefer.
On the other hand, we would also need Greg's Ack on the driver changes
which might follow.

So please post some pseudo code rather than simply replying with a NAK
and ignoring my long explanation.

Thanks
Barry