Re: [PATCH] drivers/base/node.c: fix userspace break from using bin_attributes for cpumap and cpulist

From: Greg Kroah-Hartman
Date: Wed Jul 13 2022 - 09:06:04 EST


On Wed, Jul 13, 2022 at 07:47:58AM -0400, Phil Auld wrote:
> Hi Greg,
>
> On Wed, Jul 13, 2022 at 08:06:02AM +0200 Greg Kroah-Hartman wrote:
> > On Tue, Jul 12, 2022 at 05:43:01PM -0400, Phil Auld wrote:
> > > Using bin_attributes with a 0 size causes fstat and friends to return that 0 size.
> > > This breaks userspace code that retrieves the size before reading the file. Rather
> > > than reverting 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size
> > > limitation of cpumap ABI") let's put in a size value at compile time. Use direct
> > > comparison and a worst-case maximum to ensure compile time constants. For cpulist the
> > > max is on the order of NR_CPUS * (ceil(log10(NR_CPUS)) + 1) which for 8192 is 40960.
> > > In order to get near that you'd need a system with every other CPU on one node or
> > > something similar. e.g. (0,2,4,... 1024,1026...). We set it to a min of PAGE_SIZE
> > > to retain the older behavior. For cpumap, PAGE_SIZE is plenty big.
> >
> > Does userspace care about that size, or can we just put any value in
> > there and it will be ok? How about just returning to the original
> > PAGE_SIZE value to keep things looking identical, will userspace not
> > read more than that size from the file then?
> >
>
> I'll go look. But I think the point of pre-reading the size with fstat is to allocate
> a buffer to read into. So that may be a problem.
>
> That said, I believe in this case it's the cpulist file which given the use of ranges
> is very unlikely to actually get that big.
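
For illustration, the userspace pattern described above (size the buffer from fstat, then read) might look roughly like the first function below. This is only a sketch, not code from any particular tool; both function names are hypothetical. With st_size reported as 0 the first function returns an empty string, while the second, which ignores st_size and reads until EOF, is not affected.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fragile pattern (hypothetical helper): trust st_size to size the buffer
 * and issue a single read(). If the file reports st_size == 0, this asks
 * for 0 bytes and returns an empty string.
 * Returns a NUL-terminated malloc'd buffer, or NULL on error.
 */
char *read_sized_by_fstat(const char *path, size_t *len_out)
{
	struct stat st;
	char *buf;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return NULL;
	}
	buf = malloc((size_t)st.st_size + 1);
	if (!buf) {
		close(fd);
		return NULL;
	}
	n = read(fd, buf, (size_t)st.st_size);
	close(fd);
	if (n < 0) {
		free(buf);
		return NULL;
	}
	buf[n] = '\0';
	*len_out = (size_t)n;
	return buf;
}

/*
 * More robust pattern (hypothetical helper): ignore st_size, grow the
 * buffer as needed, and stop only when read() returns 0 (EOF).
 */
char *read_until_eof(const char *path, size_t *len_out)
{
	size_t cap = 4096, len = 0;
	char *buf, *tmp;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return NULL;
	buf = malloc(cap);
	if (!buf) {
		close(fd);
		return NULL;
	}
	for (;;) {
		if (len + 1 >= cap) {
			/* Keep one byte spare for the trailing NUL. */
			cap *= 2;
			tmp = realloc(buf, cap);
			if (!tmp) {
				free(buf);
				close(fd);
				return NULL;
			}
			buf = tmp;
		}
		n = read(fd, buf + len, cap - len - 1);
		if (n < 0) {
			free(buf);
			close(fd);
			return NULL;
		}
		if (n == 0)
			break;
		len += (size_t)n;
	}
	close(fd);
	buf[len] = '\0';
	*len_out = len;
	return buf;
}
```
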

That is why we had to change this to a binary file. Think about
every-other CPU being there, that's a huge list. This already was
broken on some systems which is why it had to be changed (i.e. we didn't
change it for no reason at all.)

> > > On an 80 cpu 4-node system (NR_CPUS == 8192)
> >
> > We have systems running Linux with many more cpus than that, and your
> > company knows this :)
>
> The 80 cpus here don't matter and we only build with NR_CPUS = 8192 :)
>
> But yes, I realize now that the cpumap part I posted is broken for larger
> NR_CPUS. I originally had it as NR_CPUS, but as I said in my reply to Barry,
> it wants to be ~= NR_CPUS/4 + NR_CPUS/32. I'll change that.
>
> I think we should decide on a max for each and use that.

Sure, pick a max size please, that's fine with me.
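
For reference, the arithmetic behind the two candidate maxima discussed in this thread can be sketched as below. This is a rough userspace illustration, not the patch itself; all function names are hypothetical. The cpumap estimate assumes that NR_CPUS/4 counts hex digits and NR_CPUS/32 counts the commas between 32-bit groups, which is one reading of the figure quoted above.

```c
#include <stddef.h>

/* Number of decimal digits needed to print n. */
static size_t decimal_digits(unsigned int n)
{
	size_t digits = 1;

	while (n >= 10) {
		n /= 10;
		digits++;
	}
	return digits;
}

/*
 * Worst-case cpulist bound from the patch description:
 *   NR_CPUS * (ceil(log10(NR_CPUS)) + 1)
 * For nr_cpus >= 2, ceil(log10(nr_cpus)) equals the number of digits in
 * nr_cpus - 1, so no floating point is needed. Assumes nr_cpus >= 2.
 */
size_t cpulist_bound(unsigned int nr_cpus)
{
	return (size_t)nr_cpus * (decimal_digits(nr_cpus - 1) + 1);
}

/*
 * Exact length of the "every other CPU" list "0,2,4,...\n" for the even
 * CPU numbers below nr_cpus: each entry costs its digits plus one
 * separator (a comma, or the final newline).
 */
size_t cpulist_every_other_len(unsigned int nr_cpus)
{
	size_t len = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu += 2)
		len += decimal_digits(cpu) + 1;
	return len;
}

/*
 * cpumap estimate quoted in the thread: NR_CPUS/4 + NR_CPUS/32.
 * Assumption: hex digits plus the commas between 32-bit groups.
 */
size_t cpumap_estimate(unsigned int nr_cpus)
{
	return nr_cpus / 4 + nr_cpus / 32;
}
```

With nr_cpus = 8192, cpulist_bound() gives the 40960 quoted in the patch description, cpulist_every_other_len() gives 19925, and cpumap_estimate() gives 2304, so the every-other-CPU case already exceeds a 4096-byte page while staying under the proposed bound.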

greg k-h