Re: [PATCH 1/2] bitmap: new bitmap_copy_safe and bitmap_{from,to}_arr32
From: Yury Norov
Date: Thu Jan 04 2018 - 12:44:07 EST
Hi Andy,
Thanks for review. Comments inline.
On Sun, Dec 31, 2017 at 02:34:42PM +0200, Andy Shevchenko wrote:
> On Thu, Dec 28, 2017 at 5:00 PM, Yury Norov <ynorov@xxxxxxxxxxxxxxxxxx> wrote:
> > This patchset replaces bitmap_{to,from}_u32array with more simple
> > and standard looking copy-like functions.
> >
> > bitmap_from_u32array() takes 4 arguments (bitmap_to_u32array is similar):
> > - unsigned long *bitmap, which is destination;
> > - unsigned int nbits, the length of destination bitmap, in bits;
> > - const u32 *buf, the source; and
> > - unsigned int nwords, the length of source buffer in ints.
> >
> > In description to the function it is detailed like:
> > * copy min(nbits, 32*nwords) bits from @buf to @bitmap, remaining
> > * bits between nword and nbits in @bitmap (if any) are cleared.
> >
> > Having two size arguments looks unneeded and potentially dangerous.
>
> For the first argument, it depends what logic we would like to put behind.
> Imagine the case (and since these functions are targetting some wider
> cases) when you have not aligned bitmap (nbits % BITS_PER_LONG != 0).
>
> So, there are 2 cases, nwords > nbits / BITS_PER_U32, or nbits /
> BITS_PER_U32 > nwords.
>
> We have at least two options for the first one:
> 1) cut it to fit and return some code (or updated nbits, or ...) to
> tell this to the caller;
> 2) return an error and do nothing.
>
> For the second case one:
> 1) merge bitmaps;
> 2) fill with 0 or 1 (another parameter?) the rest of bitmap.
This is the whole point of the patch.
Kernel users doesn't need all that functionality to manage the case
nwords != nbits / BITS_PER_U32. All existing users explicitly pass
nwords and nbits matched.
Support for unmatched nwords and nbits is pretty tricky. As you
mentioned here, there are 2 cases with (at least) 2 options for each
case. Existing code doesn't take all that into account, and if we go
handle it properly, we'll end up with quite big portion of code, which
we should also cover with tests, carefully comment and maintain. And
all this will be for nothing because there's no *real* users of that
functionality.
> > It is unneeded because normally user of copy-like function should
> > take care of the size of destination and make it big enough to fit
> > source data.
> >
> > And it is dangerous because function may hide possible error if user
> > doesn't provide big enough bitmap, and data becomes silently dropped.
>
> We might return -E2BIG, for example.
>
> > That's why all copy-like functions have 1 argument for size of copying
> > data, and I don't see any reason to make bitmap_from_u32array()
> > different.
> >
> > One exception that comes in mind is strncpy() which also provides size
> > of destination in arguments, but it's strongly argued by the possibility
> > of taking broken strings in source. This is not the case of
> > bitmap_{from,to}_u32array().
> >
> > There is no many real users of bitmap_{from,to}_u32array(), and they all
> > very clearly provide size of destination matched with the size of
> > source, so additional functionality is not used in fact. Like this:
> > bitmap_from_u32array(to->link_modes.supported,
> > __ETHTOOL_LINK_MODE_MASK_NBITS,
> > link_usettings.link_modes.supported,
> > __ETHTOOL_LINK_MODE_MASK_NU32);
> > Where:
> > #define __ETHTOOL_LINK_MODE_MASK_NU32 \
> > DIV_ROUND_UP(__ETHTOOL_LINK_MODE_MASK_NBITS, 32)
>
> Consider more generic use of them.
>
> For example, we have a big enough bitmap, but would like to copy only
> few u32 items to it. Another case, we have quite big u32 array, but
> would like to copy only some of them.
> It is first what came to my mind, there are might be much more
> interesting corner cases.
>
> Will it survive?
For your examples, I think, we already have a set of suitable functions
in lib/bitmap.c. *arr32 functions are only to convert bitmaps from/to
u32 arrays with different endianness, probably taken from userspace or
some hardware.
So, for your 1st example, function may look like this:
void bitmap_insert(hugemap, offset, smallmap, nbits)
{
unsigned long first = offset / BITS_PER_LONG;
unsigned long off = offset % BITS_PER_LONG;
unsigned long last = first + (nbits + off) / BITS_PER_LONG;
unsigned long first_smallmap = hugemap[first]
& BITMAP_FIRST_WORD_MASK(offset);
unsigned long last_smallmap = hugemap[first]
& BITMAP_LAST_WORD_MASK(nbits);
bitmap_zero(tmp, nbits + off);
__bitmap_shift_right(tmp, off);
tmp[0] &= hugemap[first];
tmp[(nbits + off) / BITS_PER_LONG] &= hugemap[last];
bitmap_copy(hugemap[first], smallmap, nbits);
}
And usage:
DECLARE_BITMAP(tmp, nbits + off);
bitmap_from_arr32(tmp, arr32, nbits);
bitmap_insert(hugemap, offset, tmp, nbits)
This is pretty artificial example. If there'll be real users of
bitmap_insert, I believe implementation will be different. But
proposed API would work fine for the case. And I don't like the
idea to join functions that implement internal logic with interface
functions that do from/to conversion.
> > In this patch, bitmap_copy_safe and bitmap_{from,to}_arr32 are introduced.
> >
> > 'Safe' in bitmap_copy_safe() stands for clearing unused bits in bitmap
> > beyond last bit till the end of last word. It is useful for hardening
> > API when bitmap is assumed to be exposed to userspace.
>
> Why not setting them? How do you know it's used in positive manner and
> not negative (inverted)?
They are not used at all normally. So any value is OK except one that
represents kernel state. Consider an attack where kernel is requested
to create 1-bit-length bitmap on stack and return it to user. If we
don't clear tail 31 or 63 bits, attacker get it. I don't know any
real example, but original functions take care of it, so I do.
> Another thing, it shouldn't be different by intuition how it behaves
> against bitmap, either both magle it, or none. Or just choose another
> (better) name. _safe sounds a bit confusing to me.
I picked term from implementation of list. There 'safe' stands for
safety against entry deletion. And here it was assumed safety against
kernel data expose. Any better name is welcome.
> > bitmap_{from,to}_arr32 functions are replacements for
> > bitmap_{from,to}_u32array. They don't take unneeded nwords argument, and
> > so simpler in implementation and understanding.
>
> Again, depends what logic we would like to have.
> I can imagine the case when you have nbits defined as macro/constant
> and wouldn't like to change over the code. In your case you
> semantically substitute nbits of bitmap, by nbits of amount of data to
> copy.
>
> For me slightly cleaner to have full size of the destination buffer at
> input rather than arbitrary number which might not prevent crashes in
> some cases.
>
> Consider snprintf() as example.
In comment to the patch I mentioned strncpy() function, but it's also true
for snprintf():
One exception that comes in mind is strncpy() which also provides size
of destination in arguments, but it's strongly argued by the possibility
of taking broken strings in source. This is not the case of
bitmap_{from,to}_u32array().
> > This patch suggests optimization for 32-bit systems - aliasing
> > bitmap_{from,to}_arr32 to bitmap_copy_safe.
>
> So, you preventing *merge* by this. Not sure if it's desired.
Sorry, I don't understand it. Could you elaborate?
> > Other possible optimization is aliasing 64-bit LE bitmap_{from,to}_arr32 to
> > more generic function(s). But I didn't end up with the function that would
> > be helpful by itself, and can be used to alias 64-bit LE
> > bitmap_{from,to}_arr32, like bitmap_copy_safe() does. So I preferred to
> > leave things as is.
> >
> > The following patch switches kernel to new API and introduces test for it.
> >
> > Discussion is here.
> > https://lkml.org/lkml/2017/11/15/592
>
> Unfortunately didn't find any _discussion_ there...
> Wasn't it elsewhere?
I was going to write 'RFC' but wrote 'discussion' for some reason.
There's no much discussion. Matthew Wilcox approved the idea, and
Andrew Morton took this series into his tree, which I also assume
means approve.
Yury