Re: [PATCH] netfilter: xt_hashlimit: avoid 64-bit division
From: Pablo Neira Ayuso
Date: Thu Sep 07 2017 - 06:20:05 EST
On Wed, Sep 06, 2017 at 10:48:22PM +0200, Arnd Bergmann wrote:
> On Wed, Sep 6, 2017 at 10:22 PM, Vishwanath Pai <vpai@xxxxxxxxxx> wrote:
> > On 09/06/2017 03:57 PM, Arnd Bergmann wrote:
> >> 64-bit division is expensive on 32-bit architectures, and
> >> requires a special function call to avoid a link error like:
> >>
> >> net/netfilter/xt_hashlimit.o: In function `hashlimit_mt_common':
> >> xt_hashlimit.c:(.text+0x1328): undefined reference to `__aeabi_uldivmod'
> >>
> >> In the case of hashlimit_mt_common, we don't actually need a
> >> 64-bit operation, we can simply rewrite the function slightly
> >> to make that clear to the compiler.
> >>
> >> Fixes: bea74641e378 ("netfilter: xt_hashlimit: add rate match mode")
> >> Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>
> >> ---
> >> net/netfilter/xt_hashlimit.c | 5 ++++-
> >> 1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
> >> index 10d48234f5f4..50b53d86eef5 100644
> >> --- a/net/netfilter/xt_hashlimit.c
> >> +++ b/net/netfilter/xt_hashlimit.c
> >> @@ -531,7 +531,10 @@ static u64 user2rate_bytes(u64 user)
> >> {
> >> u64 r;
> >>
> >> - r = user ? 0xFFFFFFFFULL / user : 0xFFFFFFFFULL;
> >> + if (user > 0xFFFFFFFFULL)
> >> + return 0;
> >> +
> >> + r = user ? 0xFFFFFFFFULL / (u32)user : 0xFFFFFFFFULL;
> >> r = (r - 1) << 4;
> >> return r;
> >> }
> >>
> >
> > I have submitted another patch to fix this:
> > https://patchwork.ozlabs.org/patch/809881/
> >
> > We have seen this problem before, I was careful not to introduce this
> > again in the new patch but clearly I overlooked this particular line :(
> >
> > In the other cases we fixed it by replacing division with div64_u64().
>
> div64_u64() seems needlessly expensive here since the dividend
> is known to be a 32-bit number. I guess the function is not called
> frequently though, so it doesn't matter much.
This is called from the packet path, only for the first packet for
each new destination IP entry in the hashtable, still from the
datapath. So if we can take something faster (for 32 bit arches) that
is correct, I think it's sensible to take.
Let me know in any case.