Re: [PATCH hotfix 6.11] minmax: reduce egregious min/max macro expansion
From: Hans de Goede
Date: Wed Sep 11 2024 - 12:25:08 EST
Hi Lorenzo,
On 9/11/24 5:34 PM, Lorenzo Stoakes wrote:
> Avoid nested min()/max() which results in egregious macro expansion.
>
> This issue was introduced by commit 867046cc7027 ("minmax: relax check to
> allow comparison between unsigned arguments and signed constants") [2].
>
> Work has been done to address the issue of egregious min()/max() macro
> expansion in commit 22f546873149 ("minmax: improve macro expansion and type
> checking") and related, however it appears that some issues remain on more
> tightly constrained systems.
>
> Adjust a few known-bad cases of deeply nested macros to avoid doing so to
> mitigate this. Porting the patch first proposed in [1] to Linus's tree.
>
> Running an allmodconfig build using the methodology described in [2] we
> observe a 35 MiB reduction in generated code.
>
> The difference is much more significant prior to recent minmax fixes which
> were not backported. As per [1], prior to these the reduction is more like
> 200 MiB.
>
> This resolves an issue with slackware 15.0 32-bit compilation as reported
> by Richard Narron.
>
> Presumably the min/max fixups would be difficult to backport; this patch
> should be easier and fixes Richard's problem in 5.15.
>
> [0]:https://lore.kernel.org/all/b97faef60ad24922b530241c5d7c933c@xxxxxxxxxxxxxxxx/
> [1]:https://lore.kernel.org/lkml/5882b96e-1287-4390-8174-3316d39038ef@lucifer.local/
> [2]:https://lore.kernel.org/linux-mm/36aa2cad-1db1-4abf-8dd2-fb20484aabc3@lucifer.local/
>
> Reported-by: Richard Narron <richard@xxxxxxxxxx>
> Closes: https://lore.kernel.org/all/4a5321bd-b1f-1832-f0c-cea8694dc5aa@xxxxxxxxxx/
> Fixes: 867046cc7027 ("minmax: relax check to allow comparison between unsigned arguments and signed constants")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
Thank you for your patch.
I must say that I'm not a fan of this patching 3 totally
unrelated files in a single patch.
This is e.g. going to be a problem if we need to revert one of
the changes because of regressions...
So I would prefer this to be split into 3 patches.
One review comment for the atomisp bits inline / below.
> ---
> drivers/net/ethernet/marvell/mvpp2/mvpp2.h | 2 +-
> .../staging/media/atomisp/pci/sh_css_frac.h | 26 ++++++++++++++-----
> include/linux/skbuff.h | 6 ++++-
> 3 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> index e809f91c08fb..8b431f90efc3 100644
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> @@ -23,7 +23,7 @@
> /* The PacketOffset field is measured in units of 32 bytes and is 3 bits wide,
> * so the maximum offset is 7 * 32 = 224
> */
> -#define MVPP2_SKB_HEADROOM min(max(XDP_PACKET_HEADROOM, NET_SKB_PAD), 224)
> +#define MVPP2_SKB_HEADROOM clamp_t(int, XDP_PACKET_HEADROOM, NET_SKB_PAD, 224)
>
> #define MVPP2_XDP_PASS 0
> #define MVPP2_XDP_DROPPED BIT(0)
> diff --git a/drivers/staging/media/atomisp/pci/sh_css_frac.h b/drivers/staging/media/atomisp/pci/sh_css_frac.h
> index b90b5b330dfa..a973394c5bc0 100644
> --- a/drivers/staging/media/atomisp/pci/sh_css_frac.h
> +++ b/drivers/staging/media/atomisp/pci/sh_css_frac.h
> @@ -32,12 +32,24 @@
> #define uISP_VAL_MAX ((unsigned int)((1 << uISP_REG_BIT) - 1))
>
> /* a:fraction bits for 16bit precision, b:fraction bits for ISP precision */
> -#define sDIGIT_FITTING(v, a, b) \
> - min_t(int, max_t(int, (((v) >> sSHIFT) >> max(sFRACTION_BITS_FITTING(a) - (b), 0)), \
> - sISP_VAL_MIN), sISP_VAL_MAX)
> -#define uDIGIT_FITTING(v, a, b) \
> - min((unsigned int)max((unsigned)(((v) >> uSHIFT) \
> - >> max((int)(uFRACTION_BITS_FITTING(a) - (b)), 0)), \
> - uISP_VAL_MIN), uISP_VAL_MAX)
> +static inline int sDIGIT_FITTING(short v, int a, int b)
> +{
drivers/staging/media/atomisp/pci/isp/kernels/s3a/s3a_1.0/ia_css_s3a.host.c
calls this with ia_css_3a_config.af_fir1_coef / .af_fir2_coef
as the first argument. Those are of the ia_css_s0_15 type, which is:
/* Signed fixed point value, 0 integer bits, 15 fractional bits */
typedef s32 ia_css_s0_15;
So please replace the "short v" with "int v".
I think that you can then also replace clamp_t() with clamp().
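To illustrate the concern with the "short v" parameter, here is a standalone userspace sketch (not kernel code, untested against the driver). The demo_* names and the DEMO_* constants are made up for illustration; the real limits live in sh_css_frac.h. Converting an out-of-range s32 to short is implementation-defined in C; GCC and Clang wrap modulo 2^16, which is what the sketch assumes. Whether the real coefficients ever leave the 16-bit range is something I have not checked.

```c
#include <stdint.h>

/* Hypothetical limits, chosen only so the example has something to clamp to. */
#define DEMO_VAL_MIN (-(1 << 13))
#define DEMO_VAL_MAX ((1 << 13) - 1)

/* Plain integer clamp, standing in for the kernel's clamp(). */
static inline int demo_clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Parameter is a short: a 32-bit argument is narrowed before clamping. */
static inline int demo_fit_short(short v)
{
	return demo_clamp_int(v, DEMO_VAL_MIN, DEMO_VAL_MAX);
}

/* Parameter is an int: the full 32-bit value reaches the clamp. */
static inline int demo_fit_int(int v)
{
	return demo_clamp_int(v, DEMO_VAL_MIN, DEMO_VAL_MAX);
}
```

With an input such as 0x10001, the int variant saturates at DEMO_VAL_MAX, while the short variant (under the wrapping assumption above) sees only the low 16 bits and returns 1.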
> + int fit_shift = sFRACTION_BITS_FITTING(a) - b;
> +
> + v >>= sSHIFT;
> + v >>= fit_shift > 0 ? fit_shift : 0;
> +
> + return clamp_t(int, v, sISP_VAL_MIN, sISP_VAL_MAX);
> +}
> +
> +static inline unsigned int uDIGIT_FITTING(unsigned int v, int a, int b)
> +{
> + int fit_shift = uFRACTION_BITS_FITTING(a) - b;
> +
> + v >>= uSHIFT;
> + v >>= fit_shift > 0 ? fit_shift : 0;
> +
> + return clamp_t(unsigned int, v, uISP_VAL_MIN, uISP_VAL_MAX);
> +}
Regular clamp() should work here? All parameters are already
unsigned ints.
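As a rough userspace sketch of why no cast should be needed when all three operands already share a type: demo_clamp() below is a simplified stand-in I made up, not the kernel's clamp() from include/linux/minmax.h, and the DEMO_* limits are placeholders rather than the real uISP_VAL_* definitions. It relies on GNU C extensions (statement expressions, __typeof__, __builtin_types_compatible_p).

```c
/* Simplified stand-in for clamp(): rejects mismatched operand types at build time. */
#define demo_clamp(val, lo, hi) ({					\
	__typeof__(val) v_ = (val);					\
	__typeof__(lo) lo_ = (lo);					\
	__typeof__(hi) hi_ = (hi);					\
	_Static_assert(__builtin_types_compatible_p(__typeof__(v_), __typeof__(lo_)) && \
		       __builtin_types_compatible_p(__typeof__(v_), __typeof__(hi_)), \
		       "demo_clamp: mismatched types");			\
	v_ < lo_ ? lo_ : (v_ > hi_ ? hi_ : v_);				\
})

/* Placeholder limits, both explicitly unsigned int. */
#define DEMO_UVAL_MIN 0u
#define DEMO_UVAL_MAX ((unsigned int)((1 << 14) - 1))

static inline unsigned int demo_ufit(unsigned int v, int fit_shift)
{
	/* Only shift when the difference in fraction bits is positive. */
	v >>= fit_shift > 0 ? fit_shift : 0;

	/* v, lo and hi are all unsigned int, so the type check passes uncast. */
	return demo_clamp(v, DEMO_UVAL_MIN, DEMO_UVAL_MAX);
}
```

If either limit were a plain signed int constant, the static assertion in this sketch would fire, which is the situation where a _t variant would be the simpler choice.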
Regards,
Hans
>
> #endif /* __SH_CSS_FRAC_H */
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 29c3ea5b6e93..d53b296df504 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -3164,7 +3164,11 @@ static inline int pskb_network_may_pull(struct sk_buff *skb, unsigned int len)
> * NET_IP_ALIGN(2) + ethernet_header(14) + IP_header(20/40) + ports(8)
> */
> #ifndef NET_SKB_PAD
> -#define NET_SKB_PAD max(32, L1_CACHE_BYTES)
> +#if L1_CACHE_BYTES < 32
> +#define NET_SKB_PAD 32
> +#else
> +#define NET_SKB_PAD L1_CACHE_BYTES
> +#endif
> #endif
>
> int ___pskb_trim(struct sk_buff *skb, unsigned int len);
> --
> 2.46.0
>