Re: [PATCH] [v2] drm/i915: use static const array for PICK macro

From: Arnd Bergmann
Date: Tue Jan 16 2018 - 11:42:34 EST


On Mon, Dec 11, 2017 at 7:40 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> Quoting Chris Wilson (2017-12-11 12:51:42)
>> Quoting Arnd Bergmann (2017-12-11 12:46:22)
>> > v2: rebased after a1986f4174a4 ("drm/i915: Remove unnecessary PORT3 definition.")
>> > ---
>> > drivers/gpu/drm/i915/i915_reg.h | 18 +++++++++---------
>> > 1 file changed, 9 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> > index 09bf043c1c2e..36f4408503e1 100644
>> > --- a/drivers/gpu/drm/i915/i915_reg.h
>> > +++ b/drivers/gpu/drm/i915/i915_reg.h
>> > @@ -139,7 +139,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>> > return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
>> > }
>> >
>> > -#define _PICK(__index, ...) (((const u32 []){ __VA_ARGS__ })[__index])
>> > +#define _PICK(__index, ...) ({static const u32 __arr[] = { __VA_ARGS__ }; __arr[__index];})
>>
>> Is gcc smart enough for
>>   if (__builtin_constant_p(__index)) {
>>           ((const u32 []){ __VA_ARGS__ })[__index];
>>   } else {
>>           static const u32 __arr[] = { __VA_ARGS__ };
>>           __arr[__index];
>>   }
>> ?
>
> Not really, we don't have enough constants for it to make a substantial
> difference:
>
> add/remove: 1/0 grow/shrink: 3/5 up/down: 617/-604 (13)
> Function                              old     new   delta
> cnl_ddi_vswing_program.isra             -     574    +574
> bxt_ddi_phy_is_enabled                220     241     +21
> bxt_ddi_phy_set_signal_level          537     556     +19
> i9xx_get_pipe_config                 1474    1477      +3
> bxt_ddi_phy_verify_state              411     408      -3
> _bxt_ddi_phy_init                     956     950      -6
> vlv_display_power_well_init           470     461      -9
> bxt_ddi_pll_get_hw_state              774     762     -12
> cnl_ddi_vswing_sequence              1166     592    -574
> Total: Before=13461532, After=13461545, chg +0.00%
>
> Of particular note the size of __arr[] is not reduced, so gcc is already
> eliminating the static[] for constant index, or not eliminating the
> redundant branch here.

I noticed we never concluded here. Did you see anything wrong with my
workaround in the end or could we just apply it to avoid the stack
size regression?
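
For anyone picking this thread up later, here is a minimal user-space
sketch of the two forms of the macro from the patch. It is only an
illustration, not the i915 code: the u32 typedef, the PICK_COMPOUND and
PICK_STATIC names, the wrapper functions and the register offsets are
all made up for the example. Both forms should return the same value;
the difference I care about is only where the compiler may place the
array when the index is not a compile-time constant.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;	/* stand-in for the kernel's u32 (assumption) */

/*
 * Original form: index an anonymous compound literal. With a runtime
 * index the compiler may materialize the array on the stack.
 */
#define PICK_COMPOUND(__index, ...) \
	(((const u32 []){ __VA_ARGS__ })[__index])

/*
 * Patched form: a GNU statement expression around a static const
 * array, so the table can live in read-only data instead.
 */
#define PICK_STATIC(__index, ...) \
	({ static const u32 __arr[] = { __VA_ARGS__ }; __arr[__index]; })

/* Hypothetical register offsets, chosen only for the example. */
u32 pick_compound(unsigned int idx)
{
	return PICK_COMPOUND(idx, 0x1000, 0x2000, 0x3000);
}

u32 pick_static(unsigned int idx)
{
	return PICK_STATIC(idx, 0x1000, 0x2000, 0x3000);
}

/* Both variants must agree for every valid index. */
void check_picks(void)
{
	for (unsigned int i = 0; i < 3; i++)
		assert(pick_compound(i) == pick_static(i));
}
```

Note that PICK_STATIC relies on statement expressions, a GNU C
extension, so this needs gcc or clang in their default GNU mode.
Comparing the generated code for the two wrappers, for example with
-fstack-usage, is one way to see whether the stack difference shows up
with a given compiler.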

Arnd