Re: [PATCH 3/6] irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack

From: Marc Zyngier
Date: Sat Apr 13 2024 - 07:11:31 EST


On Sat, 13 Apr 2024 11:29:20 +0100,
Dawei Li <dawei.li@xxxxxxxxxxxx> wrote:
>
> Hi Marc,
>
> Thanks for the review.
>
> On Fri, Apr 12, 2024 at 02:53:32PM +0100, Marc Zyngier wrote:
> > On Fri, 12 Apr 2024 11:58:36 +0100,
> > Dawei Li <dawei.li@xxxxxxxxxxxx> wrote:
> > >
> > > In general it's preferable to avoid placing cpumasks on the stack, as
> > > for large values of NR_CPUS these can consume significant amounts of
> > > stack space and make stack overflows more likely.
> > >
> > > Remove cpumask var on stack and use proper cpumask API to address it.
> >
> > Define proper. Or better, define what is "improper" about the current
> > usage.
>
> Sorry for the confusion.
>
> I didn't mean that the current implementation is 'improper'; both
> implementations use equivalent APIs. I will remove this
> misleading expression from the commit message.
>
> >
> > >
> > > Signed-off-by: Dawei Li <dawei.li@xxxxxxxxxxxx>
> > > ---
> > > drivers/irqchip/irq-gic-v3-its.c | 9 ++++++---
> > > 1 file changed, 6 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > > index fca888b36680..a821396c4261 100644
> > > --- a/drivers/irqchip/irq-gic-v3-its.c
> > > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > > @@ -3826,7 +3826,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > > bool force)
> > > {
> > > struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> > > - struct cpumask common, *table_mask;
> > > + struct cpumask *table_mask;
> > > unsigned long flags;
> > > int from, cpu;
> > >
> > > @@ -3850,8 +3850,11 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > > * If we are offered another CPU in the same GICv4.1 ITS
> > > * affinity, pick this one. Otherwise, any CPU will do.
> > > */
> > > - if (table_mask && cpumask_and(&common, mask_val, table_mask))
> > > - cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
> > > + if (table_mask && cpumask_intersects(mask_val, table_mask)) {
> > > + cpu = cpumask_test_cpu(from, mask_val) &&
> > > + cpumask_test_cpu(from, table_mask) ?
> > > + from : cpumask_first_and(mask_val, table_mask);
> >
> > So we may end up computing the AND of the two bitmaps twice (once for
> > cpumask_intersects(), once for cpumask_first_and()), instead of only
> > doing it once.
>
> Actually maybe it's possible to merge these 2 bitmap ops into one:
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index fca888b36680..7a267777bd0b 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3826,7 +3826,8 @@ static int its_vpe_set_affinity(struct irq_data *d,
> bool force)
> {
> struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> - struct cpumask common, *table_mask;
> + struct cpumask *table_mask;
> + unsigned int common;
> unsigned long flags;
> int from, cpu;
>
> @@ -3850,10 +3851,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
> * If we are offered another CPU in the same GICv4.1 ITS
> * affinity, pick this one. Otherwise, any CPU will do.
> */
> - if (table_mask && cpumask_and(&common, mask_val, table_mask))
> - cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
> - else
> + if (table_mask && (common = cpumask_first_and(mask_val, table_mask)) < nr_cpu_ids) {
> + cpu = cpumask_test_cpu(from, mask_val) &&
> + cpumask_test_cpu(from, table_mask) ?
> + from : common;
> + } else {
> cpu = cpumask_first(mask_val);
> + }
>
> >
> > I don't expect that to be horrible, but I also note that you don't
> > even talk about the trade-offs you are choosing to make.
>
> With change above, I assume that the tradeoff is minor and can be ignored?

Yup, this works. My preference would be something which I find
slightly more readable though (avoiding assignment in the
conditional):

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index fca888b36680..299dafc7c0ea 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3826,9 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
 				bool force)
 {
 	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
-	struct cpumask common, *table_mask;
+	struct cpumask *table_mask;
 	unsigned long flags;
-	int from, cpu;
+	int from, cpu = nr_cpu_ids;

 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
@@ -3850,10 +3850,15 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * If we are offered another CPU in the same GICv4.1 ITS
 	 * affinity, pick this one. Otherwise, any CPU will do.
 	 */
-	if (table_mask && cpumask_and(&common, mask_val, table_mask))
-		cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
-	else
+	if (table_mask)
+		cpu = cpumask_any_and(mask_val, table_mask);
+	if (cpu < nr_cpu_ids) {
+		if (cpumask_test_cpu(from, mask_val) &&
+		    cpumask_test_cpu(from, table_mask))
+			cpu = from;
+	} else {
 		cpu = cpumask_first(mask_val);
+	}

 	if (from == cpu)
 		goto out;
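
For anyone who wants to poke at the selection order outside the kernel,
here is a rough user-space model of the logic in the diff above. It is
only a sketch: the model_* names, MODEL_NR_CPU_IDS and the single 64-bit
word standing in for a cpumask are made up for illustration and are not
kernel APIs. The point it shows is that the AND is computed once and no
mask-sized temporary is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; not the kernel's cpumask API. */
#define MODEL_NR_CPU_IDS 64u

typedef uint64_t model_mask_t;	/* bit n set means CPU n is in the mask */

/* Index of the first CPU set in both masks, or MODEL_NR_CPU_IDS if none. */
static unsigned int model_first_and(model_mask_t a, model_mask_t b)
{
	model_mask_t both = a & b;
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPU_IDS; cpu++)
		if (both & ((model_mask_t)1 << cpu))
			return cpu;
	return MODEL_NR_CPU_IDS;
}

/* Non-zero if 'cpu' is set in 'm'. */
static int model_test_cpu(unsigned int cpu, model_mask_t m)
{
	assert(cpu < MODEL_NR_CPU_IDS);
	return (int)((m >> cpu) & 1);
}

/*
 * Same decision order as the diff above: prefer a CPU that is both
 * allowed and in the ITS affinity (keeping 'from' if it qualifies),
 * otherwise fall back to the first allowed CPU.
 * table_mask may be NULL, as in the driver.
 */
static unsigned int model_pick_cpu(unsigned int from, model_mask_t mask_val,
				   const model_mask_t *table_mask)
{
	unsigned int cpu = MODEL_NR_CPU_IDS;

	if (table_mask)
		cpu = model_first_and(mask_val, *table_mask);
	if (cpu < MODEL_NR_CPU_IDS) {
		if (model_test_cpu(from, mask_val) &&
		    model_test_cpu(from, *table_mask))
			cpu = from;
	} else {
		/* No table, or no overlap: any allowed CPU will do. */
		cpu = model_first_and(mask_val, mask_val);
	}
	return cpu;
}
```

With a table mask of CPUs {2,3} and an allowed mask of {1,2,3}, the
model keeps 'from' when it is 2 or 3 and picks CPU 2 otherwise. With no
overlap, or with a NULL table mask, it falls back to the first allowed
CPU.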

Thanks,

M.

--
Without deviation from the norm, progress is not possible.