Re: [PATCH 4/5] nvme-pci: simplify nvme_setup_irqs() via .setup_affinity callback

From: Thomas Gleixner
Date: Sun Feb 10 2019 - 11:39:35 EST


On Fri, 25 Jan 2019, Ming Lei wrote:

> Use the callback of .setup_affinity() to re-caculate number
> of queues, and build irqs affinity with help of irq_build_affinity().
>
> Then nvme_setup_irqs() gets simplified a lot.

I'm pretty sure you can achieve the same by reworking the core code without
that callback.

> + /* Fill out vectors at the beginning that don't need affinity */
> + for (curvec = 0; curvec < affd->pre_vectors; curvec++)
> + cpumask_copy(&masks[curvec].mask, cpu_possible_mask);

cpu_possible_mask is wrong. Why are you deliberately trying to make this
special? There is absolutely no reason to do so.

These interrupts are not managed and therefore the initial affinity has to
be irq_default_affinity. Setting them to cpu_possible_mask can and probably
will evade a well thought out default affinity mask, which was set to
isolate a set of cores from general purpose interrupts.

This is exactly the thing which happens with driver special stuff and which
needs to be avoided. There is nothing special about this NVME setup and
yes, I can see why the current core code is a bit tedious to work with, but
that does not justify that extra driver magic by any means.

Thanks,

tglx