Re: [PATCH 4/5] nvme-pci: simplify nvme_setup_irqs() via .setup_affinity callback

From: Ming Lei
Date: Sun Feb 10 2019 - 22:58:28 EST


On Sun, Feb 10, 2019 at 05:39:20PM +0100, Thomas Gleixner wrote:
> On Fri, 25 Jan 2019, Ming Lei wrote:
>
> > Use the callback of .setup_affinity() to re-caculate number
> > of queues, and build irqs affinity with help of irq_build_affinity().
> >
> > Then nvme_setup_irqs() gets simplified a lot.
>
> I'm pretty sure you can achieve the same by reworking the core code without
> that callback.

Could you share the idea a bit? As I mentioned, the re-distribution
needs driver's knowledge.

>
> > + /* Fill out vectors at the beginning that don't need affinity */
> > + for (curvec = 0; curvec < affd->pre_vectors; curvec++)
> > + cpumask_copy(&masks[curvec].mask, cpu_possible_mask);
>
> cpu_possible_mask is wrong. Why are you deliberately trying to make this
> special? There is absolutely no reason to do so.

It is just for avoiding to export 'irq_default_affinity'.

>
> These interrupts are not managed and therefore the initial affinity has to
> be irq_default_affinity. Setting them to cpu_possible_mask can and probably
> will evade a well thought out default affinity mask, which was set to
> isolate a set of cores from general purpose interrupts.
>
> This is exactly the thing which happens with driver special stuff and which
> needs to be avoided. There is nothing special about this NVME setup and
> yes, I can see why the current core code is a bit tedious to work with, but
> that does not justify that extra driver magic by any means.

OK, thanks for your explanation.

Thanks,
Ming