Re: [driver-core PATCH v6 9/9] libnvdimm: Schedule device registration on node local to the device

From: Dan Williams
Date: Tue Nov 27 2018 - 15:50:52 EST


On Tue, Nov 27, 2018 at 12:33 PM Bart Van Assche <bvanassche@xxxxxxx> wrote:
>
> On Tue, 2018-11-27 at 11:34 -0800, Dan Williams wrote:
> > On Tue, Nov 27, 2018 at 10:04 AM Alexander Duyck
> > <alexander.h.duyck@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, 2018-11-26 at 18:21 -0800, Dan Williams wrote:
> > > > On Thu, Nov 8, 2018 at 10:07 AM Alexander Duyck
> > > > <alexander.h.duyck@xxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Force the device registration for nvdimm devices to be closer to the actual
> > > > > device. This is achieved by using either the NUMA node ID of the region, or
> > > > > of the parent. By doing this we can have everything above the region based
> > > > > on the region, and everything below the region based on the nvdimm bus.
> > > > >
> > > > > By guaranteeing NUMA locality I see an improvement of as high as 25% for
> > > > > per-node init of a system with 12TB of persistent memory.
> > > > >
> > > >
> > > > It seems the speed-up is achieved with just patches 1, 2, and 9 from
> > > > this series, correct? I wouldn't want to hold up that benefit while
> > > > the driver-core bits are debated.
> > >
> > > Actually patch 6 ends up impacting things for persistent memory as
> > > well. The problem is that all the async calls to add interfaces only do
> > > anything if the driver is already loaded. So there are cases such as
> > > the X86_PMEM_LEGACY_DEVICE case where the memory regions end up still
> > > being serialized because the devices are added before the driver.
> >
> > Ok, but is the patch 6 change generally useful outside of the
> > libnvdimm case? Yes, local hacks like MODULE_SOFTDEP are terrible for
> > global problems, but what I'm trying to tease out is whether this change
> > benefits other async probing subsystems outside of libnvdimm, SCSI
> > perhaps? Bart can you chime in with the benefits you see so it's clear
> > to Greg that the driver-core changes are a generic improvement?
>
> Hi Dan,
>
> For SCSI, asynchronous probing is really important because when scanning SAN
> LUNs there is plenty of potential for concurrency due to the network delay.
>
> I think the following quote provides the information you are looking for:
>
> "This patch reduces the time needed for loading the scsi_debug kernel
> module with parameters delay=0 and max_luns=256 from 0.7s to 0.1s. In
> other words, this specific test runs about seven times faster."
>
> Source: https://www.spinics.net/lists/linux-scsi/msg124457.html

Thanks Bart, so tying this back to Alex's patches, does the ordering
problem that Alex's patches solve impact the SCSI case? I'm looking
for something like "SCSI depends on asynchronous probing and without
'driver core: Establish clear order of operations for deferred probe
and remove' probing is often needlessly serialized". I.e. does it
suffer from the same platform problem that libnvdimm ran into where
its local async probing implementation was hindered by the driver
core?