Re: [PATCH 0/2] x86: Speed up ioremap operations

From: Mike Travis
Date: Wed Aug 27 2014 - 19:31:04 EST




On 8/27/2014 4:20 PM, Andrew Morton wrote:
> On Wed, 27 Aug 2014 16:15:28 -0700 Mike Travis <travis@xxxxxxx> wrote:
>
>>
>>>
>>>> There are two causes for requiring a restart/reload of the drivers.
>>>> First is periodic preventive maintenance (PM) and the second is if
>>>> any of the devices experience a fatal error. Both of these trigger
>>>> this excessively long delay in bringing the system back up to full
>>>> capability.
>>>>
>>>> The problem was tracked down to a very slow IOREMAP operation and
>>>> the excessively long ioresource lookup to ensure that the user is
>>>> not attempting to ioremap RAM. These patches provide a speed up
>>>> to that function.
>>>
>>> With what result?
>>>
>>
>> Early measurements on our in-house lab system (with far fewer cpus
>> and less memory) show about a 60-75% increase. They have 31 devices,
>> 3000+ cpus, 10+TB of memory. We have 20 devices, 480 cpus, ~2TB of
>> memory. I expect their ioresource list to be about 5-10 times longer.
>> [But their system is in production so we have to wait for the next
>> scheduled PM interval before a live test can be done.]
>
> So you expect 1+ hours? That's still nuts.
>

Actually, I expect a much better improvement. We are removing passes
through the I/O resource list, and the longer the list, the longer
each complete pass takes. As mentioned, for a 128M I/O BAR region
that is 32 passes, so we are removing 31 of them. Removing 31 passes
over a list 5-10 times longer should give a much better overall
improvement in the ioremap time. The startup time of the devices
will still be there, though we are encouraging the vendor to look
at starting them up in parallel.
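
As a rough sketch of that arithmetic, assuming the lookup cost is simply
(number of passes) x (entries in the I/O resource list): the 32 and 1
pass counts come from the text above, while LAB_LIST_LEN and the helper
names are hypothetical placeholders, not measured values or kernel
symbols.

```python
# Toy cost model only; it does not reflect the actual kernel code paths.
PASSES_BEFORE = 32    # from the message: passes for a 128M I/O BAR region
PASSES_AFTER = 1      # 31 of the 32 passes removed
LAB_LIST_LEN = 1000   # hypothetical list length for the lab system


def walk_cost(passes, list_len):
    """Entries visited when the whole list is walked `passes` times."""
    return passes * list_len


def entries_saved(list_len):
    """Entry visits removed per ioremap of such a region."""
    before = walk_cost(PASSES_BEFORE, list_len)
    after = walk_cost(PASSES_AFTER, list_len)
    return before - after


# The production list is expected to be about 5-10 times longer.
for scale in (1, 5, 10):
    saved = entries_saved(LAB_LIST_LEN * scale)
    print(f"{scale:>2}x list length: {saved} entry visits saved")
```

Under this model the saving per ioremap grows linearly with the list
length, which is why a longer production list should benefit more than
the lab system does.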