Re: [PATCH] Parallel microcode update in Linux

From: Mihai Carabas
Date: Mon Sep 02 2019 - 03:27:33 EST




> On 1 Sep 2019, at 20:25, Pavel Machek <pavel@xxxxxx> wrote:
>
> Hi!
>
>> + u64 p0, p1;
>> int ret;
>>
>> atomic_set(&late_cpus_in, 0);
>> atomic_set(&late_cpus_out, 0);
>>
>> + p0 = rdtsc_ordered();
>> +
>> ret = stop_machine_cpuslocked(__reload_late, NULL, cpu_online_mask);
>> +
>> + p1 = rdtsc_ordered();
>> +
>> if (ret > 0)
>> microcode_check();
>>
>> pr_info("Reload completed, microcode revision: 0x%x\n", boot_cpu_data.microcode);
>>
>> + pr_info("p0: %llu, p1: %llu, diff: %llu\n", p0, p1, p1 - p0);
>> +
>> return ret;
>> }
>>
>> We have used a machine with a broken microcode in BIOS and no microcode in
>> initramfs (to bypass early loading).
>>
>> Here are the results for parallel loading (we made two measurements):
>
>> [ 18.197760] microcode: updated to revision 0x200005e, date = 2019-04-02
>> [ 18.201225] x86/CPU: CPU features have changed after loading microcode, but might not take effect.
>> [ 18.201230] microcode: Reload completed, microcode revision: 0x200005e
>> [ 18.201232] microcode: p0: 118138123843052, p1: 118138153732656, diff: 29889604
>
>> Here are the results of serial loading:
>>
>> [ 17.542518] microcode: updated to revision 0x200005e, date = 2019-04-02
>> [ 17.898365] x86/CPU: CPU features have changed after loading microcode, but might not take effect.
>> [ 17.898370] microcode: Reload completed, microcode revision: 0x200005e
>> [ 17.898372] microcode: p0: 149220216047388, p1: 149221058945422, diff: 842898034
>>
>> One can see that the difference is more than an order of magnitude.
>
> Well, that's impressive, but it seems to finish 300 msec later? Where does that difference
> come from / how much real time do you gain by this?

The difference comes from the large number of cores/threads the machine has: 72 in this case, but there are machines with more. As the commit message says, the microcode was initially applied serially, one CPU at a time; it is now applied in parallel on all cores.

300ms may seem like nothing, but it is enough to disrupt some critical services (e.g. storage): for those 300ms nothing else executes on any CPU. The stall also grows when the machine is fully loaded with guests.
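
For reference, the ratio between the two "diff" values quoted above can be checked with a small user-space helper. This is only a sketch: the function names are made up for illustration and are not part of the patch, and the TSC frequency has to be supplied by the caller because it is not stated in this thread.

```c
#include <stdint.h>

/*
 * Ratio of the serial TSC delta to the parallel TSC delta.
 * Returns 0.0 if the parallel delta is zero, to avoid dividing by zero.
 */
double tsc_speedup(uint64_t serial_cycles, uint64_t parallel_cycles)
{
	if (parallel_cycles == 0)
		return 0.0;
	return (double)serial_cycles / (double)parallel_cycles;
}

/*
 * Convert a TSC delta to milliseconds.
 * tsc_hz is an assumption provided by the caller (TSC ticks per second);
 * returns 0.0 if it is not a positive value.
 */
double tsc_cycles_to_ms(uint64_t cycles, double tsc_hz)
{
	if (tsc_hz <= 0.0)
		return 0.0;
	return (double)cycles * 1000.0 / tsc_hz;
}
```

With the values from the logs, tsc_speedup(842898034, 29889604) comes out at about 28, i.e. the serial reload takes roughly 28 times as many TSC cycles as the parallel one.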

Thanks,
Mihai

>
> Best regards,
> Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html