Re: [PATCH] ALSA: hda: Request driver probe from an async task

From: Takashi Iwai
Date: Tue Apr 24 2018 - 08:15:19 EST


On Tue, 24 Apr 2018 13:59:58 +0200,
Paul Menzel wrote:
>
> Dear Takashi,
>
>
> On 04/23/18 14:33, Takashi Iwai wrote:
> > On Mon, 23 Apr 2018 14:30:36 +0200, Paul Menzel wrote:
>
> >> On 04/23/18 14:21, Takashi Iwai wrote:
> >>> On Mon, 23 Apr 2018 14:05:52 +0200, Paul Menzel wrote:
> >>>>
> >>>> From: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
> >>>> Date: Sat, 24 Mar 2018 09:28:43 +0100
> >>>>
> >>>> On an ASRock E350M1, with Linux 4.17-rc1 according to `initcall_debug`
> >>>> calling `azx_driver_init` takes sometimes more than a few milliseconds,
> >>>> and up to 200 ms.
> >>>>
> >>>> ```
> >>>> [ 2.892598] calling azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 218
> >>>> [ 2.943002] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel]
> >>>> returned 0 after 49195 usecs
> >>>> ```
> >>>>
> >>>> Trying to execute the Linux kernel in less than 500 ms, this is quite a
> >>>> hold-up, and therefore request the probe from an async task.
> >>>>
> >>>> With this change, the test shows, that the function returns earlier.
> >>>>
> >>>> ```
> >>>> [ 3.254800] calling azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 227
> >>>> [ 3.254887] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel]
> >>>> returned 0 after 66 usecs
> >>>> ```
> >>>>
> >>>> The same behavior is visible on a Dell OptiPlex 7010. The longer times
> >>>> seem to happen, when the module *e1000e* is probed during the same time.
> >>>>
> >>>> Signed-off-by: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
> >>>
> >>> What actually took so long? Could you analyze further instead of
> >>> blindly putting the flag?
> >>
> >> Well, I am not sure. Could you please give me hints, how to debug this
> >> further? Is there some debug flag?
> >
> > Usually perf would help, but even a simple printk() should suffice to
> > see what's going on there :)
>
> Please find the messages for a 23 ms run below, and the debug patch
> attached.
>
> > [ 2.996238] calling azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 214
> > [ 3.009838] calling rtl8169_pci_driver_init+0x0/0x1000 [r8169] @ 217
> > [ 3.009904] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> > [ 3.010721] random: get_random_bytes called from ipv6_regen_rndid+0x2c/0x100 with crng_init=1
> > [ 3.011283] r8169 0000:03:00.0 eth0: RTL8168e/8111e at 0x(ptrval), bc:5f:f4:c8:d3:98, XID 0c200000 IRQ 26
> > [ 3.011289] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
> > [ 3.013876] initcall rtl8169_pci_driver_init+0x0/0x1000 [r8169] returned 0 after 3917 usecs
> > [ 3.031754] calling pcspkr_platform_driver_init+0x0/0x1000 [pcspkr] @ 221
> > [ 3.031904] input: PC Speaker as /devices/platform/pcspkr/input/input4
> > [ 3.032288] initcall pcspkr_platform_driver_init+0x0/0x1000 [pcspkr] returned 0 after 508 usecs
> > [ 3.034795] calling psmouse_init+0x0/0x7c [psmouse] @ 220
> > [ 3.034903] initcall psmouse_init+0x0/0x7c [psmouse] returned 0 after 87 usecs
> > [ 3.043051] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.043154] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.043187] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.043198] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.043229] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.043265] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.043429] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.043479] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.043749] random: get_random_bytes called from key_alloc+0x1fc/0x5e0 with crng_init=1
> > [ 3.043973] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.045483] random: get_random_u32 called from arch_rnd.part.2+0x18/0x40 with crng_init=1
> > [ 3.045501] random: get_random_u32 called from load_elf_binary+0x76a/0x1d20 with crng_init=1
> > [ 3.045507] random: get_random_u32 called from arch_align_stack+0x45/0x70 with crng_init=1
> > [ 3.045528] random: get_random_u32 called from arch_rnd.part.2+0x18/0x40 with crng_init=1
> > [ 3.045558] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.045771] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.046127] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.046197] calling pcsp_init+0x0/0x1000 [snd_pcsp] @ 221
> > [ 3.046206] Error: Driver 'pcspkr' is already registered, aborting...
> > [ 3.046337] initcall pcsp_init+0x0/0x1000 [snd_pcsp] returned -16 after 128 usecs
> > [ 3.046348] random: get_random_u32 called from arch_align_stack+0x45/0x70 with crng_init=1
> > [ 3.046363] random: get_random_bytes called from load_elf_binary+0xaf8/0x1d20 with crng_init=1
> > [ 3.046383] random: get_random_u32 called from randomize_page+0x42/0x57 with crng_init=1
> > [ 3.047312] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.048151] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.048638] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.054325] random: get_random_u32 called from copy_process.part.33+0x186/0x2100 with crng_init=1
> > [ 3.054371] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.067344] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001, bcdDevice= 4.17
> > [ 3.067350] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> > [ 3.067353] usb usb3: Product: OHCI PCI host controller
> > [ 3.067356] usb usb3: Manufacturer: Linux 4.17.0-rc2+ ohci_hcd
> > [ 3.067359] usb usb3: SerialNumber: 0000:00:12.0
> > [ 3.067850] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> > [ 3.067864] hub 3-0:1.0: USB hub found
> > [ 3.067960] hub 3-0:1.0: 5 ports detected
> > [ 3.068141] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> > [ 3.068618] 1
> > [ 3.068644] 2
> > [ 3.068953] 3
> > [ 3.068954] 4
> > [ 3.068955] 5
> > [ 3.068959] 6
> > [ 3.068960] 7
> > [ 3.068966] 8
> > [ 3.068967] 9
> > [ 3.069204] 1
> > [ 3.069217] 2
> > [ 3.069565] 3
> > [ 3.069566] 4
> > [ 3.069567] 5
> > [ 3.069568] 6
> > [ 3.069569] 7
> > [ 3.069581] 8
> > [ 3.069583] 9
> > [ 3.069774] random: get_random_u32 called from copy_process.part.33+0x186/0x2100 with crng_init=1
> > [ 3.069924] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 23153 usecs
>
> Unfortunately, it looks like the time stamps of the Linux kernel and
> the initcall_debug time do not match. From the Linux time stamps it
> should be 73 ms.

Indeed. But even from this result, you can have a rough idea.
As you can see, the most of time was spent before "1" point, which is
the very beginning of azx_probe(). That is, the slowness is not in
HD-audio driver probe itself. Rather it's likely because of parallel
probing with other multiple devices.


thanks,

Takashi

>
> >> I am only aware of the Ftrace framework, but in my experience it also
> >> skews the timings quite a bit, so might not be the best choice.
> >
> > We know that there are some cases where the codec / controller
> > communication stalls on the recent Coffee Lake or such platforms.
> > But quite not sure how it happens.
> >
> > Moving the stuff into async just moves something ugly, and it's no
> > fix, per se, if such a long delay itself is unexpected.
>
> Letâs hope, we can fix it.
>
>
> Kind regards,
>
> Paul