On Fri, Aug 27, 2021 at 6:09 PM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:I can't find any clue, and have no time to check this for now, is there any news from your side?
Were you able to find anything more?
On 2021/8/28 3:09, Saravana Kannan wrote:
On Fri, Aug 27, 2021 at 7:38 AM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:I use arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dts.
On 2021/8/27 8:04, Saravana Kannan wrote:I'm assuming it's this one?
On Thu, Aug 26, 2021 at 1:22 AM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
Hi Saravana, I try the link[1], but with it, there is a crash when bootBtw, I've been working on [1] cleaning up the one-off deferred probe
solution that we have for amba devices. That causes a bunch of other
headaches. Your patch 3/3 takes us further in the wrong direction by
adding more reasons for delaying the addition of the device.
(qemu-system-arm -M vexpress-a15),
arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
qemu-system-arm -M vexpress-a15 -dtb vexpress-v2p-ca15-tc1.dtb -cpu
cortex-a15 -smp 2 -m size=3G -kernel zImage -rtc base=localtime -initrd
initrd-arm32 -append 'console=ttyAMA0 cma=0 kfence.sample_interval=0
earlyprintk debug ' -device virtio-net-device,netdev=net8 -netdev
type=tap,id=net8,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-nographic
Yes, I am confused too.Sorry, I haven't ever setup qemu and booted vexpress. Thanks for your help.Hi,Can you reproduce it? it is very likely related(without your patch, the
It's hard to make sense of the logs. Looks like two different threads
might be printing to the log at the same time? Can you please enable
the config that prints the thread ID (forgot what it's called) and
collect this again? With what I could tell the crash seems to be
happening somewhere in platform_match(), but that's not related to
this patch at all?
boot is fine),
the NULL ptr is about serio, it is registed from amba driver.Thanks for the pointer. I took a look at the logs and the code. It's
ambakmi_driver_init
-- amba_kmi_probe
-- __serio_register_port
very strange. As you can see from the backtrace, platform_match() is
being called for the device_add() from serio_handle_event(). But the
device that gets added there is on the serio_bus which obviously
should be using the serio_bus_match.
...+Dmitry and input maillist, is there some known issue about serio ?Based on the logs you added, it's pretty clear we are getting to
I add some debug, the full log is attached.
[ 2.958355][ T41] input: AT Raw Set 2 keyboard as
/devices/platform/bus@8000000/bus@8000000:motherboard-bus/bus@8000000:motherboard-bus:iofpga-bus@300000000/1c060000.kmi/serio0/input/input0
[ 2.977441][ T41] serio serio1: pdev c1e05508, pdev->name (null),
drv c1090fc0, drv->name vexpress-reset
platform_match(). It's also strange that the drv->name is
vexpress-reset
Maybe some concurrent driver load?[ 3.003113][ T41] Backtrace:But the platform_match() is happening for the device_add() from
[ 3.003451][ T41] [<c0560bb4>] (strcmp) from [<c0646358>] (platform_match+0xdc/0xf0)
[ 3.003963][ T41] [<c064627c>] (platform_match) from [<c06437d4>] (__device_attach_driver+0x3c/0xf4)
[ 3.004769][ T41] [<c0643798>] (__device_attach_driver) from [<c0641180>] (bus_for_each_drv+0x68/0xc8)
[ 3.005481][ T41] [<c0641118>] (bus_for_each_drv) from [<c0642f40>] (__device_attach+0xf0/0x16c)
[ 3.006152][ T41] [<c0642e50>] (__device_attach) from [<c06439d4>] (device_initial_probe+0x1c/0x20)
[ 3.006853][ T41] [<c06439b8>] (device_initial_probe) from [<c0642030>] (bus_probe_device+0x94/0x9c)
[ 3.007259][ T41] [<c0641f9c>] (bus_probe_device) from [<c063f9cc>] (device_add+0x408/0x8b8)
[ 3.007900][ T41] [<c063f5c4>] (device_add) from [<c071c1cc>] (serio_handle_event+0x1b8/0x234)
[ 3.008824][ T41] [<c071c014>] (serio_handle_event) from [<c01475a4>] (process_one_work+0x238/0x594)
[ 3.009737][ T41] [<c014736c>] (process_one_work) from [<c014795c>] (worker_thread+0x5c/0x5f4)
[ 3.010638][ T41] [<c0147900>] (worker_thread) from [<c014feb4>] (kthread+0x178/0x194)
[ 3.011496][ T41] [<c014fd3c>] (kthread) from [<c0100150>] (ret_from_fork+0x14/0x24)
[ 3.011860][ T41] Exception stack(0xc1675fb0 to 0xc1675ff8)
serio_event_handle() that's adding a device to the serio_bus and it
should be using serio_bus_match().
I haven't reached any conclusion yet, but my current thought process
is that it's either:
1. My patch is somehow causing list corruption. But I don't directly
touch any list in my change (other than deleting a list entirely), so
it's not clear how that would be happening.
2. Without my patch, these AMBA device's probe would be delayed atAfter Rob's patch, It will retry soon.
least until 5 seconds or possibly later. I'm wondering if my patch is
catching some bad timing assumptions in other code.
commit 039599c92d3b2e73689e8b6e519d653fd4770abb
Author: Rob Herring <robh@xxxxxxxxxx>
Date: Wed Apr 29 15:58:12 2020 -0500
amba: Retry adding deferred devices at late_initcall
If amba bus devices defer when adding, the amba bus code simply retries
adding the devices every 5 seconds. This doesn't work well as it
completely unsynchronized with starting the init process which can
happen in less than 5 secs. Add a retry during late_initcall. If the
amba devices are added, then deferred probe takes over. If the
dependencies have not probed at this point, then there's no improvement
over previous behavior. To completely solve this, we'd need to retry
after every successful probe as deferred probe does.
The list_empty() check now happens outside the mutex, but the mutex
wasn't necessary in the first place.
This needed to use deferred probe instead of fragile initcall ordering
on 32-bit VExpress systems where the apb_pclk has a number of probe
dependencies (vexpress-sysregs, vexpress-config).
You might be able to test out theory (2) by DEFERRED_DEVICE_TIMEOUT took, I will try this one, but due to above patch, it may not work.
a much smaller number. Say 500ms or 100ms. If it doesn't crash, it
doesn't mean it's not (2), but if it does, then we know for sure it's
(2).
-Saravana
.