On Fri, Aug 27, 2021 at 7:38 AM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
I'm assuming it's this one?
On 2021/8/27 8:04, Saravana Kannan wrote:
On Thu, Aug 26, 2021 at 1:22 AM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
Btw, I've been working on [1] cleaning up the one-off deferred probe
solution that we have for amba devices. That causes a bunch of other
headaches. Your patch 3/3 takes us further in the wrong direction by
adding more reasons for delaying the addition of the device.

Hi Saravana, I tried the link [1], but with it there is a crash at boot
(qemu-system-arm -M vexpress-a15),
arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
Yes, I am confused too.
It's hard to make sense of the logs. Looks like two different threads
might be printing to the log at the same time? Can you please enable
the config that prints the thread ID (forgot what it's called) and
collect this again? From what I could tell, the crash seems to be
happening somewhere in platform_match(), but that's not related to
this patch at all?

Hi,
Can you reproduce it? It is very likely related (without your patch, the
boot is fine).

Sorry, I haven't ever set up qemu and booted vexpress. Thanks for your help.

The NULL pointer is about serio, which is registered from the amba driver:
ambakmi_driver_init
-- amba_kmi_probe
-- __serio_register_port

Thanks for the pointer. I took a look at the logs and the code. It's
very strange. As you can see from the backtrace, platform_match() is
being called for the device_add() from serio_handle_event(). But the
device that gets added there is on the serio_bus which obviously
should be using the serio_bus_match.
...
+Dmitry and the input mailing list: is there some known issue about serio?
I added some debug; the full log is attached.
[ 2.958355][ T41] input: AT Raw Set 2 keyboard as
/devices/platform/bus@8000000/bus@8000000:motherboard-bus/bus@8000000:motherboard-bus:iofpga-bus@300000000/1c060000.kmi/serio0/input/input0
[ 2.977441][ T41] serio serio1: pdev c1e05508, pdev->name (null),
drv c1090fc0, drv->name vexpress-reset

Based on the logs you added, it's pretty clear we are getting to
platform_match(). It's also strange that the drv->name is
vexpress-reset.

[ 3.003113][ T41] Backtrace:
[ 3.003451][ T41] [<c0560bb4>] (strcmp) from [<c0646358>] (platform_match+0xdc/0xf0)
[ 3.003963][ T41] [<c064627c>] (platform_match) from [<c06437d4>] (__device_attach_driver+0x3c/0xf4)
[ 3.004769][ T41] [<c0643798>] (__device_attach_driver) from [<c0641180>] (bus_for_each_drv+0x68/0xc8)
[ 3.005481][ T41] [<c0641118>] (bus_for_each_drv) from [<c0642f40>] (__device_attach+0xf0/0x16c)
[ 3.006152][ T41] [<c0642e50>] (__device_attach) from [<c06439d4>] (device_initial_probe+0x1c/0x20)
[ 3.006853][ T41] [<c06439b8>] (device_initial_probe) from [<c0642030>] (bus_probe_device+0x94/0x9c)
[ 3.007259][ T41] [<c0641f9c>] (bus_probe_device) from [<c063f9cc>] (device_add+0x408/0x8b8)
[ 3.007900][ T41] [<c063f5c4>] (device_add) from [<c071c1cc>] (serio_handle_event+0x1b8/0x234)
[ 3.008824][ T41] [<c071c014>] (serio_handle_event) from [<c01475a4>] (process_one_work+0x238/0x594)
[ 3.009737][ T41] [<c014736c>] (process_one_work) from [<c014795c>] (worker_thread+0x5c/0x5f4)
[ 3.010638][ T41] [<c0147900>] (worker_thread) from [<c014feb4>] (kthread+0x178/0x194)
[ 3.011496][ T41] [<c014fd3c>] (kthread) from [<c0100150>] (ret_from_fork+0x14/0x24)
[ 3.011860][ T41] Exception stack(0xc1675fb0 to 0xc1675ff8)

But the platform_match() is happening for the device_add() from
serio_handle_event() that's adding a device to the serio_bus and it
should be using serio_bus_match().
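
To make the expectation above concrete, here is a minimal userspace sketch of per-bus match dispatch. It is not kernel code: the struct layouts, the model_* names, the MODEL_TYPE_* constants and the "kbd" driver are all made up for illustration, and the serio match rule is reduced to a single integer compare. It only models the idea that a device is matched through its own bus's match callback against that bus's driver list, so a serio device should never be compared against a platform driver such as vexpress-reset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct device;
struct driver;

/* Simplified stand-in for a bus: one match callback plus its own driver list. */
struct bus {
    const char *name;
    int (*match)(const struct device *dev, const struct driver *drv);
    const struct driver *const *drivers; /* NULL-terminated */
};

struct device {
    const char *name;
    int type;
    const struct bus *bus;
};

struct driver {
    const char *name;
    int type;
};

/* Hypothetical type ids, not the real serio id scheme. */
enum { MODEL_TYPE_NONE = 0, MODEL_TYPE_KBD = 1 };

/* Name compare, loosely modelled on the strcmp() fallback in platform_match().
 * Unlike this model, the kernel function does not guard against a NULL name. */
static int model_platform_match(const struct device *dev, const struct driver *drv)
{
    return dev->name != NULL && strcmp(dev->name, drv->name) == 0;
}

/* Assumed, much simplified serio-style rule: match on a type id. */
static int model_serio_match(const struct device *dev, const struct driver *drv)
{
    return dev->type == drv->type;
}

const struct driver vexpress_reset_drv = { "vexpress-reset", MODEL_TYPE_NONE };
const struct driver kbd_drv = { "kbd", MODEL_TYPE_KBD };

const struct driver *const platform_drivers[] = { &vexpress_reset_drv, NULL };
const struct driver *const serio_drivers[] = { &kbd_drv, NULL };

const struct bus model_platform_bus = { "platform", model_platform_match, platform_drivers };
const struct bus model_serio_bus = { "serio", model_serio_match, serio_drivers };

const struct device serio1_dev = { "serio1", MODEL_TYPE_KBD, &model_serio_bus };
const struct device reset_dev = { "vexpress-reset", MODEL_TYPE_NONE, &model_platform_bus };

/* Walk only the drivers on dev's own bus, using that bus's match callback. */
const struct driver *find_driver(const struct device *dev)
{
    assert(dev != NULL && dev->bus != NULL && dev->bus->match != NULL);
    for (const struct driver *const *p = dev->bus->drivers; *p != NULL; p++) {
        if (dev->bus->match(dev, *p))
            return *p;
    }
    return NULL;
}
```

In this model, find_driver(&serio1_dev) only ever consults model_serio_match() and the serio driver list. The backtrace shows the opposite: a serio device reaching platform_match() with a platform driver, which is why either a corrupted driver list or a wrong bus pointer looks plausible.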
I haven't reached any conclusion yet, but my current thought process
is that it's either:
1. My patch is somehow causing list corruption. But I don't directly
touch any list in my change (other than deleting a list entirely), so
it's not clear how that would be happening.
2. Without my patch, the probe of these AMBA devices would be delayed
until at least 5 seconds, or possibly later. I'm wondering if my patch is
catching some bad timing assumptions in other code.
ok, I will try this one, but due to above patch, it may not work.
You might be able to test out theory (2) by setting
DEFERRED_DEVICE_TIMEOUT to a much smaller number, say 500ms or 100ms.
If it doesn't crash, it doesn't mean it's not (2), but if it does, then
we know for sure it's (2).
I'll continue debugging further.
-Saravana