Reversing the call order didn't help, I am still getting the trace.
On 08/27/2018 01:07 AM, George Cherian wrote:
Hi Corey,
On 08/24/2018 06:37 PM, Corey Minyard wrote:
On 08/24/2018 06:10 AM, George Cherian wrote:
In ssif_probe error path the i2c client is left hanging, so that
ssif_platform_remove will remove the client. But it is quite
possible that ssif would never call an i2c_new_device.
This condition would lead to kernel crash as below.
To fix this leave only the client ssif registered hanging in error
path. All other non-registered clients are set to NULL.
I'm having a hard time seeing how this could happen.
The i2c_new_device() call is only done in the case of dmi_ipmi_probe
(called from
ssif_platform_probe) or a hard-coded entry. How does
ssif_platform_remove get
called on a device that was not registered with ssif_platform_probe?
Initially I also had the same doubt but then,
ssif_adapter_hanlder is called for each i2c_dev only after initialized
is true. So we end up not calling i2c_new_device for devices probed
during the module_init time.
I spent some time looking at this, and I don't think that's what is
happening.
I think that i2c_del_driver() in cleanup_ipmi_ssif() is causing
i2c_unregister_device() to be called on all the devices, and
platform_driver_unregister() causes it to be called on the
devices that came in through the platform method. It's
a double-free.
Try reversing the order of i2c_del_driver() and platform_driver_unregister()
in cleanup_ipmi_ssif() to test this.
-corey
ssif_platform_remove() get called during removal of ipmi_ssif.
In case during ssif_probe() if there is a failure before
ipmi_smi_register then we leave the addr_info->client hanging.
In case of normal functioning without error, we set addr_info->client
to NULL after ipmi_unregiter_smi in ssif_remove.
Small style comment inline...I will make the changess and sent out a v2!!
Thanks,
-George
 CPU: 107 PID: 30266 Comm: rmmod Not tainted 4.18.0+ #80
 Hardware name: Cavium Inc. Saber/Saber, BIOS Cavium reference
firmware version 7.0 08/04/2018
 pstate: 60400009 (nZCv daif +PAN -UAO)
 pc : kernfs_find_ns+0x28/0x120
 lr : kernfs_find_and_get_ns+0x40/0x60
 sp : ffff00002310fb50
 x29: ffff00002310fb50 x28: ffff800a8240f800
 x27: 0000000000000000 x26: 0000000000000000
 x25: 0000000056000000 x24: ffff000009073000
 x23: ffff000008998b38 x22: 0000000000000000
 x21: ffff800ed86de820 x20: 0000000000000000
 x19: ffff00000913a1d8 x18: 0000000000000000
 x17: 0000000000000000 x16: 0000000000000000
 x15: 0000000000000000 x14: 5300737265766972
 x13: 643d4d4554535953 x12: 0000000000000030
 x11: 0000000000000030 x10: 0101010101010101
 x9 : ffff800ea06cc3f9 x8 : 0000000000000000
 x7 : 0000000000000141 x6 : ffff000009073000
 x5 : ffff800adb706b00 x4 : 0000000000000000
 x3 : 00000000ffffffff x2 : 0000000000000000
 x1 : ffff000008998b38 x0 : ffff000008356760
 Process rmmod (pid: 30266, stack limit = 0x00000000e218418d)
 Call trace:
ÂÂ kernfs_find_ns+0x28/0x120
ÂÂ kernfs_find_and_get_ns+0x40/0x60
ÂÂ sysfs_unmerge_group+0x2c/0x6c
ÂÂ dpm_sysfs_remove+0x34/0x70
ÂÂ device_del+0x58/0x30c
ÂÂ device_unregister+0x30/0x7c
ÂÂ i2c_unregister_device+0x84/0x90 [i2c_core]
ÂÂ ssif_platform_remove+0x38/0x98 [ipmi_ssif]
ÂÂ platform_drv_remove+0x2c/0x6c
ÂÂ device_release_driver_internal+0x168/0x1f8
ÂÂ driver_detach+0x50/0xbc
ÂÂ bus_remove_driver+0x74/0xe8
ÂÂ driver_unregister+0x34/0x5c
ÂÂ platform_driver_unregister+0x20/0x2c
ÂÂ cleanup_ipmi_ssif+0x50/0xd82c [ipmi_ssif]
ÂÂ __arm64_sys_delete_module+0x1b4/0x220
ÂÂ el0_svc_handler+0x104/0x160
ÂÂ el0_svc+0x8/0xc
 Code: aa1e03e0 aa0203f6 aa0103f7 d503201f (7940e280)
 ---[ end trace 09f0e34cce8e2d8c ]---
 Kernel panic - not syncing: Fatal exception
 SMP: stopping secondary CPUs
 Kernel Offset: disabled
 CPU features: 0x23800c38
Signed-off-by: George Cherian <george.cherian@xxxxxxxxxx>
---
 drivers/char/ipmi/ipmi_ssif.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/char/ipmi/ipmi_ssif.c
b/drivers/char/ipmi/ipmi_ssif.c
index 18e4650..ccdf6b1 100644
--- a/drivers/char/ipmi/ipmi_ssif.c
+++ b/drivers/char/ipmi/ipmi_ssif.c
@@ -181,6 +181,7 @@ struct ssif_addr_info {
ÂÂÂÂÂ struct device *dev;
ÂÂÂÂÂ struct i2c_client *client;
+ÂÂÂÂ bool client_registered;
ÂÂÂÂÂ struct mutex clients_mutex;
ÂÂÂÂÂ struct list_head clients;
@@ -1658,6 +1659,8 @@ static int ssif_probe(struct i2c_client
*client, const struct i2c_device_id *id)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ * the client like it should.
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ */
ÂÂÂÂÂÂÂÂÂÂÂÂÂ dev_err(&client->dev, "Unable to start IPMI SSIF:
%d\n", rv);
+ÂÂÂÂÂÂÂÂÂÂÂÂ if (!addr_info->client_registered)
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ addr_info->client = NULL;
ÂÂÂÂÂÂÂÂÂÂÂÂÂ kfree(ssif_info);
ÂÂÂÂÂ }
ÂÂÂÂÂ kfree(resp);
@@ -1672,11 +1675,14 @@ static int ssif_probe(struct i2c_client
*client, const struct i2c_device_id *id)
 static int ssif_adapter_handler(struct device *adev, void *opaque)
 {
ÂÂÂÂÂ struct ssif_addr_info *addr_info = opaque;
+ÂÂÂÂ struct i2c_client *client;
ÂÂÂÂÂ if (adev->type != &i2c_adapter_type)
ÂÂÂÂÂÂÂÂÂÂÂÂÂ return 0;
-ÂÂÂÂ i2c_new_device(to_i2c_adapter(adev), &addr_info->binfo);
+ÂÂÂÂ client = i2c_new_device(to_i2c_adapter(adev), &addr_info->binfo);
+ÂÂÂÂ if (client)
+ÂÂÂÂÂÂÂÂÂÂÂÂ addr_info->client_registered = true;
How about..
ÂÂÂ if (i2c_new_device(to_i2c_adapter(adev), &addr_info->binfo))
ÂÂÂÂÂÂÂ addr_info->client_registered = true;
No need for the client variable.
-corey
ÂÂÂÂÂ if (!addr_info->adapter_name)
ÂÂÂÂÂÂÂÂÂÂÂÂÂ return 1; /* Only try the first I2C adapter by
default. */