[PATCH] firmware: fix NULL pointer dereference in __fw_load_abort()

From: linux-kernel-dev
Date: Wed Jan 04 2017 - 12:40:20 EST


From: Patrick Bruenn <p.bruenn@xxxxxxxxxxxx>

Since commit 5d47ec02c37ea632398cb251c884e3a488dff794
("firmware: Correct handling of fw_state_wait() return value")

I see the following NULL pointer dereference:
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.10.0-rc2-CX9020-10+ (patrickbr@lbs1) (gcc version 5.4.0 20160609 (Debian 5.4.0-6) ) #23 PREEMPT Wed Jan 4 08:10:24 CET 2017
[ 0.000000] CPU: ARMv7 Processor [412fc085] revision 5 (ARMv7), cr=10c5387d
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[ 0.000000] OF: fdt:Machine model: Freescale i.MX53 based Beckhoff CX9020
...
[ 3.098826] Unable to handle kernel NULL pointer dereference at virtual address 00000018
[ 3.115823] pgd = c0004000
[ 3.118632] [00000018] *pgd=00000000
[ 3.122279] Internal error: Oops: 17 [#1] PREEMPT ARM
[ 3.127406] Modules linked in: pwm_imx parallel_display panel_simple uio_pdrv_genirq uio
[ 3.135637] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 4.10.0-rc2-CX9020-10+ #23
[ 3.143313] Hardware name: Freescale i.MX53 (Device Tree Support)
[ 3.149517] Workqueue: events request_firmware_work_func
[ 3.154908] task: dedbde00 task.stack: dee76000
[ 3.159510] PC is at _request_firmware+0x878/0x948
[ 3.164375] LR is at __mutex_lock_slowpath+0x2a8/0x340
[ 3.169576] pc : [<c04ffd80>] lr : [<c073f5e8>] psr: 60030013
sp : dee77e68 ip : dee77e20 fp : dee77ed4
[ 3.181183] r10: dee56a08 r9 : ded92410 r8 : dee56a00
[ 3.186470] r7 : dee77ee4 r6 : fffffffe r5 : def41680 r4 : deeb1e00
[ 3.193070] r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : 00000000
[ 3.199673] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 3.206891] Control: 10c5387d Table: 8e1c8019 DAC: 00000051
[ 3.212704] Process kworker/0:1 (pid: 26, stack limit = 0xdee76210)
[ 3.219044] Stack: (0xdee77e68 to 0xdee78000)
[ 3.223458] 7e60: 7fffffff 00000000 00000001 c0102e2c c0c7a088 c0c735f8
[ 3.231731] 7e80: c0c5124c 00000002 00000001 7fffffff c0c04384 c0d1ae30 00001770 dec9d000
[ 3.240004] 7ea0: dedbde00 c0c12708 00000000 def41580 dee55680 c0c0ff68 df263100 00000000
[ 3.248290] 7ec0: 00000000 c0c6daf8 dee77efc dee77ed8 c04fffbc c04ff514 00000000 00000007
[ 3.256565] 7ee0: 0c0e0832 c01413dc def41580 def41580 dee77f34 dee77f00 c0141420 c04fff8c
[ 3.264841] 7f00: c0c0ff7c dee76038 dee58558 c0c0ff68 dee55698 c0c17e00 c0c0ff7c dee76038
[ 3.273115] 7f20: 00000008 dee55680 dee77f7c dee77f38 c014186c c01412f0 dee77f54 dee76000
[ 3.281387] 7f40: 00000000 dee53cc0 c0c17e00 c0c217a1 dee58558 dee58540 00000000 dee53cc0
[ 3.289667] 7f60: dee55680 c0141810 dee58558 ded0be94 dee77fac dee77f80 c0147dfc c014181c
[ 3.297945] 7f80: dee76000 dee53cc0 c0147ce8 00000000 00000000 00000000 00000000 00000000
[ 3.306217] 7fa0: 00000000 dee77fb0 c0108e18 c0147cf4 00000000 00000000 00000000 00000000
[ 3.314489] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 3.322762] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff
[ 3.331055] [<c04ffd80>] (_request_firmware) from [<c04fffbc>] (request_firmware_work_func+0x3c/0x74)
[ 3.343535] [<c04fffbc>] (request_firmware_work_func) from [<c0141420>] (process_one_work+0x13c/0x52c)
[ 3.356057] [<c0141420>] (process_one_work) from [<c014186c>] (worker_thread+0x5c/0x620)
[ 3.367393] [<c014186c>] (worker_thread) from [<c0147dfc>] (kthread+0x114/0x144)
[ 3.378045] [<c0147dfc>] (kthread) from [<c0108e18>] (ret_from_fork+0x14/0x3c)
[ 3.388481] Code: eafffdf5 e59f00bc eb08fe40 e5980100 (e5903018)
[ 3.682255] ---[ end trace dbbc5ea21820dd99 ]---

I believe I tracked it down to:
drivers/dma/imx-sdma.c request_firmware()
-> fw_load_from_user_helper()
-> _request_firmware_load()
-> call fw_state_wait_timeout()

Some time later firmware_loading_store() scans a control value of "-1"
-> switch(loading) case -1: will call
-> fw_load_abort(fw_priv) which calls
-> __fw_load_abort(fw_priv->buf)
-> and set fw_priv->buf = NULL;

back in _request_firmware_load()
fw_state_wait_timeout() returns -ENOENT
-> since mentioned commit
-> fw_load_abort(fw_priv) is called a second time
-> and this time it would call:
-> __fw_load_abort(NULL /* fw_priv->buf */)
-> and we get: NULL->fw_st.status which fits 0x18:
offsetof(struct firmware_buf, fw_st) + offsetof(struct fw_state, status)

As a novice, I have no idea how to fix the original patch in a clean way.
My workaround is to just check buf in fw_load_abort() or __fw_load_abort()
Maybe I am totally wrong and misconfigured my board.
Feedback is very much appreciated.

Signed-off-by: Patrick Bruenn <p.bruenn@xxxxxxxxxxxx>
---
drivers/base/firmware_class.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 4497d263209f..3960ee117453 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -542,6 +542,8 @@ static struct firmware_priv *to_firmware_priv(struct device *dev)

static void __fw_load_abort(struct firmware_buf *buf)
{
+ if (!buf)
+ return;
/*
* There is a small window in which user can write to 'loading'
* between loading done and disappearance of 'loading'
--
2.11.0