Re: [Linux Kernel Bugs] KASAN: slab-use-after-free Read in cec_queue_msg_fh and 4 other crashes in the cec device (`cec_ioctl`)

From: Hans Verkuil
Date: Tue Apr 30 2024 - 06:27:02 EST


Hi Chenyuan,

On 29/04/2024 17:42, Yang, Chenyuan wrote:
> Hi Hans,
>
> Here is my QEMU booting command:
>
> ```
> qemu-system-x86_64 \
> -m 2G \
> -smp 2 \
> -kernel linux/arch/x86/boot/bzImage \
> -append "console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
> -drive file=image/bullseye-qemu.img,format=raw \
> -net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 \
> -net nic,model=e1000 \
> -enable-kvm \
> -nographic \
> -pidfile vm.pi
> ```
>
> Plus, here is the config of vivid from my linux kernel building config:
>
> ```
> CONFIG_VIDEO_VIVID=y
> CONFIG_VIDEO_VIVID_CEC=y
> CONFIG_VIDEO_VIVID_MAX_DEVS=64
>
> CONFIG_CMDLINE="... vivid.n_devs=16 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 ..."
>
> # Here is the full
> CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=16 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=16 rose.rose_ndevs=16 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=8 kmsan.panic=1"
> ```
>
> Plus, I attached the current patch (git diff > patch).
>
> Let me know if you need any further information.

Ah, that was helpful. I also discovered that I had to run the reproducer as root,
that helped too :-)

Please add this patch on top of the diff you already have and try again.

The reproducer kills each forked process after 5 seconds from what I can
tell. That causes a driver wait call to return -ERESTARTSYS, and that corner
case was not handled correctly. It caused the harmless but confusing
"transmit timed out" message.

Regards,

Hans

---------------------------------------------------------------------
[PATCH] cec: core: avoid confusing "transmit timed out" message

If, when waiting for a transmit to finish, the wait is interrupted,
then you might get a "transmit timed out" message, even though the
transmit was interrupted and did not actually time out.

Set transmit_in_progress_aborted to true if the
wait_for_completion_killable() call was interrupted and ensure
that the transmit is properly marked as ABORTED.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xxxxxxxxx>
---
drivers/media/cec/core/cec-adap.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c
index a493cbce24567..da09834990b87 100644
--- a/drivers/media/cec/core/cec-adap.c
+++ b/drivers/media/cec/core/cec-adap.c
@@ -490,6 +490,15 @@ int cec_thread_func(void *_adap)
goto unlock;
}

+ if (adap->transmit_in_progress &&
+ adap->transmit_in_progress_aborted) {
+ if (adap->transmitting)
+ cec_data_cancel(adap->transmitting,
+ CEC_TX_STATUS_ABORTED, 0);
+ adap->transmit_in_progress = false;
+ adap->transmit_in_progress_aborted = false;
+ goto unlock;
+ }
if (adap->transmit_in_progress && timeout) {
/*
* If we timeout, then log that. Normally this does
@@ -771,6 +780,7 @@ int cec_transmit_msg_fh(struct cec_adapter *adap, struct cec_msg *msg,
{
struct cec_data *data;
bool is_raw = msg_is_raw(msg);
+ int err;

if (adap->devnode.unregistered)
return -ENODEV;
@@ -935,10 +945,13 @@ int cec_transmit_msg_fh(struct cec_adapter *adap, struct cec_msg *msg,
* Release the lock and wait, retake the lock afterwards.
*/
mutex_unlock(&adap->lock);
- wait_for_completion_killable(&data->c);
+ err = wait_for_completion_killable(&data->c);
cancel_delayed_work_sync(&data->work);
mutex_lock(&adap->lock);

+ if (err)
+ adap->transmit_in_progress_aborted = true;
+
/* Cancel the transmit if it was interrupted */
if (!data->completed) {
if (data->msg.tx_status & CEC_TX_STATUS_OK)
--
2.43.0