Re: [PATCH v2 1/2] remoteproc: Add character device interface

From: rishabhb
Date: Wed Apr 08 2020 - 14:35:08 EST


On 2020-04-06 08:58, ClÃment Leger wrote:
Hi Arnaud,

----- On 6 Apr, 2020, at 16:17, Arnaud Pouliquen arnaud.pouliquen@xxxxxx wrote:

Hi ClÃment,

On 4/6/20 2:06 PM, ClÃment Leger wrote:
Hi Arnaud,

----- On 6 Apr, 2020, at 11:01, Arnaud Pouliquen arnaud.pouliquen@xxxxxx wrote:

On 4/3/20 9:13 PM, rishabhb@xxxxxxxxxxxxxx wrote:
On 2020-04-02 10:28, Arnaud POULIQUEN wrote:
Hi

On 4/1/20 2:03 AM, Rishabh Bhatnagar wrote:
Add the character device interface for userspace applications.
This interface can be used in order to boot up and shutdown
remote subsystems. Currently there is only a sysfs interface
which the userspace clients can use. If a usersapce application
crashes after booting the remote processor does not get any
indication about the crash. It might still assume that the
application is running. For example modem uses remotefs service
to fetch data from disk/flash memory. If the remotefs service
crashes, modem keeps on requesting data which might lead to a
crash. Adding a character device interface makes the remote
processor tightly coupled with the user space application.
A crash of the application leads to a close on the file descriptors
therefore shutting down the remoteproc.

Sorry I'm late in the discussion, I hope I've gone through the whole
discussion so I don't reopen a closed point...

Something here is not crystal clear to me so I'd rather share it...

I suppose that you the automatic restart of the application is not possible to
stop and restart the remote processor...
Yes correct, while we wait for the application to restart we might observe a
fatal crash.

Why this use case can not be solved by a process monitor or a service
in userland that detects the application crash and stop the remote
firmware using
the sysfs interface?

What happens in the case where the process monitor itself crashes? This is
actually the approach we follow in our downstream code. We have a central entity
in userspace that controls bootup/shutdown of some remote processors based on
the
votes from userspace clients. We have observed cases where this entity
itself crashes and remote processors are left hanging.

Your description makes me feel like this patch is only a workaround of something
that
should be fixed in the userland, even if i understand that hanging is one of the
most
critical problem and have to be fixed.
For instance, how to handle several applications that interact with the remote
processor
( e.g. rpmsg service applications) how to stop and restart everything. Using the
char
device would probaly resolve only a part of the issue...

I'm not aware about your environment and i'm not a userland expert. But what i
still not
understand why a parent process can not do the job...
I just test a simple script on my side that treat the kill -9 of an application
("cat" in my case).

This is not entirely true, if the parent process is killed with a SIGKILL, then
the process will not be able to handle anything and the remoteproc will still
be running.

What I understood from Rishabh patch is a way to allow a single process handling
the rproc state. We have the same kind of need and currently, if the
user application crashes, then the rproc is still running (which happens).


#start the remote firmware
cp $1 /lib/firmware/
echo $1> /sys/class/remoteproc/remoteproc0/firmware
echo start >/sys/class/remoteproc/remoteproc0/state
#your binary
cat /dev/kmsg
# stop the remote firmware in case of crash (and potentially some other apps)
echo stop >/sys/class/remoteproc/remoteproc0/state


This is not really "production proof" and what happens if the application is
responsible of setting the firmware which might be jitted ?
And if the script receives the SIGKILL, then we are back to the same problem.
Yes this is just a basic example, not an implementation which would depend on
the
environment. i'm just trying here to put forward a multi-process solution...and
that I'm not an userland expert :).


I really think, this is a step forward an easier and reliable use of the
remoteproc
on userland to guarantee a coherent rproc state even if host application
crashes.
Yes what we want is simple mechanism where a single userspace process can boot/
shutdown the remote processor in all scenarios. Adding more processes to monitor
the already existing process might have 2 issues. One is there might be a delay
between the application crash and process monitor getting to know about it and taking
action. This might prove to be fatal in our case. Second, possibly the monitor can hang
or get killed and is not deterministic.

I can see 3 ways of handling an application crash:
- just shutdown the firmware
=> can be done through char device
- stop some other related processes and/or generate a remote proc crash dump for
debug
=> /sysfs and/or debugfs
- do nothing as you want a silence application reboot and re-attach to the
running firmware
=> use sysfs

I'm challenging the solution because splitting the API seems to me not a good
solution.

Completely ok with that, we have to fully understand the targeted usecase to
avoid implemented a flawed interface.

Now i wonder how it works for the other applications that are relying on some
other
kernel frameworks...

For some other device, there is a chardev. The watchdog for intance uses a
/dev/watchdog. Regarding the gpio, it seems they are also using a chardev
and the sysfs interface is deprecated.

Perhaps the answer is that these frameworks don't use sysfs but char device.
That would means that the sysfs solution is not the more adapted solution and
perhaps we should migrate to a char device.
But in this case, i think that it should implement the whole API and be
exclusive with
the syfs legacy API (so no sysfs or sysfs in read-only).

I agree with that, if another interface must be defined, then it should
implement everything that is supported right now with the sysfs.

The other fields that sysfs exposes right now are firmware_name, name(rproc name),
state. The targeted usecase was that these are configuration parameters specific
to the remoteproc and should stay in the sysfs interface. Whereas char device
should provide direct access to remoteproc device.
It would make sense to use this interface in conjunction with sysfs
interface, where you use /dev/remoteproc0 to boot/shutdown the remote processor
sysfs entries to fine tune the parameters.
Adding ioctls to implement all sysfs functionality seems like overkill to me. Let
me know what you guys think.

ClÃment


Regards,
Arnaud


Regards,

ClÃment

Anyway, it's just my feeling, let other people give their feedback.

I just want to be sure that there is no alternative to this, because
having two ways
for application to shutdown the firmware seems to me confusing...
Does making this interface optional/configurable helps?

What about the opposite service, mean inform the application that the remote
processor is crashed?
Do you identify such need? or the "auto" crash recovery is sufficient?
Auto recovery works perfectly for us. Although there is a mechanism in
place using QMI(Qualcomm MSM interface) that can notify clients about remote
processor crash.

Thanks for the information.

Regards
Arnaud


Thanks,
Arnaud

Signed-off-by: Rishabh Bhatnagar <rishabhb@xxxxxxxxxxxxxx>
---
Âdrivers/remoteproc/KconfigÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂ 9 +++
Âdrivers/remoteproc/MakefileÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂ 1 +
Âdrivers/remoteproc/remoteproc_cdev.cÂÂÂÂ | 100 +++++++++++++++++++++++++++++++
Âdrivers/remoteproc/remoteproc_internal.h |Â 22 +++++++
Âinclude/linux/remoteproc.hÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |ÂÂ 2 +
Â5 files changed, 134 insertions(+)
Âcreate mode 100644 drivers/remoteproc/remoteproc_cdev.c

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index de3862c..6374b79 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -14,6 +14,15 @@ config REMOTEPROC

Âif REMOTEPROC

+config REMOTEPROC_CDEV
+ÂÂÂ bool "Remoteproc character device interface"
+ÂÂÂ help
+ÂÂÂÂÂ Say y here to have a character device interface for Remoteproc
+ÂÂÂÂÂ framework. Userspace can boot/shutdown remote processors through
+ÂÂÂÂÂ this interface.
+
+ÂÂÂÂÂ It's safe to say N if you don't want to use this interface.
+
Âconfig IMX_REMOTEPROC
ÂÂÂÂ tristate "IMX6/7 remoteproc support"
ÂÂÂÂ depends on ARCH_MXC
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index e30a1b1..b7d4f77 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -9,6 +9,7 @@ remoteproc-yÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ += remoteproc_debugfs.o
Âremoteproc-yÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ += remoteproc_sysfs.o
Âremoteproc-yÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ += remoteproc_virtio.o
Âremoteproc-yÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ += remoteproc_elf_loader.o
+obj-$(CONFIG_REMOTEPROC_CDEV)ÂÂÂÂÂÂÂ += remoteproc_cdev.o
Âobj-$(CONFIG_IMX_REMOTEPROC)ÂÂÂÂÂÂÂ += imx_rproc.o
Âobj-$(CONFIG_MTK_SCP)ÂÂÂÂÂÂÂÂÂÂÂ += mtk_scp.o mtk_scp_ipi.o
Âobj-$(CONFIG_OMAP_REMOTEPROC)ÂÂÂÂÂÂÂ += omap_remoteproc.o
diff --git a/drivers/remoteproc/remoteproc_cdev.c
b/drivers/remoteproc/remoteproc_cdev.c
new file mode 100644
index 0000000..8182bd1
--- /dev/null
+++ b/drivers/remoteproc/remoteproc_cdev.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Character device interface driver for Remoteproc framework.
+ *
+ * Copyright (c) 2020, The Linux Foundation. All rights reserved.
+ */
+
+#include <linux/cdev.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/remoteproc.h>
+
+#include "remoteproc_internal.h"
+
+#define NUM_RPROC_DEVICESÂÂÂ 64
+static dev_t rproc_cdev;
+static DEFINE_IDA(cdev_minor_ida);
+
+static int rproc_cdev_open(struct inode *inode, struct file *file)
+{
+ÂÂÂ struct rproc *rproc;
+
+ÂÂÂ rproc = container_of(inode->i_cdev, struct rproc, char_dev);
+
+ÂÂÂ if (!rproc)
+ÂÂÂÂÂÂÂ return -EINVAL;
+
+ÂÂÂ if (rproc->state == RPROC_RUNNING)
+ÂÂÂÂÂÂÂ return -EBUSY;
+
+ÂÂÂ return rproc_boot(rproc);
+}
+
+static int rproc_cdev_release(struct inode *inode, struct file *file)
+{
+ÂÂÂ struct rproc *rproc;
+
+ÂÂÂ rproc = container_of(inode->i_cdev, struct rproc, char_dev);
+
+ÂÂÂ if (!rproc || rproc->state != RPROC_RUNNING)
+ÂÂÂÂÂÂÂ return -EINVAL;
+
+ÂÂÂ rproc_shutdown(rproc);
+
+ÂÂÂ return 0;
+}
+
+static const struct file_operations rproc_fops = {
+ÂÂÂ .open = rproc_cdev_open,
+ÂÂÂ .release = rproc_cdev_release,
+};
+
+int rproc_char_device_add(struct rproc *rproc)
+{
+ÂÂÂ int ret, minor;
+ÂÂÂ dev_t cdevt;
+
+ÂÂÂ minor = ida_simple_get(&cdev_minor_ida, 0, NUM_RPROC_DEVICES,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ GFP_KERNEL);
+ÂÂÂ if (minor < 0) {
+ÂÂÂÂÂÂÂ dev_err(&rproc->dev, "%s: No more minor numbers left! rc:%d\n",
+ÂÂÂÂÂÂÂÂÂÂÂ __func__, minor);
+ÂÂÂÂÂÂÂ return -ENODEV;
+ÂÂÂ }
+
+ÂÂÂ cdev_init(&rproc->char_dev, &rproc_fops);
+ÂÂÂ rproc->char_dev.owner = THIS_MODULE;
+
+ÂÂÂ cdevt = MKDEV(MAJOR(rproc_cdev), minor);
+ÂÂÂ ret = cdev_add(&rproc->char_dev, cdevt, 1);
+ÂÂÂ if (ret < 0)
+ÂÂÂÂÂÂÂ ida_simple_remove(&cdev_minor_ida, minor);
+
+ÂÂÂ rproc->dev.devt = cdevt;
+ÂÂÂ return ret;
+}
+
+void rproc_char_device_remove(struct rproc *rproc)
+{
+ÂÂÂ __unregister_chrdev(MAJOR(rproc->dev.devt), MINOR(rproc->dev.devt), 1,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ "rproc");
+ÂÂÂ ida_simple_remove(&cdev_minor_ida, MINOR(rproc->dev.devt));
+}
+
+void __init rproc_init_cdev(void)
+{
+ÂÂÂ int ret;
+
+ÂÂÂ ret = alloc_chrdev_region(&rproc_cdev, 0, NUM_RPROC_DEVICES, "rproc");
+ÂÂÂ if (ret < 0) {
+ÂÂÂÂÂÂÂ pr_err("Failed to alloc rproc_cdev region, err %d\n", ret);
+ÂÂÂÂÂÂÂ return;
+ÂÂÂ }
+}
+
+void __exit rproc_exit_cdev(void)
+{
+ÂÂÂ __unregister_chrdev(MAJOR(rproc_cdev), 0, NUM_RPROC_DEVICES, "rproc");
+}
diff --git a/drivers/remoteproc/remoteproc_internal.h
b/drivers/remoteproc/remoteproc_internal.h
index 493ef92..28d61a1 100644
--- a/drivers/remoteproc/remoteproc_internal.h
+++ b/drivers/remoteproc/remoteproc_internal.h
@@ -47,6 +47,27 @@ struct dentry *rproc_create_trace_file(const char *name,
struct rproc *rproc,
Âint rproc_init_sysfs(void);
Âvoid rproc_exit_sysfs(void);

+#ifdef CONFIG_REMOTEPROC_CDEV
+void rproc_init_cdev(void);
+void rproc_exit_cdev(void);
+int rproc_char_device_add(struct rproc *rproc);
+void rproc_char_device_remove(struct rproc *rproc);
+#else
+static inline void rproc_init_cdev(void)
+{
+}
+static inline void rproc_exit_cdev(void)
+{
+}
+static inline int rproc_char_device_add(struct rproc *rproc)
+{
+ÂÂÂ return 0;
+}
+static inline void rproc_char_device_remove(struct rproc *rproc)
+{
+}
+#endif
+
Âvoid rproc_free_vring(struct rproc_vring *rvring);
Âint rproc_alloc_vring(struct rproc_vdev *rvdev, int i);

@@ -63,6 +84,7 @@ struct resource_table *rproc_elf_find_loaded_rsc_table(struct
rproc *rproc,
Âstruct rproc_mem_entry *
Ârproc_find_carveout_by_name(struct rproc *rproc, const char *name, ...);

+
Âstatic inline
Âint rproc_fw_sanity_check(struct rproc *rproc, const struct firmware *fw)
Â{
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index 16ad666..c4ca796 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -37,6 +37,7 @@

Â#include <linux/types.h>
Â#include <linux/mutex.h>
+#include <linux/cdev.h>
Â#include <linux/virtio.h>
Â#include <linux/completion.h>
Â#include <linux/idr.h>
@@ -514,6 +515,7 @@ struct rproc {
ÂÂÂÂ bool auto_boot;
ÂÂÂÂ struct list_head dump_segments;
ÂÂÂÂ int nb_vdev;
+ÂÂÂ struct cdev char_dev;
Â};

>>>>> Â/**