Re: [PATCH] usb: gadget: f_fs: add "zombie" mode
From: Felipe Balbi
Date: Tue Oct 07 2014 - 11:28:48 EST
Hi,

On Tue, Oct 07, 2014 at 05:01:15PM +0200, Krzysztof Opasiak wrote:
> > > > Hi,
> > > >
> > > > On Mon, Oct 06, 2014 at 01:25:14PM +0200, Robert Baldyga wrote:
> > > >> Since we can compose gadgets from many functions, there is a
> > > >> problem with the gadget breaking when the FunctionFS daemon is
> > > >> closed. In some cases it's strongly desired to keep the gadget
> > > >> alive for a while, even though the FunctionFS files are closed,
> > > >> to allow other functions to complete some presumably critical
> > > >> operations.
> > > >>
> > > >> For this purpose this patch introduces "zombie" mode. It can be
> > > >> enabled by setting the mount option "zombie=1", and results in
> > > >> deferring function closure until the ep0 file is reopened or
> > > >> the filesystem is unmounted.
> > > >>
> > > >> When ffs->state == FFS_ZOMBIE:
> > > >> - the function is still bound and visible to the host,
> > > >> - setup requests are automatically stalled,
> > > >> - all other transfers are refused,
> > > >> - opening ep0 causes the function to close, after which
> > > >>   FunctionFS is ready for descriptors and strings to be written,
> > > >> - unmounting functionfs causes the function to close.
> > > >>
> > > >> Signed-off-by: Robert Baldyga <r.baldyga@xxxxxxxxxxx>
> > > >
> > > > Can you further explain how you trigger this? Do I understand
> > > > correctly that you composed a gadget using configfs and that
> > > > gadget has functionfs + another gadget?
> > > >
> > >
> > > Yes, I compose a configfs gadget from functionfs + another gadget,
> > > and when the functionfs daemon closes the ep files, the entire
> > > gadget gets disconnected from the host. The FFS function is
> > > userspace code, so there is no way to know when it will close the
> > > files (the reason doesn't matter: it can be daemon logic, program
> > > breakage, process kill or anything else). So when we have another
> > > function in the gadget which, for example, sends some amount of
> > > data, does some software update or implements some real-time
> > > functionality, we may want to keep the gadget connected even
> > > though the FFS function is no longer functional. We can't just
> > > remove one of the functions from the gadget since it has been
> > > enumerated, so the only way we can do that is to make the broken
> > > FFS function a "zombie". It will still be visible to the host but
> > > it will no longer implement its functionality.
> >
> > now that's an explanation. Can you update commit log with some of
> > this info (once we agree on how to go about fixing this) ?
> >
> > I'm not sure we should try to fix this. The only case where this
> > could trigger is if ffs daemon crashes and dies or somebody sends a
> > bogus signal to kill it.
> >
> > A function cannot communicate with the host if it isn't functional
> > and ffs depends on its userland daemon. If daemon is crashing, why
> > not print a big WARN("closed %s while connected to host\n") ? That
> > seems like it's as much as we can do from the kernel. Userland
> > should know that they can't have a buggy ffs daemon.
>
> It's not a problem of a buggy ffs daemon. The problem is that there are
> some non-deterministic mechanisms in userspace like the OOM killer. An
> FFS daemon can be written very well, but if we are out of memory it may
> become a victim. In this case the reliability of the whole gadget
> suffers a lot.
>
> As for WARN(), I'm not enthusiastic about it. Userspace processes die
> all the time, that's quite normal ;) I don't think it is a good idea to
> generate a warning at kernel level when some process dies. The kernel
> should be resilient to such situations and know how to deal with them
> (maybe the user could select the exact behavior, but it should be
> done on the kernel side)

yeah, and the way to deal with that is disconnecting from the host
because that USB function can't be functional anymore. I mean, imagine
you try to e.g. unload pictures from your nice DSLR and that DSLR runs
Linux and implements MTP or PTP using FFS. Then ptpd dies and you're
still connected to the host so you can't know that something went wrong,
the camera just stopped sending you data. So you figure: well, it must
just be slow, I'll leave it here and go have a nap. Hours later and
nothing has changed, because ptpd is still missing.

If you disconnect from the host, however, the user knows instantaneously
that something went wrong.

I don't think maintaining a "zombie" function is very nice. In fact, the
very reason for adding usb_function_activate/deactivate was exactly to
prevent us from ever connecting to a host with a non-working function.
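
To make the activate/deactivate point concrete, here is a rough userspace
model of the idea (not the kernel source): each function that isn't ready
bumps a counter, and the gadget is only visible to the host while that
counter is zero. The struct and the model_* names are made up for this
sketch; the real helpers are usb_function_activate() and
usb_function_deactivate() in the composite framework, and their details
may differ from what is shown here.

```c
/* Simplified model of deactivation counting; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the composite device state. */
struct model_gadget {
	unsigned int deactivations;	/* functions currently not ready */
	bool connected;			/* true while visible to the host */
};

/* A function calls this when it cannot serve the host (e.g. its daemon is gone). */
void model_function_deactivate(struct model_gadget *g)
{
	if (g->deactivations == 0)
		g->connected = false;	/* first non-ready function: drop off the bus */
	g->deactivations++;
}

/* A function calls this once it is able to serve the host again. */
void model_function_activate(struct model_gadget *g)
{
	assert(g->deactivations > 0);	/* unbalanced activate would be a bug */
	g->deactivations--;
	if (g->deactivations == 0)
		g->connected = true;	/* every function is ready: reconnect */
}
```

In this model a gadget with two deactivated functions stays disconnected
until both have called model_function_activate(), which is the behaviour
a "zombie" function would sidestep.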
--
balbi