Re: [PATCH v2] firmware: fix sending -ERESTARTSYS due to signal on fallback
From: Luis R. Rodriguez
Date: Tue Jun 06 2017 - 13:53:15 EST
Used wrong alias for fsdevel now, its linux-fsdevel ...
Luis
On Tue, Jun 06, 2017 at 06:34:01PM +0200, Luis R. Rodriguez wrote:
> Adding fsdevel for review on the correct semantics of handling signals on
> write(), in this case a sysfs write which triggered a sync request firmware
> call and what the firmware API should return in such case of a signal (I gather
> this should be -EINTR and not -ERESTARTSYS). Also whether or not SIGINT should
> be followed or if only allowing SIGKILL is fine (fine by me, but it would
> change old behaviour).
>
> Hoping between fsdevel and linux-api folks we can hash this out.
>
> On Tue, Jun 06, 2017 at 11:04:37AM +0200, Martin Fuzzey wrote:
> > On 05/06/17 22:24, Luis R. Rodriguez wrote:
> > >
> > >
> > > For these two reasons then it would seem best we do two things actually:
> > >
> > > 1) return -EINTR instead of -EAGAIN when we detect swait_event_interruptible_timeout()
> > > got interrupted by a signal (it returns -ERESTARTSYS)
> >
> >
> > I disagree. That would force userspace to handle the signal rather than
> > having the kernel retry.
> >
> > From Documentation/DocBook/kernel-hacking.tmpl:
> >
> > After you slept you should check if a signal occurred: the
> > Unix/Linux way of handling signals is to temporarily exit the
> > system call with the <constant>-ERESTARTSYS</constant> error. The
> > system call entry code will switch back to user context, process
> > the signal handler and then your system call will be restarted
> > (unless the user disabled that). So you should be prepared to
> > process the restart, e.g. if you're in the middle of manipulating
> > some data structure.
>
> This applies but you are missing my point that the LWN article [0] I referred
> to also stated "Kernel code which uses interruptible sleeps must always check
> to see whether it woke up as a result of a signal, and, if so, clean up
> whatever it was doing and return -EINTR back to user space." -- I realize there
> may be contradiction with above documentation -- this perhaps can be clarified
> with fsdevel folks *but* regardless of that the same article notes Alan Cox
> explains that "Unix tradition (and thus almost all applications) believe file
> store writes to be non signal interruptible. It would not be safe or practical
> to change that guarantee." So for this reason alone there does seem to be an
> exemption to the above documentation worth noting for file store writes, and
> the patch which you tested below *moves* the sysfs write op for firmware in
> that direction by adding a new killable swait.
>
> [0] https://lwn.net/Articles/288056/
>
> > > 2) Do as you note below and add wait_event_killable_timeout()
> >
> > Hum, I do think that would be better but, (please correct me if I'm wrong)
> > the _killable_ variants only allow SIGKILL (and not SIGINT).
>
> That seems correct given a TASK_KILLABLE is also TASK_UNINTERRUPTIBLE.
>
> > 0cb64249ca "firmware_loader: abort request if wait_for_completion is
> > interrupted"
> >
> > specifically mentrions ctrl-c (SIGINT) in the commit message so that would
> > no longer work.
>
> Great point, but it *also* allowed SIGKILL, so I do feel the goal was also to
> allow it to be killable. I'm afraid that patch probably did not get proper
> review from sufficient folks and its worth now asking ourselves what we'd like
> to do. I'm fine with letting go of SIGINT for firmware sysfs calls for the
> sake of keeping with the long standing unix tradition on write, given we *still
> have SIGKILL*.
>
> > Myself I think having to use kill -9 to interrupt firmware loading by a
> > usespace helper is OK but others may disagree.
>
> Its why I added fsdevel as well. This is really a semantics and uapi question.
> Between fsdevel and linux-api folks I would hope we can come to a sensible
> resolution.
>
> > > I do not see why we could not introduce wait_event_killable_timeout()
> > > and swait_event_killable_timeout() into -stables.
> > > After seeing how simple it is to do so I tend to agree. Greg, Peter,
> > > what are your thoughts ?
> > >
> > > Martin Fuzzey can you test this patch as an alternative to your issue ?
> > >
> > > diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> > > index b9f907eedbf7..70fc42e5e0da 100644
> > > --- a/drivers/base/firmware_class.c
> > > +++ b/drivers/base/firmware_class.c
> > > @@ -131,7 +131,7 @@ static int __fw_state_wait_common(struct fw_state *fw_st, long timeout)
> > > {
> > > long ret;
> > > - ret = swait_event_interruptible_timeout(fw_st->wq,
> > > + ret = swait_event_killable_timeout(fw_st->wq,
> > > __fw_state_is_done(READ_ONCE(fw_st->status)),
> > > timeout);
> > > if (ret != 0 && fw_st->status == FW_STATUS_ABORTED)
> > > diff --git a/include/linux/swait.h b/include/linux/swait.h
> > > index c1f9c62a8a50..9c5ca2898b2f 100644
> > > --- a/include/linux/swait.h
> > > +++ b/include/linux/swait.h
> > > @@ -169,4 +169,29 @@ do { \
> > > __ret; \
> > > })
> > > +#define __swait_event_killable(wq, condition) \
> > > + (void)___swait_event(wq, condition, TASK_KILLABLE, 0, schedule())
> > > +
> > > +#define swait_event_killable(wq, condition) \
> > > +({ \
> > > + int __ret = 0; \
> > > + if (!(condition)) \
> > > + __ret = __swait_event_killable(wq, condition); \
> > > + __ret; \
> > > +})
> > > +
> > > +#define __swait_event_killable_timeout(wq, condition, timeout) \
> > > + ___swait_event(wq, ___wait_cond_timeout(condition), \
> > > + TASK_INTERRUPTIBLE, timeout, \
> > > + __ret = schedule_timeout(__ret))
> > > +
> >
> > Should be TASK_KILLABLE above
>
> Oops yes sorry.
>
> > > +#define swait_event_killable_timeout(wq, condition, timeout) \
> > > +({ \
> > > + long __ret = timeout; \
> > > + if (!___wait_cond_timeout(condition)) \
> > > + __ret = __swait_event_killable_timeout(wq, \
> > > + condition, timeout); \
> > > + __ret; \
> > > +})
> > > +
> > > #endif /* _LINUX_SWAIT_H */
> > >
> > > Luis
> >
> > After replacing TASK_INTERRUPTIBLE with TASK_KILLABLE above it works for me.
>
> Great, thanks for testing.
>
> Luis
>
--
Luis Rodriguez, SUSE LINUX GmbH
Maxfeldstrasse 5; D-90409 Nuernberg