Re: [PATCH 0/4] treewide: fix interrupted release
From: Daniel Vetter
Date: Tue Oct 15 2019 - 10:07:34 EST
On Mon, Oct 14, 2019 at 06:13:26PM +0200, Johan Hovold wrote:
> On Mon, Oct 14, 2019 at 10:48:47AM +0200, Daniel Vetter wrote:
> > On Fri, Oct 11, 2019 at 11:36:33AM +0200, Johan Hovold wrote:
> > > On Thu, Oct 10, 2019 at 03:50:43PM +0200, Daniel Vetter wrote:
> > > > On Thu, Oct 10, 2019 at 03:13:29PM +0200, Johan Hovold wrote:
> > > > > Two old USB drivers had a bug in them which could lead to memory leaks
> > > > > if an interrupted process raced with a disconnect event.
> > > > >
> > > > > > Turns out we had a few more drivers in other subsystems with the same
> > > > > kind of bug in them.
> > >
> > > > Random funny idea: Could we do some debug annotations (akin to
> > > > might_sleep) that splats when you might_sleep_interruptible somewhere
> > > > where interruptible sleeps are generally a bad idea? Like in
> > > > fops->release?
> > >
> > > There's nothing wrong with interruptible sleep in fops->release per se,
> > > it's just that drivers cannot return -ERESTARTSYS and friends and expect
> > > to be called again later.
> > Do you have a legit usecase for interruptible sleeps in fops->release?
> The tty layer depends on this for example when waiting for buffered
> writes to complete (something which may never happen when using flow
> > I'm not even sure killable is legit in there, since it's an fd, not a
> > process context ...
> It will be run in process context in many cases, and for ttys we're good
Huh, read it a bit, all the ->shutdown callbacks have void return type.
But there are indeed interruptible sleeps in there. Doesn't this break
userspace that expects that a close() actually flushes the tty?
Imo if your ->release callback feels like it should do a wait to
guarantee something userspace expects, then doing a
wait_interruptible/killable feels like a bug. Or alternatively, the wait
isn't really needed in the first place.
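To make the failure mode concrete, here is a rough userspace toy, not
kernel code. It assumes a mock caller that, like vfs, invokes release
exactly once and drops the return value. Every name in it (fake_dev,
fake_wait_interruptible, buggy_release, fixed_release, fake_fput,
FAKE_ERESTARTSYS) is made up for illustration only.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in; 512 matches the kernel's ERESTARTSYS value. */
#define FAKE_ERESTARTSYS 512

struct fake_dev {
	bool signal_pending;	/* pretend a signal arrived during the wait */
	void *buf;		/* per-open state that release must free */
};

/* Stand-in for an interruptible wait: fails if a signal is pending. */
static int fake_wait_interruptible(struct fake_dev *dev)
{
	return dev->signal_pending ? -FAKE_ERESTARTSYS : 0;
}

/* Buggy pattern: bail out on a signal and hope to be called again. */
static int buggy_release(struct fake_dev *dev)
{
	int ret = fake_wait_interruptible(dev);

	if (ret)
		return ret;	/* buf is never freed on this path */
	free(dev->buf);
	dev->buf = NULL;
	return 0;
}

/* Fixed pattern: an interrupted wait is not fatal, always clean up. */
static int fixed_release(struct fake_dev *dev)
{
	(void)fake_wait_interruptible(dev);
	free(dev->buf);
	dev->buf = NULL;
	return 0;
}

/* Mock of the caller: a single call, return value ignored. */
static void fake_fput(struct fake_dev *dev, int (*release)(struct fake_dev *))
{
	release(dev);
}
```

With signal_pending set, fake_fput(&dev, buggy_release) leaves dev.buf
allocated and nobody ever retries, while fixed_release frees it
regardless of how the wait ended.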
> > > The return value from release() is ignored by vfs, and adding a splat in
> > > __fput() to catch these buggy drivers might be overkill.
> > Ime once you have a handful of instances of a broken pattern, creating a
> > check for it (under a debug option only ofc) is very much justified.
> > Otherwise they just come back to life like the undead, all the time. And
> > there's a _lot_ of fops->release callbacks in the kernel.
> Yeah, you have a point.
> But take tty again as an example, the close tty operation called from
> release() is declared void so there's no propagated return value for vfs
> to check.
> It may even be better to fix up the 100 or so callbacks potentially
> returning non-zero and make fops->release void so that the compiler
> would help us catch any future bugs and also serve as a hint for
> developers that returning errnos from fops->release is probably not
> what you want to do.
> But that's a lot of churn of course.
Hm indeed ->release has int as return type. I guess that's needed for
file I/O errno and similar stuff ...
Still, a void return value doesn't catch funny stuff like doing
interruptible waits and occasionally failing if you have a process that
likes to use signals and also uses some library somewhere to do something.
In graphics we have that, with Xorg loving signals for various things.
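For the debug annotation idea, a very rough userspace sketch of what I
have in mind. It is purely hypothetical: no such hook exists as far as
I know, in_release would have to be per-task state rather than a
global, and legit users like the tty layer would need some way to opt
out. All names here are invented for the example.

```c
#include <stdbool.h>
#include <stdio.h>

static bool in_release;	/* hypothetical "inside ->release" marker */
static int splat_count;	/* number of warnings emitted so far */

/* Hypothetical might_sleep_interruptible(): warn if used under release. */
static void fake_might_sleep_interruptible(const char *where)
{
	if (in_release) {
		splat_count++;
		fprintf(stderr, "WARNING: interruptible sleep in release: %s\n",
			where);
	}
}

/* Stand-in for an interruptible wait that carries the annotation. */
static int fake_wait_interruptible(bool signal_pending)
{
	fake_might_sleep_interruptible(__func__);
	return signal_pending ? -1 : 0;
}

/* A release callback that does an interruptible wait. */
static int noisy_release(void)
{
	return fake_wait_interruptible(true);
}

/* A release callback that never sleeps interruptibly. */
static int quiet_release(void)
{
	return 0;
}

/* Mock caller: marks the release section, ignores the return value. */
static void fake_fput(int (*release)(void))
{
	in_release = true;
	(void)release();
	in_release = false;
}
```

The point is that the check fires on the sleep itself, so it also
catches callbacks with a void return type, which a check on the
return value in the caller never could.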
Software Engineer, Intel Corporation