Re: [PATCH] fbdev: wait for references go away
From: Daniel Vetter
Date: Tue Jan 28 2020 - 11:58:42 EST
On Tue, Jan 28, 2020 at 5:44 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
>
> On Tue, Jan 28, 2020 at 5:39 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
> >
> > On Mon, Jan 20, 2020 at 11:00 AM Gerd Hoffmann <kraxel@xxxxxxxxxx> wrote:
> > >
> > > Problem: do_unregister_framebuffer() might return before the device is
> > > fully cleaned up, due to userspace having a file handle for /dev/fb0
> > > open. Which can result in drm driver not being able to grab resources
> > > (and fail initialization) because the firmware framebuffer still holds
> > > them. Reportedly plymouth can trigger this.
> > >
> > > Fix this by trying to wait until all references are gone. Don't wait
> > > forever though given that userspace might keep the file handle open.
> > >
> > > Reported-by: Marek Marczykowski-GÃrecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > > Signed-off-by: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> >
> > (Missed this because lca, so a bit late)
> >
> > This isn't really how driver unload is supposed to happen. Instead:
> >
> > - Driver unload starts
> > - Driver calls the foo_unregister function, which stops new userspace
> > from getting at the driver. If you're subsystem is good (i.e. drm
> > since Noralf fixed it) this will also sufficiently synchronize with
> > any pending ioctl.
> > - Important: This does _not_ wait until userspace closes all
> > references. You can't force that.
> > - Driver releases all hw structures and mappings and everything else.
> > With fbdev this is currently not fully race free because no one is
> > synchronizing with userspace everywhere correctly.
> >
> > ... much time can pass ...
> >
> > - Userspace releases the last references, which triggers the final
> > destroy stuff and which releases the memory occupied by various
> > structures still (but not anything releated to hw or anything else
> > really).
> >
> > So there's two bits:
> >
> > 1. Synchronizing with pending ioctls. This is mostly there already
> > with lock_fb_info/unlock_fb_info. From a quick look the missing bit
> > seems to be that the unregister code is not taking that lock, and so
> > not sufficiently synchronizing against concurrent ioctl calls and
> > other stuff. Plus would need to audit all entry points.
>
> Correction: The check here is file_fb_info(), which checks for
> unregister. Except it's totally racy and misses the end marker (unlike
> drm_dev_enter/exit in drm). So bunch of work to do here too. The
> lock_fb_info is purely locking, not lifetime (and I think in a bunch
> of places way too late).
Oh and since your patch is full of races too I expect you'll be able
to get away with just adding the ->fb_remove hook, fix up drivers, and
leave fixing the various races to someone else.
-Daniel
>
> > 1a. fbcon works differently. Don't look too closely, but this is also
> > not the problem your facing here.
> >
> > 2. Refcounting of the fb structure and hw teardown. That's what's
> > tracked in fb_info->count. Most likely the fbdev driver you have has a
> > wrong split between the hw teardown code and what's in fb_destroy. If
> > you have any hw cleanup code in fb_destroy that driver is buggy. efifb
> > is very buggy in that area :-) Same for offb, simplefb, vesafb and
> > vesa16fb.
> >
> > We might need a new fb_unregister callback for these drivers to be
> > able to fix this properly. Because the unregister comes from the fbdev
> > core, and not the driver as usual, so the usual driver unload sequence
> > doesnt work:
> >
> > drm_dev_unregister();
> > ... release all hw resource ...
> >
> > drm_dev_put();
> >
> > Or in terms of fbdev:
> >
> > unregister_framebuffer(info);
> > ... release all hw resources ... <- everyone gets this wrong
> > framebuffer_release(info); <- also wrong because not refcounted,
> > hooray, this should be moved to to end of the ->fb_destroy callback
> >
> > So we need a callback to put the "release all hw resources" step into
> > the flow at the right place. Another option (slightly less midlayer)
> > would be to add a fb_takeover hook, for these platforms drivers, which
> > would then do the above sequence (like at driver unload).
> >
> > Also adding Noralf, since he's fixed up all the drm stuff in this area
> > in the past.
> >
> > Cheers, Daniel
> >
> > > ---
> > > drivers/video/fbdev/core/fbmem.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> > > index d04554959ea7..2ea8ac05b065 100644
> > > --- a/drivers/video/fbdev/core/fbmem.c
> > > +++ b/drivers/video/fbdev/core/fbmem.c
> > > @@ -35,6 +35,7 @@
> > > #include <linux/fbcon.h>
> > > #include <linux/mem_encrypt.h>
> > > #include <linux/pci.h>
> > > +#include <linux/delay.h>
> > >
> > > #include <asm/fb.h>
> > >
> > > @@ -1707,6 +1708,8 @@ static void unlink_framebuffer(struct fb_info *fb_info)
> > >
> > > static void do_unregister_framebuffer(struct fb_info *fb_info)
> > > {
> > > + int limit = 100;
> > > +
> > > unlink_framebuffer(fb_info);
> > > if (fb_info->pixmap.addr &&
> > > (fb_info->pixmap.flags & FB_PIXMAP_DEFAULT))
> > > @@ -1726,6 +1729,10 @@ static void do_unregister_framebuffer(struct fb_info *fb_info)
> > > fbcon_fb_unregistered(fb_info);
> > > console_unlock();
> > >
> > > + /* try wait until all references are gone */
> > > + while (atomic_read(&fb_info->count) > 1 && --limit > 0)
> > > + msleep(10);
> > > +
> > > /* this may free fb info */
> > > put_fb_info(fb_info);
> > > }
> > > --
> > > 2.18.1
> > >
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch