Re: [PATCH v5 4/4] drm/fence: add out-fences support
From: Daniel Vetter
Date: Fri Oct 21 2016 - 08:57:47 EST
On Thu, Oct 20, 2016 at 10:15:20PM +0300, Ville Syrjälä wrote:
> On Thu, Oct 20, 2016 at 07:34:44PM +0300, Ville Syrjälä wrote:
> > On Thu, Oct 20, 2016 at 01:55:38PM -0200, Gustavo Padovan wrote:
> > > 2016-10-20 Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>:
> > >
> > > > On Thu, Oct 20, 2016 at 12:50:05PM -0200, Gustavo Padovan wrote:
> > > > > From: Gustavo Padovan <gustavo.padovan@xxxxxxxxxxxxxxx>
> > > > >
> > > > > Support DRM out-fences by creating a sync_file with a fence for each CRTC
> > > > > that sets the OUT_FENCE_PTR property.
> > > >
> > > > I still maintain the out fence should also be per fb (well, per plane
> > > > since we can't add it to fbs). Otherwise it's not going to be at all
> > > > useful if you do fps>vrefresh and don't include the same set of planes
> > > > in every atomic ioctl, eg. if you only ever include ones that are
> > > > somehow dirty.
> > >
> > > How would the kernel signal these dirty planes then? Right now we signal
> > > at the vblank.
> >
> > So if I think about it in terms of the previous fbs something like this
> > comes to mind:
> >
> > starting point:
> > plane a, fb 0
> > plane b, fb 1
> >
> > ioctl:
> > plane a, fb 2, fence A
> > plane b, fb 3, fence B
> >
> > ioctl:
> > plane a, fb 4, fence C
> > -> fb 2 can be reused, so fence C should signal immediately?
> >
> > vblank:
> > -> fb 0,1 can be reused, so fence A,B signal?
> >
> > It feels a bit wonky since the fence is effectively associated with the
> > previous fb after the previous fb was already submitted to the kernel.
> > One might assume fence A to be the one signalled early, but that would
> > give the impression that fb 0 would be free for reuse when it's not.
>
> OK, so after hashing this out on irc with Gustavo and Brian, we came
> to the conclusion that the per-crtc out fence should be sufficient
> after all.
>
> The way it can work is that the first ioctl will return the fence,
> doesn't really matter how many of the planes are involved in this
> ioctl. Subsequent ioctls that manage to get in before the next
> vblank will get back a -1 as the fence, even if the set if planes
> involved is not the same as in the first ioctl. From the -1
> userspace can tell what happened, and can then consult its own
> records to see which still pending flips were overwritten, and
> thus knows which buffers can be reused immediately.
>
> If userspace plans on sending out the now free buffers to someone
> else accompanied by a fence, it can just create an already signalled
> fence to send instead of sending the fence it got back from the
> atomic ioctl since that one is still associated with the original
> scanout buffers.
>
> The fence returned by the atomic ioctl will only signal after the
> vblank, at which point userspace will obviously know that the
> orginal scanout buffers are now also ready for reuse, and that
> the last buffer submitted for each plane is now being actively
> scanned out. So if userspace wants to send out some of the
> original buffers to someone else, it can send along the fence
> returned from the atomic ioctl.
>
> So requires a bit more work from userspace perhaps. But obviously
> it only has to do this if it plans on submitting multiple operations
> per frame.
>
> So a slightly expanded version of my previous example might look like:
> starting point:
> plane a, fb 0
> plane b, fb 1
>
> ioctl:
> crtc: fence A
> plane a, fb 2
> plane b, fb 3
>
> ioctl:
> crtc: fence -1
> plane a, fb 4
> -> the previously submitted fb for plane a (fb 2) can be reused
>
> ioctl:
> crtc: fence -1
> plane a, fb 5
> -> the previously submitted fb for plane a (fb 4) can be reused
>
> vblank:
> -> fence A signals, fb 0,1 can be reused
> fb 3 and 5 being scanned out now
>
>
> We also had a quick side track w.r.t. variable refresh rate displays,
> and I think the conclusion was that there the vblank may just happen
> sooner or later. Nothing else should change. What's unclear is how
> the refresh rate would be controlled. The two obvious options are
> explicit control, and automagic based on the submit rate. But that's
> a separate topic entirely.
Thanks a lot for doing this discussion and the detailed write-up. Imo we
should bake this into the kerneldoc, as uabi documentation examples.
Gustavo, can you pls include this? I'd put it into the kerneldoc for
@out_fence, with inline struct comments we can be rather excessive in
their lenght ;-)
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch