Re: [RFC] drm/atomic+msm: add helper to implement legacy dirtyfb

From: Rob Clark
Date: Wed Apr 04 2018 - 09:24:31 EST


On Wed, Apr 4, 2018 at 8:16 AM, Daniel Vetter <daniel@xxxxxxxx> wrote:
> On Wed, Apr 04, 2018 at 07:40:32AM -0400, Rob Clark wrote:
>> On Wed, Apr 4, 2018 at 5:56 AM, Daniel Vetter <daniel@xxxxxxxx> wrote:
>> > On Wed, Apr 04, 2018 at 11:10:08AM +0200, Thomas Hellstrom wrote:
>> >> On 04/04/2018 10:43 AM, Daniel Vetter wrote:
>> >> > On Wed, Apr 04, 2018 at 10:22:21AM +0200, Thomas Hellstrom wrote:
>> >> > > Hi,
>> >> > >
>> >> > > On 04/04/2018 08:58 AM, Daniel Vetter wrote:
>> >> > > > On Wed, Apr 4, 2018 at 12:42 AM, Rob Clark <robdclark@xxxxxxxxx> wrote:
>> >> > > > > Add an atomic helper to implement dirtyfb support. This is needed to
>> >> > > > > support DSI command-mode panels with x11 userspace (ie. when we can't
>> >> > > > > rely on pageflips to trigger a flush to the panel).
>> >> > > > >
>> >> > > > > To signal to the driver that the async atomic update needs to
>> >> > > > > synchronize with fences, even though the fb didn't change, the
>> >> > > > > drm_atomic_state::dirty flag is added.
>> >> > > > >
>> >> > > > > Signed-off-by: Rob Clark <robdclark@xxxxxxxxx>
>> >> > > > > ---
>> >> > > > > Background: there are a number of different folks working on getting
>> >> > > > > upstream kernel working on various different phones/tablets with qcom
>> >> > > > > SoC's.. many of them have command mode panels, so we kind of need a
>> >> > > > > way to support the legacy dirtyfb ioctl for x11 support.
>> >> > > > >
>> >> > > > > I know there is work on a proprer non-legacy atomic property for
>> >> > > > > userspace to communicate dirty-rect(s) to the kernel, so this can
>> >> > > > > be improved from triggering a full-frame flush once that is in
>> >> > > > > place. But we kinda needa a stop-gap solution.
>> >> > > > >
>> >> > > > > I had considered an in-driver solution for this, but things get a
>> >> > > > > bit tricky if userspace ands up combining dirtyfb ioctls with page-
>> >> > > > > flips, because we need to synchronize setting various CTL.FLUSH bits
>> >> > > > > with setting the CTL.START bit. (ie. really all we need to do for
>> >> > > > > cmd mode panels is bang CTL.START, but is this ends up racing with
>> >> > > > > pageflips setting FLUSH bits, then bad things.) The easiest soln
>> >> > > > > is to wrap this up as an atomic commit and rely on the worker to
>> >> > > > > serialize things. Hence adding an atomic dirtyfb helper.
>> >> > > > >
>> >> > > > > I guess at least the helper, with some small addition to translate
>> >> > > > > and pass-thru the dirty rect(s) is useful to the final atomic dirty-
>> >> > > > > rect property solution. Depending on how far off that is, a stop-
>> >> > > > > gap solution could be useful.
>> >> > > > Adding Noralf, who iirc already posted the full dirty helpers already somewhere.
>> >> > > > -Daniel
>> >> > > I've asked Deepak to RFC the core changes suggested for the full dirty blob
>> >> > > on dri-devel. It builds on DisplayLink's suggestion, with a simple helper to
>> >> > > get to the desired coordinates.
>> >> > >
>> >> > > One thing to perhaps discuss is how we would like to fit this with
>> >> > > front-buffer rendering and the dirty ioctl. In the page-flip context, the
>> >> > > dirty rects, like egl's swapbuffer_with_damage is a hint to restrict the
>> >> > > damage region that can be fully ignored by the driver, new content is
>> >> > > indicated by a new framebuffer.
>> >> > >
>> >> > > We could do the same for frontbuffer rendering: Either set a dirty flag like
>> >> > > you do here, or provide a content_age state member. Since we clear the dirty
>> >> > > flag on state copies, I guess that would be sufficient. The blob rectangles
>> >> > > would then become a hint to restrict the damage region.
>> >> > I'm not entirely following here - I thought for frontbuffer rendering the
>> >> > dirty rects have always just been a hint, and that the driver was always
>> >> > free to re-upload the entire buffer to the screen.
>> >> >
>> >> > And through a helper like Rob's proposing here (and have floated around in
>> >> > different versions already) we'd essentially map a frontbuffer dirtyfb
>> >> > call to a fake flip with dirty rect. Manual upload drivers already need to
>> >> > upload the entire screen if they get a flip, since some userspace uses
>> >> > that to flush out frontbuffer rendering (instead of calling dirtyfb).
>> >> >
>> >> > So from that pov the new dirty flag is kinda not necessary imo.
>> >> >
>> >> > > Another approach would be to have the presence of dirty rects without
>> >> > > framebuffer change to indicate frontbuffer rendering.
>> >> > >
>> >> > > I think I like the first approach best, although it may be tempting for
>> >> > > user-space apps to just set the dirty bit instead of providing the full
>> >> > > damage region.
>> >> > Or I'm not following you here, because I don't quite see the difference
>> >> > between dirtyfb and a flip.
>> >> > -Daniel
>> >>
>> >> OK, let me rephrase:
>> >>
>> >> From the driver's point-of-view, in the atomic world, new content and the
>> >> need for manual upload is indicated by a change in fb attached to the plane.
>> >>
>> >> With Rob's patch here, (correct me if I'm wrong) in addition new content and
>> >> the need for manual upload is identified by the dirty flag, (since the fb
>> >> stays the same and the driver thus never identifies a page-flip).
>> >
>> > Hm, I'm not entirely sure Rob's approach is correct. Imo a pageflip
>> > (atomic or not) should result in the entire buffer getting uploaded. The
>> > dirty flag is kinda redundant, a flip with the same buffer works the same
>> > way as a dirtyfb with the entire buffer as the dirty rectangle.
>>
>> Userspace could do a pageflip, but (with buffer-age extension) only
>> re-render part of the frame. So there is a use-case for full frame
>> pageflip with hints about what region(s) changed since previous frame.
>>
>> (Of course userspace could screw that up and get different results on
>> "command" vs "video" style display.. in that case, it gets to keep
>> both pieces)
>
> I'm not against dirty on flips with the same buffer as an optimization.
> Disagreement here is around whether a flip to the same buffer means:
> a) nothing changed, upload nothing
> b) entire buffer might have changed
>
> I'm proposing we standardize on b) (with maybe the exception of legacy
> cursor ioctls because those are special) and add the dirty rects as an
> option to optimize this more.

ok, I can agree w/ flip to same buffer, if FB_ID is in the incoming
property list.. I just want a way to differentiate that from "plane
got added to new state for unrelated reasons".. so new_fb != old_fb
isn't a good test for "should we synchronize", but new plane state
isn't either.

Keeping the dirty flag (or sequence count or whatever mechanism) to
indicate that an FB_ID that happened to be the same fb was part of
what userspace passed in would do the trick.

BR,
-R

> Either way you'd always have to follow the implicit fence (which is what
> your msm hunk seems to be doing, but only for dirtyfb).
> -Daniel
>
>>
>> BR,
>> -R
>>
>>
>> >> In both these situations, clip rects can provide a hint to restrict the
>> >> dirty region.
>> >>
>> >> Now I was thinking about the preferred way for user-space to communicate
>> >> front buffer rendering through the atomic ioctl:
>> >>
>> >> 1) Expose a dirty (or content_age property)
>> >> 2) Attach a clip-rect blob property.
>> >> 3) Fake a page-flip by ping-ponging two frame-buffers pointing to the same
>> >> underlying buffer object.
>> >>
>> >> Are you saying that people are already using 3) and we should keep using
>> >> that?
>> >
>> > I'm saying they're using 3b), flip the same buffer wrapped in the same
>> > drm_framebuffer, and expect it to work.
>> >
>> > The only advantage dirtyfb has is that it allows you to supply the
>> > optimized upload rectangles, but at the cost of a funny api (it's working
>> > on the fb, not the plane/crtc you want to upload) and lack of drm_event to
>> > confirm when exactly you uploaded your stuff. But imo they should be the
>> > same underlying operation.
>> >
>> > Also note that atomic helpers don't optimize out plane flips for same
>> > buffers. We only optimize out some of the waiting, in a failed attempt at
>> > making cursors stall less, but that's not fixed with the async plane
>> > update stuff. And we can obviously optimize out the prepare/cleanup hooks,
>> > because the buffer should be pinned already.
>> >
>> > So imo for a quick fix I think we need:
>> > 1) Fix drivers (or at least msm) to upload buffers for manual upload on
>> > all flips (well, all screen updates). Probably should do the fence waiting
>> > unconditionally too (and handle cursors with the new async stuff).
>> > 2) Tiny helper that remaps a dirtyfb call into a nonblocking atomic
>> > commit, which currently means we're throwing the dirty rect optimization
>> > into the wind. But that could easily be added to the drm_plane_state,
>> > without exposing it to userspace as a property just yet.
>> >
>> > Cheers, Daniel
>> > --
>> > Daniel Vetter
>> > Software Engineer, Intel Corporation
>> > http://blog.ffwll.ch
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch