Re: [PATCH v2 2/7] drm/vkms: Add support for multy-planar framebuffers

From: Louis Chauvet
Date: Thu Feb 01 2024 - 12:38:53 EST

Next message: Alexandru Elisei: "Re: [PATCH RFC v3 31/35] khugepaged: arm64: Don't collapse MTE enabled VMAs"
Previous message: Alexandru Elisei: "Re: [PATCH RFC v3 30/35] arm64: mte: ptrace: Handle pages with missing tag storage"
Next in thread: Arthur Grillo: "Re: [PATCH v2 2/7] drm/vkms: Add support for multy-planar framebuffers"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hello,

I think there are some bugs in this implementation of multi-planar
support (and not mylty-planar, there is a typo in the title), especially
for the "new" drm_format_info description which uses block_w and block_h.

I will propose two patches [1] solving these issues and hopefully also
simplifying a bit the composition.

TBH I'm not an expert, it's my first ever contribution in DRM, so please
don't hesitate to correct me if you thin I missunderstood something, it
actually took me a bit of time to fully understand the whole series and
how it interacted with the rest of the vkms driver.

> -static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
> +static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y, size_t index)
> {
> struct drm_framebuffer *fb = frame_info->fb;
>
> - return fb->offsets[0] + (y * fb->pitches[0])
> - + (x * fb->format->cpp[0]);
> + return fb->offsets[index] + (y * fb->pitches[index])
> + + (x * fb->format->cpp[index]);
> }
>
> /*
> @@ -23,27 +23,25 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
> * @frame_info: Buffer metadata
> * @x: The x(width) coordinate of the 2D buffer
> * @y: The y(Heigth) coordinate of the 2D buffer
> + * @index: The index of the plane on the 2D buffer
> *
> * Takes the information stored in the frame_info, a pair of coordinates, and
> - * returns the address of the first color channel.
> - * This function assumes the channels are packed together, i.e. a color channel
> - * comes immediately after another in the memory. And therefore, this function
> - * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
> + * returns the address of the first color channel on the desired index.
> */
> static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
> - int x, int y)
> + int x, int y, size_t index)
> {
> - size_t offset = pixel_offset(frame_info, x, y);
> + size_t offset = pixel_offset(frame_info, x, y, index);
>
> return (u8 *)frame_info->map[0].vaddr + offset;
> }

This implementation of packed_pixels_addr will only work with
block_w == block_h == 1. For packed or tiled formats we will need to use
x/y information to extract the correct address, and this address will not
be a single pixel. See below my explanation.

[...]

> @@ -130,17 +128,28 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
> {
> struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
> struct vkms_frame_info *frame_info = plane->frame_info;
> - u8 *src_pixels = get_packed_src_addr(frame_info, y);
> + const struct drm_format_info *frame_format = frame_info->fb->format;
> int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
> + u8 *src_pixels[DRM_FORMAT_MAX_PLANES];
>
> - for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
> + for (size_t i = 0; i < frame_format->num_planes; i++)
> + src_pixels[i] = get_packed_src_addr(frame_info, y, i);
> +
> + for (size_t x = 0; x < limit; x++) {
> int x_pos = get_x_position(frame_info, limit, x);
>
> - if (drm_rotation_90_or_270(frame_info->rotation))
> - src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
> - + frame_info->fb->format->cpp[0] * y;
> + if (drm_rotation_90_or_270(frame_info->rotation)) {
> + for (size_t i = 0; i < frame_format->num_planes; i++) {
> + src_pixels[i] = get_packed_src_addr(frame_info,
> + x + frame_info->rotated.y1, i);
> + src_pixels[i] += frame_format->cpp[i] * y;

I find the current rotation management a bit complex to understand. This
is not related to your patch, but as I had to understand this to create my
second patch, I think this could be significanlty simplified.

Please see the below comment about frame_format->cpp, it applies here too.
I think the "easy" way here is simply to reuse the method
get_packed_src_addr every time you need a new pixel.

> + }
> + }
>
> plane->pixel_read(src_pixels, &out_pixels[x_pos]);
> +

The usage of cpp and pointer to specific pixel only work for non-packed
and non-blocked pixels, but for example NV30 or Y0L0 need more
informations about the exact location of the pixel to convert and write
the correct pixel value (each pixel can't be referenced directly by a
pointer). For example NV30 uses 5 bytes to store 3 pixels (10 bits each),
so to access the "middle" one you need to read the 5 bytes and do a small
computation to extract it's value.

I think a simple solution to handle most cases would be to profide two
more parameters: the x and y positions of the pixel to copy, using
"absolute coordinates" (i.e x=0,y=0 means the first byte of the src
buffer, not the first pixel in the `drm_rect src`, this way the method
`pixel_read` can extract the correct value).

This way it become easy to manage "complex" pixel representations in this
loop: simply increment x/y and let the pixel_read method handle
everything.

The second patch I will send is doing this. And as explained before, it
will also simplify a lot the code related to rotation and translation (no
more switch case everywhere to add offset to x/y, it simply use drm_rect_*
helpers).

It's not optimal in term of performance (in some situation it will read
the same block multiple time to generate different pixels), but I
believe it still is an intersting trade-off.

In the future, if performance is actally critical, the whole composition
loop will have to be specialized for each pixel formats: some can be
treated line by line (as it's done today), but with blocks or packed
pixels it's more complex.

> + for (size_t i = 0; i < frame_format->num_planes; i++)
> + src_pixels[i] += frame_format->cpp[i];

This is likely working with format with block_w != 1, see explanation
above.

[...]

[1]: https://lore.kernel.org/dri-devel/20240201-yuv-v1-0-3ca376f27632@xxxxxxxxxxx/T/#t

--
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

Next message: Alexandru Elisei: "Re: [PATCH RFC v3 31/35] khugepaged: arm64: Don't collapse MTE enabled VMAs"
Previous message: Alexandru Elisei: "Re: [PATCH RFC v3 30/35] arm64: mte: ptrace: Handle pages with missing tag storage"
Next in thread: Arthur Grillo: "Re: [PATCH v2 2/7] drm/vkms: Add support for multy-planar framebuffers"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]