Re: [RFC PATCH 2/4] DRM: Add support of AI Processor Unit (APU)

From: Christian König
Date: Thu Sep 23 2021 - 02:17:52 EST


Am 23.09.21 um 02:58 schrieb Dave Airlie:
On Sat, 18 Sept 2021 at 07:57, Alexandre Bailon <abailon@xxxxxxxxxxxx> wrote:
Some Mediatek SoC provides hardware accelerator for AI / ML.
This driver provides the infrastructure to manage memory
shared between host CPU and the accelerator, and to submit
jobs to the accelerator.
The APU itself is managed by remoteproc so this drivers
relies on remoteproc to found the APU and get some important data
from it. But, the driver is quite generic and it should possible
to manage accelerator using another ways.
This driver doesn't manage itself the data transmitions.
It must be registered by another driver implementing the transmitions.

Signed-off-by: Alexandre Bailon <abailon@xxxxxxxxxxxx>
[SNIP]

Please refer to
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.kernel.org%2Fdoc%2FDocumentation%2Fioctl%2Fbotching-up-ioctls.rst&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C53a0ef2630404ddc4d9408d97e2d409c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637679555123878415%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=6oVXAAOjQX%2FnDzJxZIAALqjDourHdrdGF6QVQKR58KI%3D&amp;reserved=0

here and below in many places.

There's a lot of missing padding/alignment here.

There is also the pahole utility which show you nicely where you need padding for your IOCTL structures.

For example "pahole drivers/gpu/drm/amd/amdgpu/amdgpu.ko -C drm_amdgpu_gem_va" gives you:

struct drm_amdgpu_gem_va {
    __u32                      handle;               /*     0     4 */
    __u32                      _pad;                 /*     4     4 */
    __u32                      operation;            /*     8     4 */
    __u32                      flags;                /*    12     4 */
    __u64                      va_address;           /*    16     8 */
    __u64                      offset_in_bo;         /*    24     8 */
    __u64                      map_size;             /*    32     8 */

    /* size: 40, cachelines: 1, members: 7 */
    /* last cacheline: 40 bytes */
};

And as you can see we have added the _pad field to our IOCTL parameter structure to properly align the 64bit members.

Regards,
Christian.


I'm trying to find the time to review this stack in full, any writeups
on how this is used from userspace would be useful (not just the code
repo, but some sort of how do I get at it) it reads as kinda generic
(calling it apu), but then has some specifics around device binding.

Dave.