Re: [PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

From: Laurent Pinchart
Date: Fri Dec 26 2014 - 04:19:00 EST


Hi Thierry,

On Thursday 25 December 2014 14:20:59 Thierry Reding wrote:
> On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
> > This small patch-set, was created to solve the bug described at
> > https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
> > trying use amdkfd driver on Kaveri). It replaces the previous patch-set
> > called [PATCH 0/3] Use workqueue for device init in amdkfd
> > (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html
> > )
> >
> > That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
> > inside the kernel (not as modules). In that case, the correct loading
> > order, as determined by the exported symbol used by each driver, is
> > not enforced anymore and the kernel loads them based on who was linked
> > first. That makes radeon load first, amdkfd second and amd_iommu_v2
> > third.
> >
> > Because the initialization of a device in amdkfd is initiated by radeon,
> > and can only be completed if amdkfd and amd_iommu_v2 were loaded and
> > initialized, then in the case mentioned above, this initalization fails
> > and there is a kernel panic as some pointers are not initialized but
> > used nontheless.
> >
> > To solve this bug, this patch-set moves iommu/ before gpu/ in
> > drivers/Makefile and also moves amdkfd/ before radeon/ in
> > drivers/gpu/drm/Makefile.
> >
> > The rationale is that in general, AMD GPU devices are dependent on AMD
> > IOMMU controller functionality to allow the GPU to access a process's
> > virtual memory address space, without the need for pinning the memory.
> > That's why it makes sense to initialize the iommu/ subsystem ahead of the
> > gpu/ subsystem.
>
> I strongly object to this patch set. This makes assumptions about how
> the build system influences probe order. That's bad because seemingly
> unrelated changes could easily break this in the future.
>
> We already have ways to solve this kind of dependency (driver probe
> deferral), and I think you should be using it to solve this particular
> problem rather than some linking order hack.

While I agree with you that probe deferral is the way to go, I believe linkage
ordering can still be used as an optimization to avoid deferring probe in the
most common cases. I'm thus not opposed to moving iommu/ earlier in link order
(provided we can properly test for side effects, as the jump is pretty large),
but not as a replacement for probe deferral.

> Coincidentally there's a separate thread currently going on that deals
> with IOMMUs and probe order. The solution being worked on is currently
> somewhat ARM-specific, so adding a couple of folks for visibility. It
> looks like we're going to need something more generic since this is a
> problem that even the "big" architectures need to solve.

--
Regards,

Laurent Pinchart

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/