Am 21.12.2014 um 12:34 schrieb Oded Gabbay:There should be, but when the modules are compiled in, they are loaded based on link order only, if they are in the same group, and the groups are loaded by a pre-defined order.
On 12/21/2014 01:27 PM, Christian KÃnig wrote:
Am 20.12.2014 um 21:46 schrieb Oded Gabbay:
When amdkfd and radeon are compiled inside the kernel image (not as modules),
radeon will load before amdkfd and will set *kfd2kgd to its interface
structure. Therefore, we must not set *kfd2kgd to NULL when amdkfd is loaded
because it will override radeon's initialization and cause kernel BUG.
Signed-off-by: Oded Gabbay <oded.gabbay@xxxxxxx>
You should probably rather fix the dependency between the two modules to get an
determined load order instead of doing such nasty workarounds.
Christian.
The problem is that when modules are compiled inside the kernel, there is NO
determined load order and there is no mechanism to enforce that. If there
is/was such a mechanism, I would of course prefer to use it.
There should be an determined order based on the symbol use, otherwise
initializing most of the kernel modules wouldn't work as expected. For example
radeon depends on the drm module must be loaded before radeon is loaded.
IMO, that's a far-fetched guess. Could you point to something more specific ?
Actually, I don't understand why the kernel doesn't enforce the order
according to the use of exported symbols, like it does with modules.
Yeah, that's indeed rather strange. There must be something in the amdkfd code
which broke that somehow.
Actually no. The preferred order is amd_iommu_v2, amdkfd and radeon last. This is the order that happens when all three are built as modules. More accurately, radeon inits, but its init triggers amdkfd init, which triggers amd_iommu_v2 init. So before radeon reaches its probe stage, all the modules were initialized.
As far as I understand you the desired init order is radeon and amd_iommu_v2
first and then amdkfd, right?
amd_iommu_v2 and amdkfd blacklisted for automatically load and only load amdkfdAs said above, that's ok.
manually?
Well, the normal kernel module handling doesn't work when all modules are compiled in. I'm not a huge expert on this issue so I had Joerg Roedel help me analyze this (thanks Joerg) and he agreed that there is no enforcement of order in this case.
There will always be dependencies between kgd (radeon) and amdkfd and between
amdkfd and amd_iommu_v2. I don't think I can eliminate those dependencies, not
without a very complex solution. And the fact that this complex solution
occurs only in a very specific use case (all modules compiled in), makes me
less inclined to do that.
So I don't see it as a "nasty workaround". I would call it just a "workaround"
for a specific use case, which should be solved by a generic solution to the
kernel enforcing load orders.
The normal kernel module handling already should provide the correct init order,
so I would still call this a rather nasty workaround because we couldn't find
the underlying problem.
--
Christian.
Oded
---
drivers/gpu/drm/amd/amdkfd/kfd_module.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 95d5af1..236562f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -34,7 +34,7 @@
#define KFD_DRIVER_MINOR 7
#define KFD_DRIVER_PATCHLEVEL 0
-const struct kfd2kgd_calls *kfd2kgd;
+const struct kfd2kgd_calls *kfd2kgd = NULL;
static const struct kgd2kfd_calls kgd2kfd = {
.exit = kgd2kfd_exit,
.probe = kgd2kfd_probe,
@@ -84,14 +84,13 @@ EXPORT_SYMBOL(kgd2kfd_init);
void kgd2kfd_exit(void)
{
+ kfd2kgd = NULL;
}
static int __init kfd_module_init(void)
{
int err;
- kfd2kgd = NULL;
-
/* Verify module parameters */
if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
(sched_policy > KFD_SCHED_POLICY_NO_HWS)) {