Re: [PATCH v3 0/2] efi/x86: Call set_os() protocol on dual GPU Macs

From: Aditya Garg
Date: Wed Jul 24 2024 - 12:27:17 EST

> On 24 Jul 2024, at 9:31 PM, Lukas Wunner <lukas@xxxxxxxxx> wrote:
>
> On Tue, Jul 23, 2024 at 04:25:19PM +0000, Aditya Garg wrote:
>>> On Wed, Jul 17, 2024 at 04:35:15PM +0000, Aditya Garg wrote:
>>> For the Macs having a single GPU, in case a person uses an eGPU,
>>> they still need this apple-set-os quirk for hybrid graphics.
>>
>> Sending this message again as for some reason it got sent only to Lukas:
>>
>> Full model name: Mac mini (2018) (Macmini8,1)
>>
>> The drive link below has the logs:
>>
>> https://drive.google.com/file/d/1P3-GlksU6WppvzvWC0A-nAoTZh7oPPxk/view?usp=drive_link
>
> Some observations:
>
> * dmesg-with-egpu.txt: It seems the system was actually booted *without*
> an eGPU, so the filename appears to be a misnomer.
>
> * The two files in the with_apple_set_os_efi directory only contain
> incomplete dmesg output. Boot with log_buf_len=16M to solve this.
> Fortunately the truncated log is sufficient to see what's going on.
>
> * If the apple_set_os protocol is not used, the attached eGPU is not
> enumerated by the kernel on boot and a rescan is required.
> So neither the iGPU nor the eGPU is working. The reason is that the BIOS
> sets up incorrect bridge windows for the Thunderbolt host controller:
> Its two downstream ports' 64-bit windows overlap. The 32-bit windows
> do not overlap. If apple_set_os is used, the eGPU is using the
> (non-overlapping) 32-bit window. If apple_set_os is not used,
> the attached eGPU is using the (overlapping, hence broken) 64-bit window.
>
> So not only is apple_set_os needed to keep the iGPU enabled,
> but also to ensure BIOS sets up bridge windows in a manner that is
> only halfway broken and not totally broken.
>
> Below, 0000:06:01.0 and 0000:06:04.0 are the downstream ports on the
> Thunderbolt host controller and 0000:09:00.0 is the upstream port of
> the attached eGPU.
>
> iGPU enabled, no eGPU attached (dmesg.txt):
> pci 0000:06:01.0: bridge window [mem 0x81900000-0x888fffff]
> pci 0000:06:01.0: bridge window [mem 0xb1400000-0xb83fffff 64bit pref]
> pci 0000:06:04.0: bridge window [mem 0x88900000-0x8f8fffff]
> pci 0000:06:04.0: bridge window [mem 0xb8400000-0xbf3fffff 64bit pref]
>
> iGPU disabled, eGPU attached, apple_set_os not used (journalctl.txt):
> pci 0000:06:01.0: bridge window [mem 0x81900000-0x888fffff]
> pci 0000:06:01.0: bridge window [mem 0xb1400000-0xc6ffffff 64bit pref]
> pci 0000:06:04.0: bridge window [mem 0x88900000-0x8f8fffff]
> pci 0000:06:04.0: bridge window [mem 0xb8400000-0xbf3fffff 64bit pref]
> pci 0000:06:04.0: bridge window [mem 0xb8400000-0xbf3fffff 64bit pref]: can't claim; address conflict with PCI Bus 0000:09 [mem 0xb1400000-0xbf3fffff 64bit pref]
>
> iGPU enabled, eGPU attached, apple_set_os used (working-journalctl.txt):
> pci 0000:06:01.0: bridge window [mem 0x81900000-0x888fffff]
> pci 0000:06:01.0: bridge window [mem 0xb1400000-0xc6ffffff 64bit pref]
> pci 0000:06:04.0: bridge window [mem 0x88900000-0x8f8fffff]
> pci 0000:06:04.0: bridge window [mem 0xb8400000-0xbf3fffff 64bit pref]
> pci 0000:09:00.0: bridge window [mem 0x81900000-0x81cfffff]
>
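
To double-check the overlap claim against the numbers in the logs above, here
is a minimal, untested-in-kernel sketch of an inclusive range overlap test.
struct window and windows_overlap() are hypothetical names used only for this
illustration; they are not kernel APIs. The addresses in the comments are
copied from the quoted bridge window lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, as printed in the "bridge window" log lines. */
struct window {
	uint64_t start;
	uint64_t end;
};

/*
 * Two inclusive ranges overlap if each one starts at or before the
 * point where the other one ends.
 */
static bool windows_overlap(struct window a, struct window b)
{
	/* Sanity check: a window must not end before it starts. */
	assert(a.start <= a.end && b.start <= b.end);

	return a.start <= b.end && b.start <= a.end;
}

/*
 * Values from the eGPU-attached logs:
 *   06:01.0 32-bit: 0x81900000-0x888fffff
 *   06:04.0 32-bit: 0x88900000-0x8f8fffff   (adjacent, no overlap)
 *   06:01.0 64-bit: 0xb1400000-0xc6ffffff
 *   06:04.0 64-bit: 0xb8400000-0xbf3fffff   (contained in the 06:01.0 window)
 */
```

With the values listed in the comment, the check reports no overlap for the
32-bit windows and an overlap for the 64-bit prefetchable windows, which
matches the "can't claim; address conflict" line in journalctl.txt.
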
> * As to how we can solve this and keep using apple_set_os only when
> necessary:
>
> I note that on x86, the efistub walks over all PCI devices in the system
> (see setup_efi_pci() in drivers/firmware/efi/libstub/x86-stub.c) and
> retrieves the Device ID and Vendor ID. We could additionally retrieve
> the Class Code and count the number of GPUs in the system by checking
> whether the Class Code matches PCI_BASE_CLASS_DISPLAY. If there are
> at least 2 GPUs in the system, invoke apple_set_os.

This also looks like a good idea, but I'm not familiar with the PCI quirks in the Linux kernel, so I'd consider this a bug report for the maintainers to fix.
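
For illustration only, the counting logic proposed above could look roughly
like the sketch below. It is plain C operating on already-read config space
values, not efistub code: how the dword at config offset 0x08 would be
obtained in setup_efi_pci() is left out, and count_gpus() and
want_apple_set_os() are hypothetical helper names. PCI_BASE_CLASS_DISPLAY is
redefined locally with the value it has in include/linux/pci_ids.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_BASE_CLASS_DISPLAY 0x03 /* same value as in pci_ids.h */

/*
 * The dword at config offset 0x08 holds the revision ID in bits 7:0 and
 * the 24-bit class code in bits 31:8, so the base class is the top byte.
 */
static uint8_t pci_base_class(uint32_t class_revision)
{
	return class_revision >> 24;
}

/* Hypothetical helper: count display-class functions in a list of dwords. */
static size_t count_gpus(const uint32_t *class_revision, size_t n)
{
	size_t i, gpus = 0;

	assert(class_revision || n == 0);

	for (i = 0; i < n; i++)
		if (pci_base_class(class_revision[i]) == PCI_BASE_CLASS_DISPLAY)
			gpus++;

	return gpus;
}

/* Policy suggested in the quoted mail: use apple_set_os with >= 2 GPUs. */
static bool want_apple_set_os(const uint32_t *class_revision, size_t n)
{
	return count_gpus(class_revision, n) >= 2;
}
```

Matching on the base class alone means any display-class function is counted,
whatever its subclass. Whether an eGPU behind Thunderbolt is even visible to
the firmware's PCI enumeration at that point is an open question that this
sketch does not answer.
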
>
> The question is whether this is needed on *all* Apple products or only
> on newer ones. I suspect that the eGPU issue may be specific to
> recent products. Ideally we'd find someone with a Haswell or Ivy Bridge
> era Mac Mini and an eGPU who could verify whether apple_set_os is needed
> on older models as well.
>
> We could constrain apple_set_os to newer models by checking for
> presence of the T2 PCI device [106b:1802]. Alternatively, we could
> use the BIOS date (DMI_BIOS_DATE in SMBIOS data) to enforce a
> cut-off such that only machines with a recent BIOS use apple_set_os.
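
The T2 check mentioned above could be sketched as follows, again as plain C
over already-read IDs rather than real efistub code. struct pci_id, is_t2()
and has_t2() are hypothetical names, and T2_DEVICE_ID is a made-up macro name
for the 0x1802 device ID quoted above; only the [106b:1802] pair itself comes
from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_APPLE 0x106b /* same value as in pci_ids.h */
#define T2_DEVICE_ID        0x1802 /* hypothetical name for the ID above */

/* Vendor/Device ID pair as read from PCI config space offsets 0x00/0x02. */
struct pci_id {
	uint16_t vendor;
	uint16_t device;
};

/* True if the ID pair matches the T2 device [106b:1802]. */
static bool is_t2(struct pci_id id)
{
	return id.vendor == PCI_VENDOR_ID_APPLE && id.device == T2_DEVICE_ID;
}

/* Hypothetical helper: scan the collected IDs for a T2 device. */
static bool has_t2(const struct pci_id *ids, size_t n)
{
	size_t i;

	assert(ids || n == 0);

	for (i = 0; i < n; i++)
		if (is_t2(ids[i]))
			return true;

	return false;
}
```

Since the efistub already retrieves the Vendor ID and Device ID for every PCI
device, a check like this would presumably not need additional config space
reads, unlike the BIOS date alternative, which depends on SMBIOS data.
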
>
> Thanks,
>
> Lukas