RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window
From: Michael Kelley
Date: Sun Apr 05 2026 - 19:11:45 EST
From: Dexuan Cui <DECUI@xxxxxxxxxxxxx> Sent: Thursday, April 2, 2026 10:10 AM
>
> > From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> > Sent: Wednesday, January 21, 2026 11:11 PM
> > ...
> > From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Wednesday, January 21,
> > 2026 6:04 PM
> > >
> > > There has been a longstanding MMIO conflict between the pci_hyperv
> > > driver's config_window (see hv_allocate_config_window()) and the
> > > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically
> > > both get MMIO from the low MMIO range below 4GB; this is not an issue
> > > in the normal kernel since the VMBus driver reserves the framebuffer
> > > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram()
> > > can always get the reserved framebuffer MMIO; however, a Gen2 VM's
> > > kdump kernel fails to reserve the framebuffer MMIO in vmbus_reserve_fb()
> > > because the screen_info.lfb_base is zero in the kdump kernel:
> > > the screen_info is not initialized at all in the kdump kernel, because the
> > > EFI stub code, which initializes screen_info, doesn't run in the case of kdump.
> >
> > I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info
>
> Hi Michael, sorry for delaying the reply for so long! I think I now
> understand all the details.
>
> My earlier statement "the screen_info is not initialized at all in the kdump
> kernel" is not correct on x86, but I believe it's correct on ARM64. Please see
> my explanation below.
Sadly, I must agree. It's surprising, because it affects kexec scenarios that
don't include Hyper-V. On arm64 bare metal, if you kexec to a kernel configured
to run the efifb frame buffer driver, the driver won't load.
>
> > should be initialized in the kdump kernel by the code that loads the
> > kdump kernel into the reserved crash memory. See discussion in the commit
> > message for commit 304386373007.
> >
> > I wonder if commit a41e0ab394e4 broke the initialization of screen_info
> > in the kdump kernel. Or perhaps there is now a rev-lock between the kernel
> > with this commit and a new version of the user space kexec command.
>
> The commit
> a41e0ab394e4 ("sysfb: Replace screen_info with sysfb_primary_display")
> should be unrelated here.
Agreed.
>
> > There's a parameter to the kexec command that governs whether it
> > uses the kexec_file_load() system call or the kexec_load() system call.
> > I wonder if that parameter makes a difference in the problem described
> > for this patch.
> >
> > I can't immediately remember if, when I was working on commit
> > 304386373007, I tested kdump in a Gen 2 VM with an NVMe OS disk to
> > ensure that MMIO space was properly allocated to the frame buffer
> > driver (either hyperv_fb or hyperv_drm). I'm thinking I did, but tomorrow
> > I'll check for any definitive notes on that.
> >
> > Michael
Evidently, I did not fully test an arm64 VM, or I would have seen that
screen_info wasn't being populated for the kdump kernel.
>
> If vmbus_reserve_fb() in the kdump kernel fails to reserve the framebuffer
> MMIO range due to a Gen2 VM's screen_info.lfb_base being 0, the MMIO
> conflict between hyperv_fb/hyperv_drm and hv_pci happens -- this is
> especially an issue if hv_pci is built-in and hyperv_fb/hyperv_drm is built
> as modules. vmbus_reserve_fb() should always succeed for a Gen1 VM, since
> it can always get the framebuffer MMIO base from the legacy PCI graphics
> device, so we only need to discuss Gen2 VMs here.
Agreed.
>
> When kdump-tools loads the kdump kernel into memory, the tool can
> accept any of the 3 parameters (e.g. I got the below via "man kexec" in
> Ubuntu 24.04):
>
> -s (--kexec-file-syscall)
> Specify that the new KEXEC_FILE_LOAD syscall should be used exclusively.
>
> -c (--kexec-syscall)
> Specify that the old KEXEC_LOAD syscall should be used exclusively.
>
> -a (--kexec-syscall-auto)
> Try the new KEXEC_FILE_LOAD syscall first and when it is not supported or the kernel does not understand the supplied image
> fall back to the old KEXEC_LOAD interface.
>
> There is no one single interface that always works, so this is the default.
>
> KEXEC_FILE_LOAD is required on systems that use locked-down secure boot to verify the kernel signature. KEXEC_LOAD may be
> also disabled in the kernel configuration.
>
> KEXEC_LOAD is required for some kernel image formats and on architectures that do not implement KEXEC_FILE_LOAD.
>
> If none of the parameters are specified, the default may be -c, or -s
> or -a, depending on the distro and the version in use. We can run
>   strace -f kdump-config reload 2>&1 | egrep 'kexec_file_load|kexec_load'
> to tell which syscall is being used.
>
> Old distro versions seem to use KEXEC_LOAD by default, and new distro
> versions tend to use KEXEC_FILE_LOAD by default, especially when
> Secure Boot is enabled (e.g. see /usr/sbin/kdump-config: kdump_load()
> in Ubuntu).
Agreed. I think I had seen that previously.
>
> In Ubuntu, we can explicitly specify one of the parameters in
> "/etc/default/kdump-tools", e.g. KDUMP_KEXEC_ARGS="-c -d".
>
> The -d is for debugging. I found it very useful: when we run
> "kdump-config show" or "kdump-config reload", we get very useful
> debug info with -d.
>
> On x86-64, with -c:
> The kdump-tools gets the framebuffer's MMIO base using
> ioctl(fd, FBIOGET_FSCREENINFO, ....): see the end of the email for
> an example program; kdump-tools then uses the KEXEC_LOAD syscall
> to set up the screen_info.lfb_base for the kdump kernel.
Thanks. While redoing some experiments yesterday, I found a
similar program that I had written a year ago to dump the ioctl results.
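For readers following along, a minimal sketch of such a dump program is
below. It is not the program either of us wrote; it only uses the standard
FBIOGET_FSCREENINFO ioctl and struct fb_fix_screeninfo from <linux/fb.h>,
and the helper names read_fb_fix() and print_fb_fix() are made up for
illustration.

```c
#include <fcntl.h>
#include <linux/fb.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Read the fixed screen info of a framebuffer device node.
 * Returns 0 on success, -1 if the node cannot be opened or the ioctl fails.
 * (Hypothetical helper, named for this sketch only.)
 */
int read_fb_fix(const char *path, struct fb_fix_screeninfo *fix)
{
	int fd, ret;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	memset(fix, 0, sizeof(*fix));
	ret = ioctl(fd, FBIOGET_FSCREENINFO, fix);
	close(fd);

	return ret < 0 ? -1 : 0;
}

/* Print the fields that matter for the kdump discussion. */
void print_fb_fix(const struct fb_fix_screeninfo *fix)
{
	/* id is a fixed 16-byte array, so bound the print length. */
	printf("id         = %.16s\n", fix->id);
	printf("smem_start = 0x%lx\n", fix->smem_start);
	printf("smem_len   = 0x%x\n", fix->smem_len);
}
```

Calling read_fb_fix("/dev/fb0", &fix) and then print_fb_fix(&fix) shows the
driver id and the smem_start value that a -c load would have to work with.
With a driver that sets FBINFO_HIDE_SMEM_START, smem_start is expected to
read back as 0.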
>
> The function in kdump-tools that gets the framebuffer MMIO base
> is kexec/arch/i386/x86-linux-setup.c: setup_linux_vesafb():
> https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/kexec/arch/i386/x86-linux-setup.c?h=v2.0.32#n133
>
> Unluckily, setup_linux_vesafb() only recognizes the vesafb
> driver in Linux kernel ("VESA VGA") and the efifb driver ("EFI VGA").
> It looks like normally arch_options.reuse_video_type is always 0.
>
> This means the kdump kernel's screen_info.lfb_base is 0 if
> hyperv_fb or hyperv_drm loads. In the past, for an Ubuntu kernel
> with CONFIG_FB_EFI=y, our workaround was blacklisting
> hyperv_fb or hyperv_drm, so /dev/fb0 is backed by efifb, and
> the screen_info.lfb_base is correctly set for kdump.
Hmmm. This is worse than I thought for x86/x64. In fact, it means
a part of my commit message for 304386373007 is now wrong. I had
described everything as working when using the kexec_load() system
call because the FBIOGET_FSCREENINFO ioctl was returning a good
value for smem_start (at least with the hyperv_fb driver). But as you
point out further down, newer versions of the kexec user space program
are ignoring that smem_start value unless the driver is vesafb or efifb.
Was blacklisting hyperv_fb or hyperv_drm in the kdump kernel
a workaround we had promulgated in the past? My recollection
is vague. But no matter.
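To make the failure mode concrete, here is a simplified model of the
driver check described above. It is a sketch under assumptions, not the
kexec-tools code: kexec_reuses_fb() is a hypothetical name, and treating
a nonzero reuse_video_type as "accept any driver" is my reading of the
discussion rather than something verified against setup_linux_vesafb().

```c
#include <string.h>

/*
 * Simplified model: returns 1 if the framebuffer reported by the ioctl
 * would be propagated to the kdump kernel, 0 if it would be ignored.
 * fb_id is the driver id string from struct fb_fix_screeninfo.
 */
int kexec_reuses_fb(const char *fb_id, int reuse_video_type)
{
	/* The two driver ids the discussion says are recognized. */
	if (strcmp(fb_id, "VESA VGA") == 0)
		return 1;
	if (strcmp(fb_id, "EFI VGA") == 0)
		return 1;

	/* Assumption: any other driver is accepted only on explicit request. */
	return reuse_video_type != 0;
}
```

In this model an id of "EFI VGA" is accepted, while any other id (for
example an illustrative "hyperv_fb") with reuse_video_type == 0 is
rejected, which matches lfb_base ending up as 0 in the kdump kernel.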
>
> However, now CONFIG_FB_EFI is not set in recent Ubuntu kernels:
> $ egrep 'CONFIG_FB_EFI|CONFIG_SYSFB|CONFIG_SYSFB_SIMPLEFB|CONFIG_DRM_SIMPLEDRM|CONFIG_DRM_HYPERV' /boot/config-6.8.0-1051-azure
> CONFIG_SYSFB=y
> CONFIG_SYSFB_SIMPLEFB=y
> CONFIG_DRM_SIMPLEDRM=y
> CONFIG_DRM_HYPERV=m
> # CONFIG_FB_EFI is not set
>
> So, with Ubuntu 22.04/24.04, -c can't avoid the MMIO conflict
> for Gen2 x86-64 VMs now, even if we blacklist hyperv_fb/hyperv_drm.
> Note: Ubuntu 20.04 uses an old version of the kdump-tools, so
> the statement is different there (see the later discussion below).
>
> hyperv_fb has been removed in the mainline kernel: see
> commit 40227f2efcfb ("fbdev: hyperv_fb: Remove hyperv_fb driver")
> so we no longer need to worry about it.
>
> Even if we modify setup_linux_vesafb() to support hyperv_drm,
> it still won't work, because the MMIO base is hidden by commit
> da6c7707caf3 ("fbdev: Add FBINFO_HIDE_SMEM_START flag")
Agreed.
>
> On x86-64, with -s:
> The KEXEC_FILE_LOAD syscall sets the kdump kernel's
> screen_info.lfb_base in the kernel: see
>
> "arch/x86/kernel/kexec-bzimage64.c"
> bzImage64_load
> setup_boot_parameters
> memcpy(&params->screen_info, &screen_info, sizeof(struct screen_info));
>
> so, as long as the first kernel's hyperv_drm doesn't relocate the
> MMIO base, kdump should work fine; if the MMIO base is relocated,
> currently hyperv_drm doesn't update the screen_info.lfb_base,
> so the kdump's efifb driver and hv_pci driver won't work. Normally
> hyperv_drm doesn't relocate the MMIO base, unless the user
> specifies a very high resolution and the required MMIO size
> exceeds the default 8MB reserved by vmbus_reserve_fb() -- let's
> ignore that scenario for now.
>
Agreed.
>
> On ARM64, with -c:
> The kdump-tools doesn't even open /dev/fb0 (we can confirm this by using
> strace or bpftrace), so the kdump kernel's screen_info.lfb_base is always 0.
Agreed.
>
> On ARM64, with -s:
> "arch/arm64/kernel/kexec_image.c": image_load() doesn't set the
> params->screen_info, so the kdump kernel's screen_info.lfb_base is always 0.
Agreed.
>
> To recap, with a recent mainline kernel (or the linux-azure kernels) that
> has 304386373007, my observation on Ubuntu 22.04 and 24.04 is:
> on x86-64, -c fails, but -s works.
> on ARM64, -c fails, and -s also fails.
>
> Note: the kdump-tools v2.0.18 in Ubuntu 20.04 doesn't have this commit:
> https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/commit/?id=fb5a8792e6e4ee7de7ae3e06d193ea5beaaececc
> (Note the "return 0;" in setup_linux_vesafb())
> so, on x86-64, -c also works in Ubuntu 20.04, if hyperv_fb is used
> (-c still doesn't work if hyperv_drm is used due to da6c7707caf3).
Ah. That explains why I thought x86/x64 kdump was working with
hyperv_fb when working on commit 304386373007. I was testing with
kexec user space utility v2.0.18, which *does* propagate smem_start
from the ioctl to the loaded kdump image.
>
> With this patch
> "PCI: hv: Allocate MMIO from above 4GB for the config window",
> both -c and -s work on x86-64 and ARM64 due to no MMIO conflict,
> as long as there are no 32-bit PCI BARs (which should be true on
> Azure and on modern hosts.)
>
> With the patch, even if hyperv_drm relocates the framebuffer MMIO
> base, there would still be no MMIO conflict because typically hyperv_drm
> gets its MMIO from below 4GB: it seems like vmbus_walk_resources()
> always finds the low MMIO range first and adds it to the beginning of the
> MMIO resources "hyperv_mmio", so presumably hyperv_drm would
> get MMIO from the low MMIO range.
>
> I'll update the commit message, add Matthew's and Krister's
> Tested-by's and post v2.
See my comments on v2 of your patch. I have a thought for a
slightly different approach to solve the problem.
Michael
>
> Thanks,
> Dexuan