RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config window
From: Dexuan Cui
Date: Thu Apr 02 2026 - 13:20:13 EST
> From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> Sent: Wednesday, January 21, 2026 11:11 PM
> ...
> From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Wednesday, January 21,
> 2026 6:04 PM
> >
> > There has been a longstanding MMIO conflict between the pci_hyperv
> > driver's config_window (see hv_allocate_config_window()) and the
> > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically
> > both get MMIO from the low MMIO range below 4GB; this is not an issue
> > in the normal kernel since the VMBus driver reserves the framebuffer
> > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram()
> > can always get the reserved framebuffer MMIO; however, a Gen2 VM's
> > kdump kernel fails to reserve the framebuffer MMIO in vmbus_reserve_fb()
> > because the screen_info.lfb_base is zero in the kdump kernel:
> > the screen_info is not initialized at all in the kdump kernel, because the
> > EFI stub code, which initializes screen_info, doesn't run in the case of kdump.
>
> I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info
Hi Michael, sorry for the long delay in replying! I think I now
understand all the details.
My earlier statement "the screen_info is not initialized at all in the kdump
kernel" is not correct on x86, but I believe it's correct on ARM64. Please see
my explanation below.
> should be initialized in the kdump kernel by the code that loads the
> kdump kernel into the reserved crash memory. See discussion in the commit
> message for commit 304386373007.
>
> I wonder if commit a41e0ab394e4 broke the initialization of screen_info
> in the kdump kernel. Or perhaps there is now a rev-lock between the kernel
> with this commit and a new version of the user space kexec command.
The commit
a41e0ab394e4 ("sysfb: Replace screen_info with sysfb_primary_display")
should be unrelated here.
> There's a parameter to the kexec() command that governs whether it
> uses the kexec_file_load() system call or the kexec_load() system call.
> I wonder if that parameter makes a difference in the problem described
> for this patch.
>
> I can't immediately remember if, when I was working on commit
> 304386373007, I tested kdump in a Gen 2 VM with an NVMe OS disk to
> ensure that MMIO space was properly allocated to the frame buffer
> driver (either hyperv_fb or hyperv_drm). I'm thinking I did, but tomorrow
> I'll check for any definitive notes on that.
>
> Michael
If vmbus_reserve_fb() in the kdump kernel fails to reserve the framebuffer
MMIO range due to a Gen2 VM's screen_info.lfb_base being 0, the MMIO
conflict between hyperv_fb/hyperv_drm and hv_pci happens -- this is
especially an issue if hv_pci is built-in and hyperv_fb/hyperv_drm is built
as modules. vmbus_reserve_fb() should always succeed for a Gen1 VM, since
it can always get the framebuffer MMIO base from the legacy PCI graphics
device, so we only need to discuss Gen2 VMs here.
When kdump-tools loads the kdump kernel into memory, the tool can
accept any of the 3 parameters (e.g. I got the below via "man kexec" in
Ubuntu 24.04):
-s (--kexec-file-syscall)
Specify that the new KEXEC_FILE_LOAD syscall should be used exclusively.
-c (--kexec-syscall)
Specify that the old KEXEC_LOAD syscall should be used exclusively.
-a (--kexec-syscall-auto)
Try the new KEXEC_FILE_LOAD syscall first and when it is not supported or the kernel does not understand the supplied
image fall back to the old KEXEC_LOAD interface.
There is no one single interface that always works, so this is the default.
KEXEC_FILE_LOAD is required on systems that use locked-down secure boot to verify the kernel signature. KEXEC_LOAD may be
also disabled in the kernel configuration.
KEXEC_LOAD is required for some kernel image formats and on architectures that do not implement KEXEC_FILE_LOAD.
If none of the parameters are specified, the default may be -c, or -s
or -a, depending on the distro and the version in use. We can run
strace -f kdump-config reload 2>&1 | egrep 'kexec_file_load|kexec_load'
to tell which syscall is being used.
Old distro versions seem to use KEXEC_LOAD by default, and new distro
versions tend to use KEXEC_FILE_LOAD by default, especially when
Secure Boot is enabled (e.g. see /usr/sbin/kdump-config: kdump_load()
in Ubuntu).
In Ubuntu, we can explicitly specify one of the parameters in
"/etc/default/kdump-tools", e.g. KDUMP_KEXEC_ARGS="-c -d".
The -d is for debugging. I found it very useful: when we run
"kdump-config show" or "kdump-config reload", we get very useful
debug info with -d.
On x86-64, with -c:
The kdump-tools gets the framebuffer's MMIO base using
ioctl(fd, FBIOGET_FSCREENINFO, ....): see the end of the email for
an example program; kdump-tools then uses the KEXEC_LOAD syscall
to set up the screen_info.lfb_base for the kdump kernel.
The function in kdump-tools that gets the framebuffer MMIO base
is kexec/arch/i386/x86-linux-setup.c: setup_linux_vesafb():
https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/kexec/arch/i386/x86-linux-setup.c?h=v2.0.32#n133
Unluckily, setup_linux_vesafb() only recognizes the vesafb
driver in Linux kernel ("VESA VGA") and the efifb driver ("EFI VGA").
It looks like normally arch_options.reuse_video_type is always 0.
This means the kdump kernel's screen_info.lfb_base is 0, if
hyperv_fb or hyperv_drm loads. In the past, for an Ubuntu kernel
with CONFIG_FB_EFI=y, our workaround was to blacklist
hyperv_fb or hyperv_drm, so that /dev/fb0 is backed by efifb and
the screen_info.lfb_base is correctly set for kdump.
However, now CONFIG_FB_EFI is not set in recent Ubuntu kernels:
$ egrep 'CONFIG_FB_EFI|CONFIG_SYSFB|CONFIG_SYSFB_SIMPLEFB|CONFIG_DRM_SIMPLEDRM|CONFIG_DRM_HYPERV' /boot/config-6.8.0-1051-azure
CONFIG_SYSFB=y
CONFIG_SYSFB_SIMPLEFB=y
CONFIG_DRM_SIMPLEDRM=y
CONFIG_DRM_HYPERV=m
# CONFIG_FB_EFI is not set
So, with Ubuntu 22.04/24.04, -c can't avoid the MMIO conflict
for Gen2 x86-64 VMs now, even if we blacklist hyperv_fb/hyperv_drm.
Note: Ubuntu 20.04 uses an old version of the kdump-tools, so
the statement is different there (see the later discussion below).
hyperv_fb has been removed in the mainline kernel: see
commit 40227f2efcfb ("fbdev: hyperv_fb: Remove hyperv_fb driver")
so we no longer need to worry about it.
Even if we modify setup_linux_vesafb() to support hyperv_drm,
it still won't work, because the MMIO base is hidden by commit
da6c7707caf3 ("fbdev: Add FBINFO_HIDE_SMEM_START flag")
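As a rough illustration of the -c behavior described above, here is a small sketch. It is not the kexec-tools code: the helper names fb_id_is_recognized() and lfb_base_for_kdump() are hypothetical, and the only facts taken from the discussion are the two recognized id strings ("VESA VGA" and "EFI VGA") and that any other driver ends up with lfb_base == 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from kexec-tools): returns 1 if the fbdev
 * driver id is one of the two strings that setup_linux_vesafb() is
 * described as recognizing, 0 otherwise.
 */
static int fb_id_is_recognized(const char *id)
{
    static const char *const known[] = { "VESA VGA", "EFI VGA" };
    size_t i;

    assert(id != NULL);
    for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
        if (strcmp(id, known[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Toy model of the outcome for the kdump kernel: the framebuffer base
 * is passed on only for a recognized driver; otherwise lfb_base is 0.
 */
static unsigned long long lfb_base_for_kdump(const char *id,
                                             unsigned long long smem_start)
{
    return fb_id_is_recognized(id) ? smem_start : 0;
}
```

For example, lfb_base_for_kdump("EFI VGA", 0xf8000000ULL) yields 0xf8000000, while any other id string (the "hyperv_fb" id and the address here are just example values) yields 0, which is the case where vmbus_reserve_fb() has nothing to reserve.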
On x86-64, with -s:
The KEXEC_FILE_LOAD syscall sets the kdump kernel's
screen_info.lfb_base in the kernel: see
"arch/x86/kernel/kexec-bzimage64.c"
  bzImage64_load
    setup_boot_parameters
      memcpy(&params->screen_info, &screen_info, sizeof(struct screen_info));
so, as long as the first kernel's hyperv_drm doesn't relocate the
MMIO base, kdump should work fine; if the MMIO base is relocated,
currently hyperv_drm doesn't update the screen_info.lfb_base,
so the kdump's efifb driver and hv_pci driver won't work. Normally
hyperv_drm doesn't relocate the MMIO base, unless the user
specifies a very high resolution and the required MMIO size
exceeds the default 8MB reserved by vmbus_reserve_fb() -- let's
ignore that scenario for now.
On ARM64, with -c:
The kdump-tools doesn't even open /dev/fb0 (we can confirm this by using
strace or bpftrace), so the kdump kernel's screen_info.lfb_base is always 0.
On ARM64, with -s:
"arch/arm64/kernel/kexec_image.c": image_load() doesn't set the
params->screen_info, so the kdump kernel's screen_info.lfb_base is always 0.
To recap, with a recent mainline kernel (or the linux-azure kernels) that
has 304386373007, my observation on Ubuntu 22.04 and 24.04 is:
on x86-64, -c fails, but -s works.
on ARM64, -c fails, and -s also fails.
Note: the kdump-tools v2.0.18 in Ubuntu 20.04 doesn't have this commit:
https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/commit/?id=fb5a8792e6e4ee7de7ae3e06d193ea5beaaececc
(Note the "return 0;" in setup_linux_vesafb())
so, on x86-64, -c also works in Ubuntu 20.04, if hyperv_fb is used
(-c still doesn't work if hyperv_drm is used due to da6c7707caf3).
With this patch
"PCI: hv: Allocate MMIO from above 4GB for the config window",
both -c and -s work on x86-64 and ARM64 due to no MMIO conflict,
as long as there are no 32-bit PCI BARs (which should be true on
Azure and on modern hosts.)
With the patch, even if hyperv_drm relocates the framebuffer MMIO
base, there would still be no MMIO conflict because typically hyperv_drm
gets its MMIO from below 4GB: it seems like vmbus_walk_resources()
always finds the low MMIO range first and adds it to the beginning of the
MMIO resources "hyperv_mmio", so presumably hyperv_drm would
get MMIO from the low MMIO range.
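To make the "no MMIO conflict" argument concrete, here is a toy model of the range check. It is only a sketch: the function names are hypothetical, and the addresses and sizes in the usage note are made-up example values, not what any particular VM reports.

```c
#include <assert.h>
#include <stdint.h>

#define FOUR_GB (1ULL << 32)

/*
 * Returns 1 if the half-open ranges [a_start, a_start + a_len) and
 * [b_start, b_start + b_len) share at least one byte, 0 otherwise.
 * Assumes the ranges do not wrap around the 64-bit address space.
 */
static int mmio_ranges_overlap(uint64_t a_start, uint64_t a_len,
                               uint64_t b_start, uint64_t b_len)
{
    assert(a_len > 0 && b_len > 0);
    return a_start < b_start + b_len && b_start < a_start + a_len;
}

/* Returns 1 if the range starts at or above the 4GB boundary. */
static int mmio_is_above_4gb(uint64_t start)
{
    return start >= FOUR_GB;
}
```

With an 8MB framebuffer at the example address 0xf8000000, a config window allocated at the same low address overlaps it, while a window placed at 0x100000000 (above 4GB) cannot overlap any framebuffer range that lies entirely below 4GB.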
I'll update the commit message, add Matthew's and Krister's
Tested-by's and post v2.
Thanks,
Dexuan
//Print the info of the frame buffer for /dev/fb0:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fb.h>
static void print_bitfield(const char *name, const struct fb_bitfield *bf) {
    printf("%s:\n", name);
    printf(" offset : %u\n", bf->offset);
    printf(" length : %u\n", bf->length);
    printf(" msb_right : %u\n", bf->msb_right);
}
static void print_fix_screeninfo(const struct fb_fix_screeninfo *fix) {
    printf("struct fb_fix_screeninfo:\n");
    printf(" id : %.16s\n", fix->id);
    printf(" smem_start : 0x%lx\n", fix->smem_start);
    printf(" smem_len : %u\n", fix->smem_len);
    printf(" type : %u\n", fix->type);
    printf(" type_aux : %u\n", fix->type_aux);
    printf(" visual : %u\n", fix->visual);
    printf(" xpanstep : %u\n", fix->xpanstep);
    printf(" ypanstep : %u\n", fix->ypanstep);
    printf(" ywrapstep : %u\n", fix->ywrapstep);
    printf(" line_length : %u\n", fix->line_length);
    printf(" mmio_start : %lu\n", fix->mmio_start);
    printf(" mmio_len : %u\n", fix->mmio_len);
    printf(" accel : %u\n", fix->accel);
    printf(" capabilities : %u\n", fix->capabilities);
    printf(" reserved[0] : %u\n", fix->reserved[0]);
    printf(" reserved[1] : %u\n", fix->reserved[1]);
}
static void print_var_screeninfo(const struct fb_var_screeninfo *var) {
    printf("struct fb_var_screeninfo:\n");
    printf(" xres : %u\n", var->xres);
    printf(" yres : %u\n", var->yres);
    printf(" xres_virtual : %u\n", var->xres_virtual);
    printf(" yres_virtual : %u\n", var->yres_virtual);
    printf(" xoffset : %u\n", var->xoffset);
    printf(" yoffset : %u\n", var->yoffset);
    printf(" bits_per_pixel: %u\n", var->bits_per_pixel);
    printf(" grayscale : %u\n", var->grayscale);
    print_bitfield(" red", &var->red);
    print_bitfield(" green", &var->green);
    print_bitfield(" blue", &var->blue);
    print_bitfield(" transp", &var->transp);
    printf(" nonstd : %u\n", var->nonstd);
    printf(" activate : %u\n", var->activate);
    printf(" height : %u\n", var->height);
    printf(" width : %u\n", var->width);
    printf(" accel_flags : %u\n", var->accel_flags);
    printf(" pixclock : %u\n", var->pixclock);
    printf(" left_margin : %u\n", var->left_margin);
    printf(" right_margin : %u\n", var->right_margin);
    printf(" upper_margin : %u\n", var->upper_margin);
    printf(" lower_margin : %u\n", var->lower_margin);
    printf(" hsync_len : %u\n", var->hsync_len);
    printf(" vsync_len : %u\n", var->vsync_len);
    printf(" sync : %u\n", var->sync);
    printf(" vmode : %u\n", var->vmode);
    printf(" rotate : %u\n", var->rotate);
    printf(" colorspace : %u\n", var->colorspace);
    printf(" reserved[0] : %u\n", var->reserved[0]);
    printf(" reserved[1] : %u\n", var->reserved[1]);
    printf(" reserved[2] : %u\n", var->reserved[2]);
    printf(" reserved[3] : %u\n", var->reserved[3]);
}
int main(void) {
    int fd;
    struct fb_fix_screeninfo fix;
    struct fb_var_screeninfo var;

    /* Open the first framebuffer device read-only. */
    fd = open("/dev/fb0", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return EXIT_FAILURE;
    }
    /* Fixed info: includes the driver id and smem_start (the MMIO base). */
    if (ioctl(fd, FBIOGET_FSCREENINFO, &fix) == -1) {
        perror("ioctl(FBIOGET_FSCREENINFO)");
        close(fd);
        return EXIT_FAILURE;
    }
    /* Variable info: resolution, pixel format, timings. */
    if (ioctl(fd, FBIOGET_VSCREENINFO, &var) == -1) {
        perror("ioctl(FBIOGET_VSCREENINFO)");
        close(fd);
        return EXIT_FAILURE;
    }
    print_fix_screeninfo(&fix);
    printf("\n");
    print_var_screeninfo(&var);
    close(fd);
    return EXIT_SUCCESS;
}