Re: [PATCH v3 3/3] vfio/nvgrace-gpu: Check the HBM training and C2C link status
From: Ankit Agrawal
Date: Fri Jan 17 2025 - 16:14:05 EST
>> > We're accessing device memory here but afaict the memory enable bit of
>> > the command register is in an indeterminate state. What happens if you
>> > use setpci to clear the memory enable bit or 'echo 0 > enable' before
>> > binding the driver? Thanks,
>> >
>> > Alex
>>
>> Hi Alex, sorry I didn't understand how we are accessing device memory here if
>> the C2C_LINK_BAR0_OFFSET and HBM_TRAINING_BAR0_OFFSET are BAR0 regs.
>> But anyways, I tried 'echo 0 > <sysfs_path>/enable' before device bind. I am not
>> observing any issue and the bind goes through.
>>
>> Or am I missing something?
>
> BAR0 is what I'm referring to as device memory. We cannot access
> registers in BAR0 unless the memory space enable bit of the command
> register is set. The nvgrace-gpu driver makes no effort to enable this
> and I don't think the PCI core does before probe either. Disabling
> through sysfs will only disable if it was previously enabled, so
> possibly that test was invalid. Please try with setpci:
>
> # Read command register
> $ setpci -s xxxx:xx:xx.x COMMAND
> # Clear memory enable
> $ setpci -s xxxx:xx:xx.x COMMAND=0:2
> # Re-read command register
> $ setpci -s xxxx:xx:xx.x COMMAND
>
> Probe the driver here, now that the memory enable bit should read back
> as unset. Thanks,
>
> Alex
Ok, yeah. I cleared the memory enable bit through setpci, and the probe now
fails with -ETIME. Should we check whether memory space is disabled and return
-EIO in that case, to differentiate it from a real timeout?
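
For illustration, the ordering proposed above could look roughly like the
user-space sketch below. This is not the driver code: wait_device_ready(),
ready_fn and ready_after() are made-up names, and the command register value
is passed in directly (in the kernel it would presumably come from
pci_read_config_word() on PCI_COMMAND). The only fixed fact used is that
memory space enable is bit 1 of the command register, the same bit that
'setpci ... COMMAND=0:2' clears.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Memory space enable bit of the PCI command register (bit 1). */
#define CMD_MEMORY_ENABLE 0x2

/* Hypothetical callback: returns true once the device reports ready. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Hypothetical helper: fail fast with -EIO when BAR0 cannot be decoded,
 * and keep -ETIME for the case where polling really ran out of attempts.
 */
static int wait_device_ready(uint16_t cmd, ready_fn ready, void *ctx,
			     int max_polls)
{
	assert(ready != NULL && max_polls > 0);

	/* Memory decode is off: register reads in BAR0 are meaningless. */
	if (!(cmd & CMD_MEMORY_ENABLE))
		return -EIO;

	for (int i = 0; i < max_polls; i++) {
		if (ready(ctx))
			return 0;
	}

	return -ETIME;
}

/* Example callback: reports ready after the poll count stored in ctx. */
static bool ready_after(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}
```

With this ordering the status callback is never invoked while memory decode
is disabled, so -ETIME is only returned when the registers were actually
polled and the device never reported ready.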