Re: ipmi_si fails to get BMC ID
From: Chris Chiu
Date: Tue Feb 20 2018 - 12:53:04 EST
On Thu, Feb 15, 2018 at 1:17 AM, Corey Minyard <minyard@xxxxxxx> wrote:
> I'm removing Greg and Arnd from the email, I don't think this requires their
> participation.
>
>
>
>
> On 02/13/2018 08:44 PM, Chris Chiu wrote:
>>
>> On Fri, Feb 9, 2018 at 9:34 PM, Corey Minyard <minyard@xxxxxxx> wrote:
>>>
>>> On 02/08/2018 09:09 PM, Chris Chiu wrote:
>>>>
>>>> On Thu, Feb 8, 2018 at 11:53 PM, Corey Minyard <minyard@xxxxxxx> wrote:
>>>>>
>>>>> On 02/07/2018 09:01 PM, Chris Chiu wrote:
>>>>>>
>>>>>> Hi,
>>>>>> We are working with a new desktop, the Acer Veriton Z4640G, and it
>>>>>> fails to enter S3 suspend with kernel 4.14 and even with the latest
>>>>>> 4.15+. Here's the kernel log:
>>>>>> https://gist.github.com/mschiu77/76888f1fd4eb56aa8959d76759a912bb.
>>>>>
>>>>>
>>>>> This is a little strange; nobody has reported this before. Can you
>>>>> reproduce this at will, or was it a one-time thing?
>>>>
>>>> It can be reproduced on each reboot.
>>>>>
>>>>> Does the IPMI driver always take this long to issue that error, even if
>>>>> you are not entering a sleep state?
>>>>>
>>>> Yep, it will always print "ipmi_si 0000:02:00.3: There appears to be
>>>> no BMC at this location" a few minutes after boot.
>>>>
>>>>> And it started with 4.14, and didn't occur before then, right?
>>>>>
>>>> I haven't tried a pre-4.14 kernel. I will do that and update here.
>>>
>>>
>>> Ah. It's probably still worth trying, but I doubt it will make any
>>> difference.
>>>
>>> Are you sure there is actually an IPMI BMC installed in this system? It
>>> might be a plug-in card that is not installed, but the interface still
>>> appears on the PCI bus. So there is enough hardware to go part-way through
>>> the motions of being an IPMI interface, but not enough to actually work.
>>>
>>> If there is a BMC there, do you know the register layout? The IPMI spec
>>> has an algorithm to go through to discover some of the parameters, and the
>>> driver follows it, but IMHO it's not really very good. I'll need to know
>>> the size of the registers, and the spacing between the registers.
>>>
>>> -corey
>>>
>>>
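To make the "size of the registers, and the spacing between the registers"
question concrete, here is a minimal sketch of how those two numbers decide
where the KCS registers land. It assumes the usual KCS layout from the IPMI
spec (data register at index 0, status/command register at index 1) and the
address rule base + index * spacing, which is how I understand ipmi_si to
address its registers. The base address and the candidate spacings are made-up
placeholders, not values read from this machine.

```python
# Sketch only: shows how register size and spacing map KCS registers to
# addresses. BASE and the candidate layouts are hypothetical placeholders.

# KCS register indices as laid out in the IPMI spec.
KCS_REGISTERS = {"data_in_out": 0, "status_command": 1}


def register_addresses(base, regspacing):
    """Return {register name: address} using base + index * regspacing."""
    return {name: base + index * regspacing
            for name, index in KCS_REGISTERS.items()}


def check_layout(regsize, regspacing):
    """Reject layouts where neighbouring registers would overlap."""
    if regsize not in (1, 2, 4, 8):
        raise ValueError("unsupported register size: %d" % regsize)
    if regspacing < regsize:
        raise ValueError("spacing smaller than register size")


if __name__ == "__main__":
    BASE = 0x1000  # hypothetical BAR address
    for regsize, regspacing in [(1, 1), (1, 4), (4, 4)]:
        check_layout(regsize, regspacing)
        addrs = register_addresses(BASE, regspacing)
        print("size=%d spacing=%d" % (regsize, regspacing),
              {name: hex(addr) for name, addr in addrs.items()})
```

If the guessed spacing is wrong, the driver ends up polling an address that is
not the status register at all, which would be consistent with an interface
that "kind of works" but never answers. If I remember the driver documentation
correctly, ipmi_si takes regspacings= and regsizes= module parameters for
trying such layouts by hand, but please check that against the kernel
documentation before relying on it.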
>> Sorry for the late response; it's close to Chinese New Year.
>> I can get the IPMI device working on Windows with this driver:
>>
>> https://www.drivermax.com/Realtek-Virtual-IPMI-Realtek-PCI-VEN-10EC-DEV-816C-1_0_0718_2013-2013-07-18-509795-driver.htm
>> Then you will see the device (highlighted) in the control panel as
>> follows:
>> https://pasteboard.co/H7xm3fJ.png
>
>
> Hmm, Windows has a built-in IPMI driver. I wonder why they aren't using
> that. And I don't know what "Virtual IPMI" means.
>
>> I don't know how to get the register layout you need. I can only take a
>> picture of the contents of the PCI resources:
>> https://pasteboard.co/H7xnhz0.png
>>
>> The contents of BAR1, BAR3, and BAR5 are all 0xff. Can you point out
>> where the useful information might be, so I can try to dump it for you?
>
>
> This whole thing sounds like they have created a non-standard IPMI interface
> and put it on the PCI bus like a standard one. Unless we can get
> documentation for this, there's not much I can do but blacklist it.
>
> Do you have any ties with Realtek? I can't find anything on their web site
> related to IPMI.
>
> -corey
>
>
I see. I'll try to reach the contact person for this at Realtek to see
if there's a datasheet or any other documentation, and get back here.
Chris
>>
>> Chris
>>
>>>>> There's a bug in the PCI utils database, I submitted a report a while
>>>>> ago. This is a KCS, not a SMIC interface.
>>>>>
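For reference on the SMIC vs. KCS naming: the interface type of a PCI IPMI
device is carried in the programming-interface byte of its class code. The
sketch below decodes it using the values I recall from the kernel's
include/linux/pci_ids.h (0x0c0700 SMIC, 0x0c0701 KCS, 0x0c0702 BT). The
function name is made up for illustration, and nothing here claims what value
this particular device reports.

```python
# Sketch only: decode the IPMI interface type from a 24-bit PCI class code.
# Values follow PCI_CLASS_SERIAL_IPMI_* in the kernel's pci_ids.h as I
# remember them; decode_ipmi_class() is a hypothetical helper name.

PCI_CLASS_SERIAL_IPMI = 0x0C07  # base class 0x0c, subclass 0x07
IPMI_PROG_IF = {0x00: "SMIC", 0x01: "KCS", 0x02: "BT"}


def decode_ipmi_class(class_code):
    """Return the interface name, "unknown", or None if not an IPMI device."""
    if (class_code >> 8) != PCI_CLASS_SERIAL_IPMI:
        return None
    return IPMI_PROG_IF.get(class_code & 0xFF, "unknown")


if __name__ == "__main__":
    for code in (0x0C0700, 0x0C0701, 0x0C0702):
        print(hex(code), decode_ipmi_class(code))
```

On a running system the raw class code can be read from the device's sysfs
"class" file (for example under /sys/bus/pci/devices/0000:02:00.3/) and passed
in as int(text, 16), which sidesteps whatever name the lspci database prints.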
>>>>> It looks like the driver is trying to detect that there is a device out
>>>>> there and there is something that kind of works, but doesn't work
>>>>> completely. The interface specific code was all split out into separate
>>>>> files in 4.14. It is possible the detection code got messed up in the
>>>>> process. Nothing jumps out looking at the code differences, and I know
>>>>> it works on some PCI machines.
>>>>>
>>>>> Assuming this is reproducible, can you send the output of a pre-4.14
>>>>> kernel? If that doesn't make it obvious I may have to have access to
>>>>> the machine itself.
>>>>>
>>>>> -corey
>>>>>
>>>>>
>>>> It's an All-in-One machine, so I think it would be difficult to ship.
>>>> I'll see what I can do. Thanks for the help.
>>>>
>>>> Chris
>>>>
>>>>>> As you can see, it is due to "ipmi_probe+0x430/0x430 [ipmi_si]".
>>>>>> Once the message "ipmi_si 0000:02:00.3: There appears to be no BMC at
>>>>>> this location" shows up, the machine can suspend without problems,
>>>>>> although it takes around 3 minutes to get there. The IPMI device is
>>>>>> probed from PCI, and here's the output of lspci:
>>>>>> https://gist.github.com/mschiu77/33f0372be41670d8a69c97e64f833087. The
>>>>>> IPMI device is "02:00.3 IPMI SMIC interface [0c07]". We are stuck here
>>>>>> because we don't know why try_get_dev_id() in ipmi_si_intf.c takes so
>>>>>> long. Any suggestions to help us move forward?
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>> Chris
>>>>>
>>>>>
>>>>>
>