RE: Linux logs error: `mei_me 0000:00:16.0: cl:host=04 me=00 is not connected`
From: Usyskin, Alexander
Date: Mon Apr 01 2024 - 02:54:56 EST
> -----Original Message-----
> From: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
> Sent: Saturday, March 30, 2024 13:56
> To: Winkler, Tomas <tomas.winkler@xxxxxxxxx>; Usyskin, Alexander
> <alexander.usyskin@xxxxxxxxx>
> Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>
> Subject: Re: Linux logs error: `mei_me 0000:00:16.0: cl:host=04 me=00 is not
> connected`
>
>
> Dear Tomas,
>
>
> Thank you for your quick response.
>
> On 30.03.24 at 11:50, Winkler, Tomas wrote:
> >
> >> -----Original Message-----
> >> From: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
> >> Sent: Friday, March 29, 2024 12:49 PM
>
> […]
>
> >> On a Dell XPS 13 9360/0596KF, BIOS 2.21.0 06/02/2022 with Debian
> >> sid/unstable and self-built Linux 6.9-rc1+ with one patch on top [1] and
> >> KASAN enabled.
> >>
> >> $ git log --no-decorate --oneline -2 a2ce022afcbb
> >> a2ce022afcbb [PATCH] kbuild: Disable KCSAN for autogenerated *.mod.c intermediaries
> >> 8d025e2092e2 Merge tag 'erofs-for-6.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
> >>
> >> After several ACPI S3 (deep) suspend and resume cycles, this morning I
> >> noticed the error below:
> >>
> >> [29357.177635] mei_me 0000:00:16.0: cl:host=04 me=00 is not connected
> >>
> >> This seems to be logged from `mei_write()` in `drivers/misc/mei/main.c`.
> >>
> >>     if (!mei_cl_is_connected(cl)) {
> >>             cl_err(dev, cl, "is not connected");
> >>             rets = -ENODEV;
> >>             goto out;
> >>     }
> >>
> >> with `drivers/misc/mei/client.h` containing:
> >>
> >>     /**
> >>      * mei_cl_is_connected - host client is connected
> >>      *
> >>      * @cl: host client
> >>      *
> >>      * Return: true if the host client is connected
> >>      */
> >>     static inline bool mei_cl_is_connected(const struct mei_cl *cl)
> >>     {
> >>             return cl->state == MEI_FILE_CONNECTED;
> >>     }
> >>
> >> Unfortunately, I do not know why the ME needs to be written to, what was
> >> being written, or what the effect of this failure is.
> >>
> >> Could you please take a look at it?
> >
> > Looks like a timing issue between setting up HDCP by graphics and
> > device power management. I don't think this is really an issue if
> > it happens during power-cycle stress.
>
> Understood. Could this be because of the Kernel Address Sanitizer (KASAN)?
>
> > Anyway, we will look at that. Will you be able to provide more debug
> > information if we ask for it?
> Thank you. Yes, I can test patches. But so far I have only seen this
> once, so I am not sure how to reproduce it.
>
This print is in a code path that is executed from user space only.
It seems that some user-space application had a connection open before suspend
and tried to write after resume, but the driver closes all connections on suspend.
This is the normal flow; user space should reopen the handle and retry in this case.
I think the print can be demoted to debug level.
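
For illustration, the reopen-and-retry flow on the user-space side could look
roughly like the sketch below. It is only a sketch under assumptions: the
conn_ops hooks, write_with_reopen() and the fake_* helpers are hypothetical
names made up for this example and are not part of any MEI API. The fake
device merely simulates a connection that is dropped on suspend, so that the
code runs without MEI hardware. A real reopen hook would close the stale fd,
open the MEI character device again and reconnect to the firmware client
with IOCTL_MEI_CONNECT_CLIENT.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical hooks (not an MEI API) so the sketch runs without hardware. */
struct conn_ops {
	int (*do_reopen)(int old_fd);                           /* new fd or -1 */
	ssize_t (*do_write)(int fd, const void *buf, size_t len);
};

/*
 * Write once; if the write fails with ENODEV (connection dropped, e.g.
 * across suspend), reopen the handle and retry, at most max_retries times.
 * Any other error is returned to the caller unchanged.
 */
static ssize_t write_with_reopen(const struct conn_ops *ops, int *fd,
				 const void *buf, size_t len, int max_retries)
{
	for (int attempt = 0;; attempt++) {
		ssize_t n = ops->do_write(*fd, buf, len);

		if (n >= 0 || errno != ENODEV || attempt >= max_retries)
			return n;

		int new_fd = ops->do_reopen(*fd);
		if (new_fd < 0)
			return -1;
		*fd = new_fd;
	}
}

/*
 * Fake device for demonstration only: just the most recently opened fd is
 * "connected", and fake_suspend() invalidates it.
 */
static int fake_live_fd = -1, fake_next_fd = 3, fake_reopens;

static void fake_suspend(void)
{
	fake_live_fd = -1;
}

static int fake_reopen(int old_fd)
{
	(void)old_fd;           /* a real hook would close(old_fd) here */
	fake_reopens++;
	fake_live_fd = fake_next_fd++;
	return fake_live_fd;
}

static ssize_t fake_write(int fd, const void *buf, size_t len)
{
	(void)buf;
	if (fd != fake_live_fd) {
		errno = ENODEV;  /* same error code that mei_write() returns */
		return -1;
	}
	return (ssize_t)len;
}
```

With max_retries set to 0 the helper behaves like a plain write and hands the
ENODEV back to the caller, which may be preferable when the application wants
to decide for itself whether the message is still worth sending after resume.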
--
Alexander (Sasha) Usyskin
CSE FW Dev - Host SW
Intel Israel (74) Limited
>
> Kind regards,
>
> Paul