RE: [char-misc-next] mei: use kvmalloc for read buffer

From: Usyskin, Alexander
Date: Mon Oct 14 2024 - 10:44:00 EST



> -----Original Message-----
> From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Sent: Monday, October 14, 2024 4:25 PM
> To: Usyskin, Alexander <alexander.usyskin@xxxxxxxxx>
> Cc: Weil, Oren jer <oren.jer.weil@xxxxxxxxx>; Tomas Winkler
> <tomasw@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [char-misc-next] mei: use kvmalloc for read buffer
>
> On Mon, Oct 14, 2024 at 01:15:49PM +0000, Usyskin, Alexander wrote:
> > > -----Original Message-----
> > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > Sent: Sunday, October 13, 2024 6:08 PM
> > > To: Usyskin, Alexander <alexander.usyskin@xxxxxxxxx>
> > > Cc: Weil, Oren jer <oren.jer.weil@xxxxxxxxx>; Tomas Winkler
> > > <tomasw@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> > > Subject: Re: [char-misc-next] mei: use kvmalloc for read buffer
> > >
> > > On Sun, Oct 13, 2024 at 02:22:27PM +0000, Usyskin, Alexander wrote:
> > > > > -----Original Message-----
> > > > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > Sent: Sunday, October 13, 2024 3:14 PM
> > > > > To: Usyskin, Alexander <alexander.usyskin@xxxxxxxxx>
> > > > > Cc: Weil, Oren jer <oren.jer.weil@xxxxxxxxx>; Tomas Winkler
> > > > > <tomasw@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [char-misc-next] mei: use kvmalloc for read buffer
> > > > >
> > > > > On Sun, Oct 13, 2024 at 02:53:14PM +0300, Alexander Usyskin wrote:
> > > > > > Read buffer is allocated according to max message size,
> > > > > > reported by the firmware and may reach 64K in systems
> > > > > > with pxp client.
> > > > > > Contiguous 64k allocation may fail under memory pressure.
> > > > > > Read buffer is used as in-driver message storage and
> > > > > > not required to be contiguous.
> > > > > > Use kvmalloc to allow kernel to allocate non-contiguous
> > > > > > memory in this case.
> > > > > >
> > > > > > Signed-off-by: Alexander Usyskin <alexander.usyskin@xxxxxxxxx>
> > > > > > ---
> > > > > > drivers/misc/mei/client.c | 4 ++--
> > > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > What about this thread:
> > > > > > https://lore.kernel.org/all/20240813084542.2921300-1-rohiagar@xxxxxxxxxxxx/
> >
> > [1] https://lore.kernel.org/all/20240813084542.2921300-1-rohiagar@xxxxxxxxxxxx/
>
> Yes, it's a problem, I don't understand.
>
> > > > >
> > > > > No attribution for the reporter? Does it solve their problem?
> > > > >
> > > > This patch is a result from non-public bug report on ChromeOS.
> > >
> > > Then make that bug report public as it was discussed in public already :)
> > >
> > Unfortunately, it is not my call.
> > For now, I'll anchor this on [1]
> >
> > > > > > Also, where is this memory pressure coming from, what is the root
> > > > > > cause and what commit does this fix? Stable backports needed?
> > > > > > Anything else?
> > > > >
> > > > The ChromeOS is extremely short on memory by design and can trigger
> > > > this situation very easily.
> > >
> > > So normal allocations are failing? That feels wrong, what caused this?
> >
> > 64K is order 4 allocation and may fail according to [1].
>
> And what changed to cause this to suddenly be 64k? And why can't we
> allocate 64k at this point in time now?
>
> > > > > I do not think that this patch fixes any commit - the problematic
> > > > > code exists from the earliest versions of this driver.
> > > > As this problem reproduced only on ChromeOS I believe that no need
> > > > in wide backport, the ChromeOS can cherry-pick the patch.
> > > > From your experience, is this the right strategy?
> > >
> > > No.
> >
> > Sure, I'll use
> > Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.")
> > where the first time such buffer allocated and add stable here in v2.
>
> So the problem has been there for years? Why is it just now showing up?
>

I suppose it is a combination of three things: fairly new FW that requests
a 64K buffer for the content-protection case, an underpowered Chromebook,
and ChromeOS running the content-protection flow.
All three conditions must be met to trigger this failure.
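
For reference, the "order 4" figure quoted above follows directly from the
page size. This is a minimal user-space sketch, assuming 4 KiB pages; the
helper name order_for_size() and the PAGE_SIZE_ASSUMED constant are
hypothetical and only imitate what the kernel's get_order() computes, they
are not taken from the driver or the patch.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages; other configurations use larger pages. */
#define PAGE_SIZE_ASSUMED 4096UL

/*
 * Hypothetical helper: return the smallest order such that
 * (PAGE_SIZE_ASSUMED << order) >= size, i.e. how many times the
 * page size must be doubled to cover a contiguous buffer of 'size' bytes.
 */
unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;
	size_t span = PAGE_SIZE_ASSUMED;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under that assumption, order_for_size(64 * 1024) gives 4: the buffer needs
16 physically contiguous pages, which is the kind of request that can fail
on a fragmented, memory-constrained system.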

> thanks,
>
> greg k-h

--
Thanks,
Sasha