Re: [PATCH] mm: be more informative in OOM task list

From: Rodrigo Freire
Date: Mon Jul 02 2018 - 07:39:13 EST


Hello Michal!

----- Original Message -----
> From: "Michal Hocko" <mhocko@xxxxxxxxxx>
> To: "Rodrigo Freire" <rfreire@xxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
> Sent: Monday, July 2, 2018 8:29:06 AM
> Subject: Re: [PATCH] mm: be more informative in OOM task list
>
> On Mon 02-07-18 07:22:13, Rodrigo Freire wrote:
> > Hello Michal,
> >
> > ----- Original Message -----
> > > From: "Michal Hocko" <mhocko@xxxxxxxxxx>
> > > To: "Rodrigo Freire" <rfreire@xxxxxxxxxx>
> > > Cc: linux-mm@xxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
> > > Sent: Monday, July 2, 2018 6:30:43 AM
> > > Subject: Re: [PATCH] mm: be more informative in OOM task list
> > >
> > > On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> > > > The default page memory unit of OOM task dump events might not be
> > > > intuitive for the uninitiated when debugging OOM events. Add
> > > > a small printk prior to the task dump informing that the memory
> > > > units are actually memory _pages_.
> > >
> > > Does this really help? I understand the oom report might not be the
> > > easiest thing to grasp but wouldn't it be much better to actually add
> > > documentation with clarification of each part of it?
> >
> > That would be great: After a quick grep -ri for oom in Documentation,
> > I found several files, each describing its own OOM behavior tunables.
> > But Documentation indeed lacks a central, canonical file that
> > documents the OOM Killer behavior and workflows.
> >
> > However, I still stand by my proposed patch: It is unobtrusive, incurs
> > no performance cost, and clarifies the output. I recently worked on a
> > case (for full disclosure: I am a far cry from an MM expert) where the
> > sum of the RSS values made sense when interpreted as kB. Reason: There
> > were processes sharing (a good amount of) memory regions, which skewed
> > the interpretation and misled not only me, but some other colleagues
> > as well. The units were only sorted out after actually inspecting the
> > source code.
> >
> > This patch is user-friendly and can be a great time saver to others in
> > the community.
>
> Well, all other counters we print are in page units unless explicitly
> kB.

Your statement is correct, and I thought about that too. But then the
doubt arose: maybe someone simply forgot to state that these values are
in kB?

> So I am not sure we really need to do anything but document the
> output better. Maybe others will find it more important though.

The thing is, the output also led a few other colleagues to the very
same conclusion as me. That raised the flag and made me write the
patch: the output was indeed misleading.
And you may not have an MM- and OOM-versed specialist available all the
time! ;-)
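
To give a sense of how far off the kB reading can be, here is a minimal
sketch of the conversion. It assumes a 4 KiB page size, and both the
helper name pages_to_kb and the sample rss value are hypothetical, not
taken from a real OOM report:

```python
import os


def pages_to_kb(pages, page_size=None):
    """Convert a page count (as printed in the OOM task dump) to kB."""
    if page_size is None:
        # Ask the running system for its page size when none is given.
        page_size = os.sysconf("SC_PAGE_SIZE")
    return pages * page_size // 1024


# Hypothetical rss value; 4096 is an assumed page size for illustration.
rss_pages = 250000
print(pages_to_kb(rss_pages, page_size=4096))  # prints 1000000
```

With that page size, a column read as kB understates the real figure
by a factor of four.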

I still ask you to reconsider.

My best regards,

- RF.