Re: [PATCH] habanalabs: improve error messages

From: Greg KH
Date: Fri Mar 29 2019 - 14:50:14 EST


On Fri, Mar 29, 2019 at 08:29:08PM +0300, Oded Gabbay wrote:
> On Fri, Mar 29, 2019 at 7:29 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Mar 28, 2019 at 09:13:13AM +0200, Oded Gabbay wrote:
> > > This patch improves two error messages to help the user to
> > > better understand what error occurred.
> > >
> > > Signed-off-by: Oded Gabbay <oded.gabbay@xxxxxxxxx>
> > > ---
> > > drivers/misc/habanalabs/command_submission.c | 3 ++-
> > > drivers/misc/habanalabs/memory.c | 2 +-
> > > 2 files changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
> > > index f908643f871f..02c48da0b645 100644
> > > --- a/drivers/misc/habanalabs/command_submission.c
> > > +++ b/drivers/misc/habanalabs/command_submission.c
> > > @@ -261,7 +261,8 @@ static void cs_timedout(struct work_struct *work)
> > > ctx_asid = cs->ctx->asid;
> > >
> > > /* TODO: add information about last signaled seq and last emitted seq */
> > > - dev_err(hdev->dev, "CS %d.%llu got stuck!\n", ctx_asid, cs->sequence);
> > > + dev_err(hdev->dev, "User %d command submission %llu got stuck!\n",
> > > + ctx_asid, cs->sequence);
> > >
> > > cs_put(cs);
> > >
> > > diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
> > > index ce1fda40a8b8..39788b1cf8d0 100644
> > > --- a/drivers/misc/habanalabs/memory.c
> > > +++ b/drivers/misc/habanalabs/memory.c
> > > @@ -109,7 +109,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
> > > page_size);
> > > if (!phys_pg_pack->pages[i]) {
> > > dev_err(hdev->dev,
> > > - "ioctl failed to allocate page\n");
> > > + "Failed to allocate device memory (out of memory)\n");
> >
> > No need for a message at all here, right? The core should have already
> > told you you had a problem.
> >
> > greg k-h
>
> No, I don't think so, because this function is called for allocating
> memory in the device's DRAM. So we don't pass through the Linux core
> code inside.
> We use the generic genalloc module to implement the device's DRAM
> physical page allocator, and AFAIK, you won't get any message in case
> genalloc fails to find a free memory in its pool.
> Even if genalloc prints something, I would prefer to display a more
> meaningful message to the user in this case. This allocation is
> directly requested by the user (its part of the IOCTL code) and he
> should know how it failed, no ?

Ok, that sounds reasonable, nevermind :)

greg k-h