Re: [PATCH 04/12] habanalabs: add unsupported functions

From: Greg KH
Date: Tue Jun 28 2022 - 05:12:54 EST


On Tue, Jun 28, 2022 at 11:21:24AM +0300, Oded Gabbay wrote:
> On Tue, Jun 28, 2022 at 9:34 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Mon, Jun 27, 2022 at 11:26:12PM +0300, Oded Gabbay wrote:
> > > There are a number of new ASIC-specific functions that were added
> > > for Gaudi2. To make the common code work, we need to define empty
> > > implementations of those functions for Goya and Gaudi.
> > >
> > > Some functions will return an error if called on Goya/Gaudi.
> > >
> > > Signed-off-by: Oded Gabbay <ogabbay@xxxxxxxxxx>
> > > ---
> > > drivers/misc/habanalabs/gaudi/gaudi.c | 24 ++++++++++++++++++++++++
> > > drivers/misc/habanalabs/goya/goya.c | 24 ++++++++++++++++++++++++
> > > 2 files changed, 48 insertions(+)
> > >
> > > diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
> > > index ae894335e9f8..f4581220ecd5 100644
> > > --- a/drivers/misc/habanalabs/gaudi/gaudi.c
> > > +++ b/drivers/misc/habanalabs/gaudi/gaudi.c
> > > @@ -8588,6 +8588,11 @@ static void gaudi_ctx_fini(struct hl_ctx *ctx)
> > > gaudi_internal_cb_pool_fini(ctx->hdev, ctx);
> > > }
> > >
> > > +int gaudi_pre_schedule_cs(struct hl_cs *cs)
> > > +{
> > > + return 0;
> > > +}
> > > +
> > > static u32 gaudi_get_queue_id_for_cq(struct hl_device *hdev, u32 cq_idx)
> > > {
> > > return gaudi_cq_assignment[cq_idx];
> > > @@ -8959,6 +8964,14 @@ static void gaudi_enable_events_from_fw(struct hl_device *hdev)
> > > gaudi_irq_map_table[GAUDI_EVENT_INTS_REGISTER].cpu_id);
> > > }
> > >
> > > +int gaudi_ack_mmu_page_fault_or_access_error(struct hl_device *hdev,
> > > + u64 mmu_cap_mask)
> > > +{
> > > + dev_err(hdev->dev, "mmu_error function is not supported\n");
> >
> > Can userspace trigger this? If so, make it a debug message, as you
> > don't want to give userspace a way to spam the logs.
>
> Only via a debugfs node, which is exposed only to the root user.
> What is your recommendation in that case?

What does this error message actually help with? Why not just return
an error and be done with it? No need to spam the kernel log, right?
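
As a rough illustration of the "just return an error" option, here is a
minimal sketch that compiles outside the kernel tree. The stub
struct hl_device, the use of uint64_t in place of u64, and the choice of
-EOPNOTSUPP are all assumptions for illustration; the quoted hunk does not
show what the function actually returns.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the driver's struct hl_device, present only
 * so this sketch builds in userspace.
 */
struct hl_device {
	int id;
};

/*
 * Silent stub for an ASIC that lacks this operation: the caller learns
 * about it from the return value, and nothing is written to the log.
 * -EOPNOTSUPP is an assumed error code, not taken from the patch.
 */
int gaudi_ack_mmu_page_fault_or_access_error(struct hl_device *hdev,
					     uint64_t mmu_cap_mask)
{
	(void)hdev;		/* unused in the stub */
	(void)mmu_cap_mask;	/* unused in the stub */

	return -EOPNOTSUPP;
}
```

The debugfs caller can then propagate the error code to whoever wrote to
the node, so root still sees the failure without a dev_err() line per
attempt.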

thanks,

greg k-h