Re: Habanalabs Open-Source TPC LLVM compiler and SynapseAI Core library

From: Daniel Vetter
Date: Thu Oct 28 2021 - 03:38:31 EST


On Wed, Oct 27, 2021 at 8:53 AM Oded Gabbay <ogabbay@xxxxxxxxxx> wrote:
>
> On Fri, Sep 10, 2021 at 10:58 AM Greg Kroah-Hartman
> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Sep 10, 2021 at 10:26:56AM +0300, Oded Gabbay wrote:
> > > Hi Greg,
> > >
> > > Following our conversations a couple of months ago, I'm happy to tell you that
> > > Habanalabs has open-sourced its TPC (Tensor Processing Core) LLVM compiler,
> > > which is a fork of the LLVM open-source project.
> > >
> > > The project can be found on Habanalabs GitHub website at:
> > > https://github.com/HabanaAI/tpc_llvm
> > >
> > > There is a companion guide on how to write TPC kernels at:
> > > https://docs.habana.ai/en/latest/TPC_User_Guide/TPC_User_Guide.html
> >
> > That's great news, thanks for pushing for this and releasing it all!
> >
> > greg k-h
>
> Hi Greg,
> I would like to update that yesterday AWS launched new EC2 instances
> powered by the Gaudi accelerators. It is now in general availability,
> and anyone can launch an instance with those devices.
> Therefore, one can now take the upstream driver, hl-thunk, tpc llvm
> compiler and SynapseAI core and execute compute kernels on the Gaudi
> devices. I have verified this to be working with the driver in kernel
> 5.15-rc6.

Nice!

Now that the llvm part is open, any plans to upstream that? Years ago
when amd upstreamed their backend there was the hope that llvm would
grow some competent support for gpu-style accelerator ISAs. But since
for years now amd's has been the only backend that ever got merged,
things are stuck in a chicken-and-egg situation: upstream llvm
complains about why the amd backend has all these special
requirements, and other accel backends (at least the gpu-style simd
ones) don't have a good path to upstream llvm, since a lot of the
infrastructure and understanding isn't there.

Getting a 2nd accel backend into upstream llvm would be a huge step
towards fixing this mess. As far as I know the only other open accel
backend based on llvm is intel's igc (for intel gpus), and that one is
such a massive fork, and has been out of upstream llvm for so long,
that it's not going to land anytime soon, if ever (in its current form
at least).

Once we do have an accel backend in upstream llvm we can finally start
building a real stack here I think, so whoever is first will gain
quite some advantage.

Cheers, Daniel

> We are still missing the networking parts, but I hope to start
> upstreaming them in the coming months.
>
> Thanks,
> Oded



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch