Re: [PATCH 09/15] habanalabs: add sysfs and hwmon support

From: Mike Rapoport
Date: Fri Jan 25 2019 - 02:54:24 EST


On Wed, Jan 23, 2019 at 02:00:51AM +0200, Oded Gabbay wrote:
> This patch adds the sysfs and hwmon entries that are exposed by the driver.
>
> Goya has several sensors, from various categories such as temperature,
> voltage, current, etc. The driver exposes those sensors in the standard
> hwmon mechanism.
>
> In addition, the driver exposes a couple of interfaces in sysfs, both for
> configuration and for providing status of the device or driver.
>
> The configuration attributes are for Power Management:
> - Automatic or manual
> - Frequency value when moving to high frequency mode
> - Maximum power the device is allowed to consume
>
> The rest of the attributes are read-only and provide the following
> information:
> - Versions of the various firmwares running on the device
> - Contents of the device's EEPROM
> - The device type (currently only Goya is supported)
> - PCI address of the device (to allow user-space to map /dev/hlX to its
> PCI address)
> - Status of the device (operational, malfunction, in_reset)
> - How many processes are open on the device's file
>
> Signed-off-by: Oded Gabbay <oded.gabbay@xxxxxxxxx>
> ---
> .../ABI/testing/sysfs-driver-habanalabs | 190 ++++++
> drivers/misc/habanalabs/Makefile | 2 +-
> drivers/misc/habanalabs/device.c | 146 +++++
> drivers/misc/habanalabs/goya/Makefile | 2 +-
> drivers/misc/habanalabs/goya/goya.c | 230 +++++++
> drivers/misc/habanalabs/goya/goyaP.h | 21 +
> drivers/misc/habanalabs/goya/goya_hwmgr.c | 306 +++++++++
> drivers/misc/habanalabs/habanalabs.h | 97 +++
> drivers/misc/habanalabs/habanalabs_drv.c | 7 +
> drivers/misc/habanalabs/hwmon.c | 449 +++++++++++++
> drivers/misc/habanalabs/sysfs.c | 588 ++++++++++++++++++
> 11 files changed, 2036 insertions(+), 2 deletions(-)
> create mode 100644 Documentation/ABI/testing/sysfs-driver-habanalabs
> create mode 100644 drivers/misc/habanalabs/goya/goya_hwmgr.c
> create mode 100644 drivers/misc/habanalabs/hwmon.c
> create mode 100644 drivers/misc/habanalabs/sysfs.c
>
> diff --git a/Documentation/ABI/testing/sysfs-driver-habanalabs b/Documentation/ABI/testing/sysfs-driver-habanalabs
> new file mode 100644
> index 000000000000..19edd4da87c1
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-habanalabs
> @@ -0,0 +1,190 @@
> +What: /sys/class/habanalabs/hl<n>/armcp_kernel_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Version of the Linux kernel running on the device's CPU
> +
> +What: /sys/class/habanalabs/hl<n>/armcp_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Version of the application running on the device's CPU
> +
> +What: /sys/class/habanalabs/hl<n>/cpld_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Version of the Device's CPLD F/W
> +
> +What: /sys/class/habanalabs/hl<n>/device_type
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays the code name of the device according to its type.
> + The supported values are: "GOYA"
> +
> +What: /sys/class/habanalabs/hl<n>/eeprom
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: A binary file attribute that contains the contents of the
> + on-board EEPROM
> +
> +What: /sys/class/habanalabs/hl<n>/fuse_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays the device's version from the eFuse
> +
> +What: /sys/class/habanalabs/hl<n>/hard_reset
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Interface to trigger a hard-reset operation for the device.
> + Hard-reset will reset ALL internal components of the device
> + except for the PCI interface and the internal PLLs
> +
> +What: /sys/class/habanalabs/hl<n>/hard_reset_cnt
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays how many times the device has undergone a hard-reset
> + operation
> +
> +What: /sys/class/habanalabs/hl<n>/high_pll
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Allows the user to set the maximum clock frequency for MME, TPC
> + and IC when the power management profile is set to "automatic".
> +
> +What: /sys/class/habanalabs/hl<n>/ic_clk
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Allows the user to set the maximum clock frequency of the
> + Interconnect fabric. Writes to this parameter affect the device
> + only when the power management profile is set to "manual" mode.
> + The device IC clock might be set to a lower value than the
> + maximum. The user should read the ic_clk_curr to see the actual
> + frequency value of the IC
> +
> +What: /sys/class/habanalabs/hl<n>/ic_clk_curr
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays the current clock frequency of the Interconnect fabric
> +
> +What: /sys/class/habanalabs/hl<n>/infineon_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Version of the Device's power supply F/W code
> +
> +What: /sys/class/habanalabs/hl<n>/max_power
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Allows the user to set the maximum power consumption of the
> + device in milliwatts.
> +
> +What: /sys/class/habanalabs/hl<n>/mme_clk
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Allows the user to set the maximum clock frequency of the
> + MME compute engine. Writes to this parameter affect the device
> + only when the power management profile is set to "manual" mode.
> + The device MME clock might be set to a lower value than the
> + maximum. The user should read the mme_clk_curr to see the actual
> + frequency value of the MME
> +
> +What: /sys/class/habanalabs/hl<n>/mme_clk_curr
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays the current clock frequency of the MME compute engine
> +
> +What: /sys/class/habanalabs/hl<n>/pci_addr
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays the PCI address of the device. This is needed so the
> + user would be able to open a device based on its PCI address
> +
> +What: /sys/class/habanalabs/hl<n>/pm_mng_profile
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Power management profile. Values are "auto", "manual". In "auto"
> + mode, the driver will set the maximum clock frequency to a high
> + value when a user-space process opens the device's file (unless
> + it was already opened by another process). The driver will set
> + the max clock frequency to a low value when there are no user
> + processes that are opened on the device's file. In "manual"
> + mode, the user sets the maximum clock frequency by writing to
> + ic_clk, mme_clk and tpc_clk
> +
> +
> +What: /sys/class/habanalabs/hl<n>/preboot_btl_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Version of the device's preboot F/W code
> +
> +What: /sys/class/habanalabs/hl<n>/soft_reset
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Interface to trigger a soft-reset operation for the device.
> + Soft-reset will reset only the compute and DMA engines of the
> + device
> +
> +What: /sys/class/habanalabs/hl<n>/soft_reset_cnt
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays how many times the device has undergone a soft-reset
> + operation
> +
> +What: /sys/class/habanalabs/hl<n>/status
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Status of the card: "Operational", "Malfunction", "In reset".
> +
> +What: /sys/class/habanalabs/hl<n>/thermal_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Version of the Device's thermal daemon
> +
> +What: /sys/class/habanalabs/hl<n>/tpc_clk
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Allows the user to set the maximum clock frequency of the
> + TPC compute engines. Writes to this parameter affect the device
> + only when the power management profile is set to "manual" mode.
> + The device TPC clock might be set to a lower value than the
> + maximum. The user should read the tpc_clk_curr to see the actual
> + frequency value of the TPC
> +
> +What: /sys/class/habanalabs/hl<n>/tpc_clk_curr
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays the current clock frequency of the TPC compute engines
> +
> +What: /sys/class/habanalabs/hl<n>/uboot_ver
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Version of the u-boot running on the device's CPU
> +
> +What: /sys/class/habanalabs/hl<n>/write_open_cnt
> +Date: Jan 2019
> +KernelVersion: 5.1
> +Contact: oded.gabbay@xxxxxxxxx
> +Description: Displays the total number of user processes that are currently
> + opened on the device's file
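
Side note on the ABI file: the text attributes above each expose a single value, so a user-space consumer would presumably read them along these lines. This is only a sketch. read_text_attr() is a name made up for the example, and the hl0 instance in the note below is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read a single-line text attribute into buf and strip the trailing
 * newline. Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_text_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path && buf && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* sysfs show() callbacks end the value with '\n'; drop it */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

For example, read_text_attr("/sys/class/habanalabs/hl0/status", buf, sizeof(buf)) would be expected to leave one of the status strings listed above in buf.
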
> diff --git a/drivers/misc/habanalabs/Makefile b/drivers/misc/habanalabs/Makefile
> index c07f3ccb57dc..b5607233d216 100644
> --- a/drivers/misc/habanalabs/Makefile
> +++ b/drivers/misc/habanalabs/Makefile
> @@ -5,7 +5,7 @@
> obj-m := habanalabs.o
>
> habanalabs-y := habanalabs_drv.o device.o context.o asid.o habanalabs_ioctl.o \
> - command_buffer.o hw_queue.o irq.o
> + command_buffer.o hw_queue.o irq.o sysfs.o hwmon.o
>
> include $(src)/goya/Makefile
> habanalabs-y += $(HL_GOYA_FILES)
> diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
> index 9199e070e79e..ff7b610f18c4 100644
> --- a/drivers/misc/habanalabs/device.c
> +++ b/drivers/misc/habanalabs/device.c
> @@ -226,6 +226,118 @@ static void device_early_fini(struct hl_device *hdev)
> mutex_destroy(&hdev->device_open);
> }
>
> +static void set_freq_to_low_job(struct work_struct *work)
> +{
> + struct hl_device *hdev = container_of(work, struct hl_device,
> + work_freq.work);
> +
> + if (atomic_read(&hdev->fd_open_cnt) == 0)
> + hl_device_set_frequency(hdev, PLL_LOW);
> +
> + schedule_delayed_work(&hdev->work_freq,
> + usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC));
> +}
> +
> +/**
> + * device_late_init - do late stuff initialization for the habanalabs device
> + *
> + * @hdev: pointer to habanalabs device structure
> + *
> + * Do stuff that either needs the device H/W queues to be active or needs
> + * to happen after all the rest of the initialization is finished
> + */
> +static int device_late_init(struct hl_device *hdev)
> +{
> + int rc;
> +
> + INIT_DELAYED_WORK(&hdev->work_freq, set_freq_to_low_job);
> + hdev->high_pll = hdev->asic_prop.high_pll;
> +
> + /* force setting to low frequency */
> + atomic_set(&hdev->curr_pll_profile, PLL_LOW);
> +
> + if (hdev->pm_mng_profile == PM_AUTO)
> + hdev->asic_funcs->set_pll_profile(hdev, PLL_LOW);
> + else
> + hdev->asic_funcs->set_pll_profile(hdev, PLL_LAST);
> +
> + if (hdev->asic_funcs->late_init) {
> + rc = hdev->asic_funcs->late_init(hdev);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed late initialization for the H/W\n");
> + return rc;
> + }
> + }
> +
> + schedule_delayed_work(&hdev->work_freq,
> + usecs_to_jiffies(HL_PLL_LOW_JOB_FREQ_USEC));
> +
> + hdev->late_init_done = true;
> +
> + return 0;
> +}
> +
> +/**
> + * device_late_fini - finalize all that was done in device_late_init
> + *
> + * @hdev: pointer to habanalabs device structure
> + *
> + */
> +static void device_late_fini(struct hl_device *hdev)
> +{
> + if (!hdev->late_init_done)
> + return;
> +
> + cancel_delayed_work_sync(&hdev->work_freq);
> +
> + if (hdev->asic_funcs->late_fini)
> + hdev->asic_funcs->late_fini(hdev);
> +
> + hdev->late_init_done = false;
> +}
> +
> +/**
> + * hl_device_set_frequency - set the frequency of the device
> + *
> + * @hdev: pointer to habanalabs device structure
> + * @freq: the new frequency value
> + *
> + * Change the frequency if needed.
> + * We allow setting the PLL to low only if there is no user process
> + * Returns 0 if no change was made, otherwise returns 1
> + */
> +int hl_device_set_frequency(struct hl_device *hdev, enum hl_pll_frequency freq)
> +{
> + enum hl_pll_frequency old_freq =
> + (freq == PLL_HIGH) ? PLL_LOW : PLL_HIGH;
> + int ret;
> +
> + if (hdev->pm_mng_profile == PM_MANUAL)
> + return 0;
> +
> + ret = atomic_cmpxchg(&hdev->curr_pll_profile, old_freq, freq);
> + if (ret == freq)
> + return 0;
> +
> + /*
> + * in case we want to lower frequency, check if device is not
> + * opened. We must have a check here to workaround race condition with
> + * hl_device_open
> + */
> + if ((freq == PLL_LOW) && (atomic_read(&hdev->fd_open_cnt) > 0)) {
> + atomic_set(&hdev->curr_pll_profile, PLL_HIGH);
> + return 0;
> + }
> +
> + dev_dbg(hdev->dev, "Changing device frequency to %s\n",
> + freq == PLL_HIGH ? "high" : "low");
> +
> + hdev->asic_funcs->set_pll_profile(hdev, freq);
> +
> + return 1;
> +}
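
The transition logic in hl_device_set_frequency() is easier to follow in isolation. Below is a user-space model of it using C11 atomics, only as a sketch: model_dev, model_set_frequency() and the hw_writes counter are made up for the example. It assumes curr_pll_profile only ever holds PLL_LOW or PLL_HIGH, and it leaves out the PM_MANUAL early return.

```c
#include <assert.h>
#include <stdatomic.h>

enum pll_profile { PLL_LOW, PLL_HIGH };

struct model_dev {
	atomic_int curr_pll_profile;
	atomic_int fd_open_cnt;
	int hw_writes;		/* counts simulated set_pll_profile() calls */
};

/* Returns 1 if the (simulated) hardware was reprogrammed, 0 otherwise. */
static int model_set_frequency(struct model_dev *dev, enum pll_profile freq)
{
	int expected = (freq == PLL_HIGH) ? PLL_LOW : PLL_HIGH;

	assert(freq == PLL_LOW || freq == PLL_HIGH);

	/* With two states, a failed exchange means we are already at freq */
	if (!atomic_compare_exchange_strong(&dev->curr_pll_profile,
					    &expected, freq))
		return 0;

	/* Do not go low while someone still has the device open */
	if (freq == PLL_LOW && atomic_load(&dev->fd_open_cnt) > 0) {
		atomic_store(&dev->curr_pll_profile, PLL_HIGH);
		return 0;
	}

	dev->hw_writes++;
	return 1;
}
```

Note that between the exchange and the fd_open_cnt check the profile briefly reads PLL_LOW before it is put back to PLL_HIGH, which may be worth a word in the comment.
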
> +
> /**
> * hl_device_suspend - initiate device suspend
> *
> @@ -386,6 +498,12 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
> goto release_ctx;
> }
>
> + rc = hl_sysfs_init(hdev);
> + if (rc) {
> + dev_err(hdev->dev, "failed to initialize sysfs\n");
> + goto free_cb_pool;
> + }
> +
> rc = hdev->asic_funcs->hw_init(hdev);
> if (rc) {
> dev_err(hdev->dev, "failed to initialize the H/W\n");
> @@ -403,11 +521,33 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
> goto out_disabled;
> }
>
> + /* After test_queues, KMD can start sending messages to device CPU */
> +
> + rc = device_late_init(hdev);
> + if (rc) {
> + dev_err(hdev->dev, "Failed late initialization\n");
> + rc = 0;

Isn't this an error? rc is reset to 0 right before the goto, so the error
code from device_late_init() is dropped.
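
If it is, keeping the error code could look roughly like this. It is a user-space sketch of the flow only: model_dev, model_late_init() and what happens under out_disabled are made up for the example, not taken from the driver. If returning 0 is deliberate (say, to keep a disabled device around for debugging), a comment saying so would help.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_dev {
	bool disabled;
	bool fail_late_init;	/* test knob: force the late init step to fail */
};

static int model_late_init(struct model_dev *dev)
{
	return dev->fail_late_init ? -EIO : 0;
}

static int model_device_init(struct model_dev *dev)
{
	int rc;

	assert(dev);

	rc = model_late_init(dev);
	if (rc)
		goto out_disabled;	/* rc is left untouched */

	dev->disabled = false;
	return 0;

out_disabled:
	dev->disabled = true;
	return rc;		/* the caller sees the original error */
}
```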

> + goto out_disabled;
> + }
> +
> + dev_info(hdev->dev, "Found %s device with %lluGB DRAM\n",
> + hdev->asic_name,
> + hdev->asic_prop.dram_size / 1024 / 1024 / 1024);
> +
> + rc = hl_hwmon_init(hdev);
> + if (rc) {
> + dev_err(hdev->dev, "Failed to initialize hwmon\n");
> + rc = 0;

Ditto: the error code from hl_hwmon_init() is dropped as well.

> + goto out_disabled;
> + }
> +
> dev_notice(hdev->dev,
> "Successfully added device to habanalabs driver\n");
>
> return 0;
>
> +free_cb_pool:
> + hl_cb_pool_fini(hdev);
> release_ctx:
> if (hl_ctx_put(hdev->kernel_ctx) != 1)
> dev_err(hdev->dev,
> @@ -457,6 +597,12 @@ void hl_device_fini(struct hl_device *hdev)
> /* Mark device as disabled */
> hdev->disabled = true;
>
> + hl_hwmon_fini(hdev);
> +
> + device_late_fini(hdev);
> +
> + hl_sysfs_fini(hdev);
> +
> /*
> * Halt the engines and disable interrupts so we won't get any more
> * completions from H/W and we won't have any accesses from the
> diff --git a/drivers/misc/habanalabs/goya/Makefile b/drivers/misc/habanalabs/goya/Makefile
> index a57096fa41b6..ada8518ec215 100644
> --- a/drivers/misc/habanalabs/goya/Makefile
> +++ b/drivers/misc/habanalabs/goya/Makefile
> @@ -1,3 +1,3 @@
> subdir-ccflags-y += -I$(src)
>
> -HL_GOYA_FILES := goya/goya.o goya/goya_security.o
> \ No newline at end of file
> +HL_GOYA_FILES := goya/goya.o goya/goya_security.o goya/goya_hwmgr.o
> \ No newline at end of file
> diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
> index 6c04277ae0fa..7899ff762e0b 100644
> --- a/drivers/misc/habanalabs/goya/goya.c
> +++ b/drivers/misc/habanalabs/goya/goya.c
> @@ -127,6 +127,8 @@ static const char *goya_axi_name[GOYA_MAX_INITIATORS] = {
>
> #define GOYA_ASYC_EVENT_GROUP_NON_FATAL_SIZE 121
>
> +static int goya_armcp_info_get(struct hl_device *hdev);
> +
> static void goya_get_fixed_properties(struct hl_device *hdev)
> {
> struct asic_fixed_properties *prop = &hdev->asic_prop;
> @@ -174,6 +176,7 @@ static void goya_get_fixed_properties(struct hl_device *hdev)
> prop->num_of_events = GOYA_ASYNC_EVENT_ID_SIZE;
> prop->cb_pool_cb_cnt = GOYA_CB_POOL_CB_CNT;
> prop->cb_pool_cb_size = GOYA_CB_POOL_CB_SIZE;
> + prop->max_power_default = MAX_POWER_DEFAULT;
> prop->tpc_enabled_mask = TPC_ENABLED_MASK;
>
> prop->high_pll = PLL_HIGH_DEFAULT;
> @@ -558,6 +561,89 @@ int goya_early_fini(struct hl_device *hdev)
> return 0;
> }
>
> +/**
> + * goya_fetch_psoc_frequency - Fetch PSOC frequency values
> + *
> + * @hdev: pointer to hl_device structure
> + *
> + */
> +static void goya_fetch_psoc_frequency(struct hl_device *hdev)
> +{
> + struct asic_fixed_properties *prop = &hdev->asic_prop;
> +
> + prop->psoc_pci_pll_nr = RREG32(mmPSOC_PCI_PLL_NR);
> + prop->psoc_pci_pll_nf = RREG32(mmPSOC_PCI_PLL_NF);
> + prop->psoc_pci_pll_od = RREG32(mmPSOC_PCI_PLL_OD);
> + prop->psoc_pci_pll_div_factor = RREG32(mmPSOC_PCI_PLL_DIV_FACTOR_1);
> +}
> +
> +/**
> + * goya_late_init - GOYA late initialization code
> + *
> + * @hdev: pointer to hl_device structure
> + *
> + * Get ArmCP info and send message to CPU to enable PCI access
> + */
> +static int goya_late_init(struct hl_device *hdev)
> +{
> + struct asic_fixed_properties *prop = &hdev->asic_prop;
> + struct goya_device *goya = hdev->asic_specific;
> + int rc;
> +
> + rc = goya->armcp_info_get(hdev);
> + if (rc) {
> + dev_err(hdev->dev, "Failed to get armcp info\n");
> + return rc;
> + }
> +
> + /* Now that we have the DRAM size in ASIC prop, we need to check
> + * its size and configure the DMA_IF DDR wrap protection (which is in
> + * the MMU block) accordingly. The value is the log2 of the DRAM size
> + */
> + WREG32(mmMMU_LOG2_DDR_SIZE, ilog2(prop->dram_size));
> +
> + rc = goya_send_pci_access_msg(hdev, ARMCP_PACKET_ENABLE_PCI_ACCESS);
> + if (rc) {
> + dev_err(hdev->dev, "Failed to enable PCI access from CPU\n");
> + return rc;
> + }
> +
> + WREG32(mmGIC_DISTRIBUTOR__5_GICD_SETSPI_NSR,
> + GOYA_ASYNC_EVENT_ID_INTS_REGISTER);
> +
> + goya_fetch_psoc_frequency(hdev);
> +
> + return 0;
> +}
> +
> +/**
> + * goya_late_fini - GOYA late tear-down code
> + *
> + * @hdev: pointer to hl_device structure
> + *
> + * Free sensors allocated structures
> + */
> +void goya_late_fini(struct hl_device *hdev)
> +{
> + const struct hwmon_channel_info **channel_info_arr;
> + int i = 0;
> +
> + if (!hdev->hl_chip_info.info)
> + return;
> +
> + channel_info_arr = hdev->hl_chip_info.info;
> +
> + while (channel_info_arr[i]) {
> + kfree(channel_info_arr[i]->config);
> + kfree(channel_info_arr[i]);
> + i++;
> + }
> +
> + kfree(channel_info_arr);
> +
> + hdev->hl_chip_info.info = NULL;
> +}
> +
> /**
> * goya_sw_init - Goya software initialization code
> *
> @@ -575,9 +661,15 @@ static int goya_sw_init(struct hl_device *hdev)
> return -ENOMEM;
>
> goya->test_cpu_queue = goya_test_cpu_queue;
> + goya->armcp_info_get = goya_armcp_info_get;
>
> /* according to goya_init_iatu */
> goya->ddr_bar_cur_addr = DRAM_PHYS_BASE;
> +
> + goya->mme_clk = GOYA_PLL_FREQ_LOW;
> + goya->tpc_clk = GOYA_PLL_FREQ_LOW;
> + goya->ic_clk = GOYA_PLL_FREQ_LOW;
> +
> hdev->asic_specific = goya;
>
> /* Create DMA pool for small allocations */
> @@ -4272,6 +4364,87 @@ void *goya_get_events_stat(struct hl_device *hdev, u32 *size)
> return goya->events_stat;
> }
>
> +static int goya_armcp_info_get(struct hl_device *hdev)
> +{
> + struct goya_device *goya = hdev->asic_specific;
> + struct asic_fixed_properties *prop = &hdev->asic_prop;
> + struct armcp_packet pkt;
> + void *armcp_info_cpu_addr;
> + dma_addr_t armcp_info_dma_addr;
> + u64 dram_size;
> + long result;
> + int rc;
> +
> + if (!(goya->hw_cap_initialized & HW_CAP_CPU_Q))
> + return 0;
> +
> + armcp_info_cpu_addr =
> + hdev->asic_funcs->cpu_accessible_dma_pool_alloc(hdev,
> + sizeof(struct armcp_info), &armcp_info_dma_addr);
> + if (!armcp_info_cpu_addr) {
> + dev_err(hdev->dev,
> + "Failed to allocate DMA memory for ArmCP info packet\n");
> + return -ENOMEM;
> + }
> +
> + memset(armcp_info_cpu_addr, 0, sizeof(struct armcp_info));

Do you expect any use of cpu_accessible_dma_pool_alloc() that does not need
the memory cleared?
If not, the memset(0) can be moved inside that function.
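
Something like the following is what I have in mind, shown as a user-space model rather than driver code. model_backend_alloc() and model_cpu_accessible_pool_alloc() are names made up for the example, and the 0xA5 fill only simulates stale data left in the pool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the underlying pool: returns memory with stale contents */
static void *model_backend_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		memset(p, 0xA5, size);
	return p;
}

/* The allocator clears the buffer itself, so callers need no memset() */
static void *model_cpu_accessible_pool_alloc(size_t size)
{
	void *p;

	assert(size > 0);

	p = model_backend_alloc(size);
	if (p)
		memset(p, 0, size);
	return p;
}
```

With that, both goya_armcp_info_get() and goya_get_eeprom_data() could drop their own memset() of the DMA buffer.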

> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_INFO_GET;
> + pkt.addr = armcp_info_dma_addr + prop->host_phys_base_address;
> + pkt.data_max_size = sizeof(struct armcp_info);
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + GOYA_ARMCP_INFO_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to send armcp info pkt, error %d\n", rc);
> + goto out;
> + }
> +
> + memcpy(&prop->armcp_info, armcp_info_cpu_addr,
> + sizeof(prop->armcp_info));
> +
> + dram_size = prop->armcp_info.dram_size;
> + if (dram_size) {
> + if ((!is_power_of_2(dram_size)) ||
> + (dram_size < DRAM_PHYS_DEFAULT_SIZE)) {
> + dev_err(hdev->dev,
> + "F/W reported invalid DRAM size %llu. Trying to use default size\n",
> + dram_size);
> + dram_size = DRAM_PHYS_DEFAULT_SIZE;
> + }
> +
> + prop->dram_size = dram_size;
> + prop->dram_end_address = prop->dram_base_address + dram_size;
> + }
> +
> + rc = hl_build_hwmon_channel_info(hdev, prop->armcp_info.sensors);
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to build hwmon channel info, error %d\n", rc);
> + rc = -EFAULT;
> + goto out;
> + }
> +
> +out:
> + hdev->asic_funcs->cpu_accessible_dma_pool_free(hdev,
> + sizeof(struct armcp_info), armcp_info_cpu_addr);
> +
> + return rc;
> +}
> +
> +static void goya_init_clock_gating(struct hl_device *hdev)
> +{
> +
> +}
> +
> +static void goya_disable_clock_gating(struct hl_device *hdev)
> +{
> +
> +}
>
> static void goya_hw_queues_lock(struct hl_device *hdev)
> {
> @@ -4287,9 +4460,60 @@ static void goya_hw_queues_unlock(struct hl_device *hdev)
> spin_unlock(&goya->hw_queues_lock);
> }
>
> +int goya_get_eeprom_data(struct hl_device *hdev, void *data, size_t max_size)
> +{
> + struct goya_device *goya = hdev->asic_specific;
> + struct asic_fixed_properties *prop = &hdev->asic_prop;
> + struct armcp_packet pkt;
> + void *eeprom_info_cpu_addr;
> + dma_addr_t eeprom_info_dma_addr;
> + long result;
> + int rc;
> +
> + if (!(goya->hw_cap_initialized & HW_CAP_CPU_Q))
> + return 0;
> +
> + eeprom_info_cpu_addr =
> + hdev->asic_funcs->cpu_accessible_dma_pool_alloc(hdev,
> + max_size, &eeprom_info_dma_addr);
> + if (!eeprom_info_cpu_addr) {
> + dev_err(hdev->dev,
> + "Failed to allocate DMA memory for EEPROM info packet\n");
> + return -ENOMEM;
> + }
> +
> + memset(eeprom_info_cpu_addr, 0, max_size);
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_EEPROM_DATA_GET;
> + pkt.addr = eeprom_info_dma_addr + prop->host_phys_base_address;
> + pkt.data_max_size = max_size;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + GOYA_ARMCP_EEPROM_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to send armcp EEPROM pkt, error %d\n", rc);
> + goto out;
> + }
> +
> + /* result contains the actual size */
> + memcpy(data, eeprom_info_cpu_addr, min((size_t)result, max_size));
> +
> +out:
> + hdev->asic_funcs->cpu_accessible_dma_pool_free(hdev, max_size,
> + eeprom_info_cpu_addr);
> +
> + return rc;
> +}
> +
> static const struct hl_asic_funcs goya_funcs = {
> .early_init = goya_early_init,
> .early_fini = goya_early_fini,
> + .late_init = goya_late_init,
> + .late_fini = goya_late_fini,
> .sw_init = goya_sw_init,
> .sw_fini = goya_sw_fini,
> .hw_init = goya_hw_init,
> @@ -4310,10 +4534,16 @@ static const struct hl_asic_funcs goya_funcs = {
> .cpu_accessible_dma_pool_alloc = goya_cpu_accessible_dma_pool_alloc,
> .cpu_accessible_dma_pool_free = goya_cpu_accessible_dma_pool_free,
> .update_eq_ci = goya_update_eq_ci,
> + .add_device_attr = goya_add_device_attr,
> + .remove_device_attr = goya_remove_device_attr,
> .handle_eqe = goya_handle_eqe,
> + .set_pll_profile = goya_set_pll_profile,
> .get_events_stat = goya_get_events_stat,
> + .enable_clock_gating = goya_init_clock_gating,
> + .disable_clock_gating = goya_disable_clock_gating,
> .hw_queues_lock = goya_hw_queues_lock,
> .hw_queues_unlock = goya_hw_queues_unlock,
> + .get_eeprom_data = goya_get_eeprom_data,
> .send_cpu_message = goya_send_cpu_message
> };
>
> diff --git a/drivers/misc/habanalabs/goya/goyaP.h b/drivers/misc/habanalabs/goya/goyaP.h
> index c6bfcb6c6905..42e8b1baef2f 100644
> --- a/drivers/misc/habanalabs/goya/goyaP.h
> +++ b/drivers/misc/habanalabs/goya/goyaP.h
> @@ -48,7 +48,10 @@
>
> #define PLL_HIGH_DEFAULT 1575000000 /* 1.575 GHz */
>
> +#define MAX_POWER_DEFAULT 200000 /* 200W */
> +
> #define GOYA_ARMCP_INFO_TIMEOUT 10000000 /* 10s */
> +#define GOYA_ARMCP_EEPROM_TIMEOUT 10000000 /* 10s */
>
> #define DRAM_PHYS_DEFAULT_SIZE 0x100000000ull /* 4GB */
>
> @@ -119,9 +122,15 @@ enum goya_fw_component {
>
> struct goya_device {
> int (*test_cpu_queue)(struct hl_device *hdev);
> + int (*armcp_info_get)(struct hl_device *hdev);
>
> /* TODO: remove hw_queues_lock after moving to scheduler code */
> spinlock_t hw_queues_lock;
> +
> + u64 mme_clk;
> + u64 tpc_clk;
> + u64 ic_clk;
> +
> u64 ddr_bar_cur_addr;
> u32 events_stat[GOYA_ASYNC_EVENT_ID_SIZE];
> u32 hw_cap_initialized;
> @@ -130,6 +139,18 @@ struct goya_device {
> int goya_test_cpu_queue(struct hl_device *hdev);
> int goya_send_cpu_message(struct hl_device *hdev, u32 *msg, u16 len,
> u32 timeout, long *result);
> +long goya_get_temperature(struct hl_device *hdev, int sensor_index, u32 attr);
> +long goya_get_voltage(struct hl_device *hdev, int sensor_index, u32 attr);
> +long goya_get_current(struct hl_device *hdev, int sensor_index, u32 attr);
> +long goya_get_fan_speed(struct hl_device *hdev, int sensor_index, u32 attr);
> +long goya_get_pwm_info(struct hl_device *hdev, int sensor_index, u32 attr);
> +void goya_set_pwm_info(struct hl_device *hdev, int sensor_index, u32 attr,
> + long value);
> +void goya_set_pll_profile(struct hl_device *hdev, enum hl_pll_frequency freq);
> +int goya_add_device_attr(struct hl_device *hdev);
> +void goya_remove_device_attr(struct hl_device *hdev);
> void goya_init_security(struct hl_device *hdev);
> +u64 goya_get_max_power(struct hl_device *hdev);
> +void goya_set_max_power(struct hl_device *hdev, u64 value);
>
> #endif /* GOYAP_H_ */
> diff --git a/drivers/misc/habanalabs/goya/goya_hwmgr.c b/drivers/misc/habanalabs/goya/goya_hwmgr.c
> new file mode 100644
> index 000000000000..866d1774b2e4
> --- /dev/null
> +++ b/drivers/misc/habanalabs/goya/goya_hwmgr.c
> @@ -0,0 +1,306 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Copyright 2016-2018 HabanaLabs, Ltd.
> + * All Rights Reserved.
> + */
> +
> +#include "goyaP.h"
> +
> +void goya_set_pll_profile(struct hl_device *hdev, enum hl_pll_frequency freq)
> +{
> + struct goya_device *goya = hdev->asic_specific;
> +
> + switch (freq) {
> + case PLL_HIGH:
> + hl_set_frequency(hdev, MME_PLL, hdev->high_pll);
> + hl_set_frequency(hdev, TPC_PLL, hdev->high_pll);
> + hl_set_frequency(hdev, IC_PLL, hdev->high_pll);
> + break;
> + case PLL_LOW:
> + hl_set_frequency(hdev, MME_PLL, GOYA_PLL_FREQ_LOW);
> + hl_set_frequency(hdev, TPC_PLL, GOYA_PLL_FREQ_LOW);
> + hl_set_frequency(hdev, IC_PLL, GOYA_PLL_FREQ_LOW);
> + break;
> + case PLL_LAST:
> + hl_set_frequency(hdev, MME_PLL, goya->mme_clk);
> + hl_set_frequency(hdev, TPC_PLL, goya->tpc_clk);
> + hl_set_frequency(hdev, IC_PLL, goya->ic_clk);
> + break;
> + default:
> + dev_err(hdev->dev, "unknown frequency setting\n");
> + }
> +}
> +
> +static ssize_t mme_clk_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long value;
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + value = hl_get_frequency(hdev, MME_PLL, false);
> +
> + if (value < 0)
> + return value;
> +
> + return snprintf(buf, PAGE_SIZE, "%lu\n", value);
> +}
> +
> +static ssize_t mme_clk_store(struct device *dev, struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + struct goya_device *goya = hdev->asic_specific;
> + int rc;
> + long value;
> +
> + if (hdev->disabled) {
> + count = -ENODEV;
> + goto fail;
> + }
> +
> + if (hdev->pm_mng_profile == PM_AUTO) {
> + count = -EPERM;
> + goto fail;
> + }
> +
> + rc = kstrtoul(buf, 0, &value);
> +
> + if (rc) {
> + count = -EINVAL;
> + goto fail;
> + }
> +
> + hl_set_frequency(hdev, MME_PLL, value);
> + goya->mme_clk = value;
> +
> +fail:
> + return count;
> +}
> +
> +static ssize_t tpc_clk_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long value;
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + value = hl_get_frequency(hdev, TPC_PLL, false);
> +
> + if (value < 0)
> + return value;
> +
> + return snprintf(buf, PAGE_SIZE, "%lu\n", value);
> +}
> +
> +static ssize_t tpc_clk_store(struct device *dev, struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + struct goya_device *goya = hdev->asic_specific;
> + int rc;
> + long value;
> +
> + if (hdev->disabled) {
> + count = -ENODEV;
> + goto fail;
> + }
> +
> + if (hdev->pm_mng_profile == PM_AUTO) {
> + count = -EPERM;
> + goto fail;
> + }
> +
> + rc = kstrtoul(buf, 0, &value);
> +
> + if (rc) {
> + count = -EINVAL;
> + goto fail;
> + }
> +
> + hl_set_frequency(hdev, TPC_PLL, value);
> + goya->tpc_clk = value;
> +
> +fail:
> + return count;
> +}
> +
> +static ssize_t ic_clk_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long value;
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + value = hl_get_frequency(hdev, IC_PLL, false);
> +
> + if (value < 0)
> + return value;
> +
> + return snprintf(buf, PAGE_SIZE, "%lu\n", value);
> +}
> +
> +static ssize_t ic_clk_store(struct device *dev, struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + struct goya_device *goya = hdev->asic_specific;
> + int rc;
> + long value;
> +
> + if (hdev->disabled) {
> + count = -ENODEV;
> + goto fail;
> + }
> +
> + if (hdev->pm_mng_profile == PM_AUTO) {
> + count = -EPERM;
> + goto fail;
> + }
> +
> + rc = kstrtoul(buf, 0, &value);
> +
> + if (rc) {
> + count = -EINVAL;
> + goto fail;
> + }
> +
> + hl_set_frequency(hdev, IC_PLL, value);
> + goya->ic_clk = value;
> +
> +fail:
> + return count;
> +}
> +
> +static ssize_t mme_clk_curr_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long value;
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + value = hl_get_frequency(hdev, MME_PLL, true);
> +
> + if (value < 0)
> + return value;
> +
> + return snprintf(buf, PAGE_SIZE, "%lu\n", value);
> +}
> +
> +static ssize_t tpc_clk_curr_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long value;
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + value = hl_get_frequency(hdev, TPC_PLL, true);
> +
> + if (value < 0)
> + return value;
> +
> + return snprintf(buf, PAGE_SIZE, "%lu\n", value);
> +}
> +
> +static ssize_t ic_clk_curr_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long value;
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + value = hl_get_frequency(hdev, IC_PLL, true);
> +
> + if (value < 0)
> + return value;
> +
> + return snprintf(buf, PAGE_SIZE, "%lu\n", value);
> +}
> +
> +static DEVICE_ATTR_RW(mme_clk);
> +static DEVICE_ATTR_RW(tpc_clk);
> +static DEVICE_ATTR_RW(ic_clk);
> +static DEVICE_ATTR_RO(mme_clk_curr);
> +static DEVICE_ATTR_RO(tpc_clk_curr);
> +static DEVICE_ATTR_RO(ic_clk_curr);
> +
> +int goya_add_device_attr(struct hl_device *hdev)
> +{
> + int rc;
> +
> + rc = device_create_file(hdev->dev, &dev_attr_mme_clk);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file mme_clk\n");
> + return rc;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_tpc_clk);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file tpc_clk\n");
> + goto remove_mme_clk;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_ic_clk);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file ic_clk\n");
> + goto remove_tpc_clk;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_mme_clk_curr);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file mme_clk_curr\n");
> + goto remove_ic_clk;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_tpc_clk_curr);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file tpc_clk_curr\n");
> + goto remove_mme_clk_curr;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_ic_clk_curr);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file ic_clk_curr\n");
> + goto remove_tpc_clk_curr;
> + }
> +
> + return 0;
> +
> +remove_tpc_clk_curr:
> + device_remove_file(hdev->dev, &dev_attr_tpc_clk_curr);
> +remove_mme_clk_curr:
> + device_remove_file(hdev->dev, &dev_attr_mme_clk_curr);
> +remove_ic_clk:
> + device_remove_file(hdev->dev, &dev_attr_ic_clk);
> +remove_tpc_clk:
> + device_remove_file(hdev->dev, &dev_attr_tpc_clk);
> +remove_mme_clk:
> + device_remove_file(hdev->dev, &dev_attr_mme_clk);
> + return rc;
> +}
> +
> +void goya_remove_device_attr(struct hl_device *hdev)
> +{
> + device_remove_file(hdev->dev, &dev_attr_ic_clk_curr);
> + device_remove_file(hdev->dev, &dev_attr_tpc_clk_curr);
> + device_remove_file(hdev->dev, &dev_attr_mme_clk_curr);
> + device_remove_file(hdev->dev, &dev_attr_ic_clk);
> + device_remove_file(hdev->dev, &dev_attr_tpc_clk);
> + device_remove_file(hdev->dev, &dev_attr_mme_clk);
> +}
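
The create/unwind ladder above grows by one label per attribute. The kernel's
attribute groups (struct attribute_group) create and remove a set of
attributes in one call; the same idea can be sketched in userspace with a
table and a rollback loop. Everything named fake_* below is hypothetical and
only stands in for device_create_file()/device_remove_file():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device attribute */
struct fake_attr {
	const char *name;
	int created;		/* 1 while the attribute "exists" */
	int fail_create;	/* test hook: force creation to fail */
};

static int fake_create(struct fake_attr *a)
{
	if (a->fail_create)
		return -1;
	a->created = 1;
	return 0;
}

static void fake_remove(struct fake_attr *a)
{
	a->created = 0;
}

/* Create every entry in the table; on failure remove those already created */
static int add_attrs(struct fake_attr *attrs, size_t n)
{
	size_t i;
	int rc;

	assert(attrs != NULL);

	for (i = 0; i < n; i++) {
		rc = fake_create(&attrs[i]);
		if (rc) {
			/* roll back in reverse order, skipping the failed entry */
			while (i-- > 0)
				fake_remove(&attrs[i]);
			return rc;
		}
	}

	return 0;
}
```

With a table, adding a seventh attribute is a one-line change and the remove
path can walk the same array.
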
> diff --git a/drivers/misc/habanalabs/habanalabs.h b/drivers/misc/habanalabs/habanalabs.h
> index 899bf98eb002..49b84b3ff864 100644
> --- a/drivers/misc/habanalabs/habanalabs.h
> +++ b/drivers/misc/habanalabs/habanalabs.h
> @@ -25,6 +25,8 @@
>
> #define HL_DEVICE_TIMEOUT_USEC 1000000 /* 1 s */
>
> +#define HL_PLL_LOW_JOB_FREQ_USEC 5000000 /* 5 s */
> +
> #define HL_MAX_QUEUES 128
>
> struct hl_device;
> @@ -60,6 +62,8 @@ struct hw_queue_properties {
> /**
> * struct asic_fixed_properties - ASIC specific immutable properties.
> * @hw_queues_props: H/W queues properties.
> + * @armcp_info: various information received from ArmCP regarding the H/W, e.g.
> + * available sensors.
> * @uboot_ver: F/W U-boot version.
> * @preboot_ver: F/W Preboot version.
> * @sram_base_address: SRAM physical start address.
> @@ -72,6 +76,7 @@ struct hw_queue_properties {
> * @dram_pci_bar_size: size of PCI bar towards DRAM.
> * @host_phys_base_address: base physical address of host memory for
> * transactions that the device generates.
> + * @max_power_default: max power of the device after reset
> * @va_space_host_start_address: base address of virtual memory range for
> * mapping host memory.
> * @va_space_host_end_address: end address of virtual memory range for
> @@ -84,6 +89,10 @@ struct hw_queue_properties {
> * @sram_size: total size of SRAM.
> * @max_asid: maximum number of open contexts (ASIDs).
> * @num_of_events: number of possible internal H/W IRQs.
> + * @psoc_pci_pll_nr: PCI PLL NR value.
> + * @psoc_pci_pll_nf: PCI PLL NF value.
> + * @psoc_pci_pll_od: PCI PLL OD value.
> + * @psoc_pci_pll_div_factor: PCI PLL DIV FACTOR 1 value.
> * @completion_queues_count: number of completion queues.
> * @high_pll: high PLL frequency used by the device.
> * @cb_pool_cb_cnt: number of CBs in the CB pool.
> @@ -92,6 +101,7 @@ struct hw_queue_properties {
> */
> struct asic_fixed_properties {
> struct hw_queue_properties hw_queues_props[HL_MAX_QUEUES];
> + struct armcp_info armcp_info;
> char uboot_ver[VERSION_MAX_LEN];
> char preboot_ver[VERSION_MAX_LEN];
> u64 sram_base_address;
> @@ -103,6 +113,7 @@ struct asic_fixed_properties {
> u64 dram_size;
> u64 dram_pci_bar_size;
> u64 host_phys_base_address;
> + u64 max_power_default;
> u64 va_space_host_start_address;
> u64 va_space_host_end_address;
> u64 va_space_dram_start_address;
> @@ -111,6 +122,10 @@ struct asic_fixed_properties {
> u32 sram_size;
> u32 max_asid;
> u32 num_of_events;
> + u32 psoc_pci_pll_nr;
> + u32 psoc_pci_pll_nf;
> + u32 psoc_pci_pll_od;
> + u32 psoc_pci_pll_div_factor;
> u32 high_pll;
> u32 cb_pool_cb_cnt;
> u32 cb_pool_cb_size;
> @@ -296,13 +311,37 @@ enum hl_asic_type {
> };
>
>
> +/**
> + * enum hl_pm_mng_profile - power management profile.
> + * @PM_AUTO: internal clock is set by KMD.
> + * @PM_MANUAL: internal clock is set by the user.
> + * @PM_LAST: last power management type.
> + */
> +enum hl_pm_mng_profile {
> + PM_AUTO = 1,
> + PM_MANUAL,
> + PM_LAST
> +};
>
> +/**
> + * enum hl_pll_frequency - PLL frequency.
> + * @PLL_HIGH: high frequency.
> + * @PLL_LOW: low frequency.
> + * @PLL_LAST: last frequency values that were configured by the user.
> + */
> +enum hl_pll_frequency {
> + PLL_HIGH = 1,
> + PLL_LOW,
> + PLL_LAST
> +};
>
> /**
> * struct hl_asic_funcs - ASIC specific functions that can be called from
> * common code.
> * @early_init: sets up early driver state (pre sw_init), doesn't configure H/W.
> * @early_fini: tears down what was done in early_init.
> + * @late_init: sets up late driver/hw state (post hw_init) - Optional.
> + * @late_fini: tears down what was done in late_init (pre hw_fini) - Optional.
> * @sw_init: sets up driver state, does not configure H/W.
> * @sw_fini: tears down driver state, does not configure H/W.
> * @hw_init: sets up the H/W state.
> @@ -326,15 +365,23 @@ enum hl_asic_type {
> * @cpu_accessible_dma_pool_alloc: allocate CPU PQ packet from DMA pool.
> * @cpu_accessible_dma_pool_free: free CPU PQ packet from DMA pool.
> * @update_eq_ci: update event queue CI.
> + * @add_device_attr: add ASIC specific device attributes.
> + * @remove_device_attr: remove ASIC specific device attributes.
> * @handle_eqe: handle event queue entry (IRQ) from ArmCP.
> + * @set_pll_profile: change PLL profile (manual/automatic).
> * @get_events_stat: retrieve event queue entries histogram.
> + * @enable_clock_gating: enable clock gating for reducing power consumption.
> + * @disable_clock_gating: disable clock gating for accessing registers on HBW.
> * @hw_queues_lock: acquire H/W queues lock.
> * @hw_queues_unlock: release H/W queues lock.
> + * @get_eeprom_data: retrieve EEPROM data from F/W.
> * @send_cpu_message: send buffer to ArmCP.
> */
> struct hl_asic_funcs {
> int (*early_init)(struct hl_device *hdev);
> int (*early_fini)(struct hl_device *hdev);
> + int (*late_init)(struct hl_device *hdev);
> + void (*late_fini)(struct hl_device *hdev);
> int (*sw_init)(struct hl_device *hdev);
> int (*sw_fini)(struct hl_device *hdev);
> int (*hw_init)(struct hl_device *hdev);
> @@ -363,11 +410,19 @@ struct hl_asic_funcs {
> void (*cpu_accessible_dma_pool_free)(struct hl_device *hdev,
> size_t size, void *vaddr);
> void (*update_eq_ci)(struct hl_device *hdev, u32 val);
> + int (*add_device_attr)(struct hl_device *hdev);
> + void (*remove_device_attr)(struct hl_device *hdev);
> void (*handle_eqe)(struct hl_device *hdev,
> struct hl_eq_entry *eq_entry);
> + void (*set_pll_profile)(struct hl_device *hdev,
> + enum hl_pll_frequency freq);
> void* (*get_events_stat)(struct hl_device *hdev, u32 *size);
> + void (*enable_clock_gating)(struct hl_device *hdev);
> + void (*disable_clock_gating)(struct hl_device *hdev);
> void (*hw_queues_lock)(struct hl_device *hdev);
> void (*hw_queues_unlock)(struct hl_device *hdev);
> + int (*get_eeprom_data)(struct hl_device *hdev, void *data,
> + size_t max_size);
> int (*send_cpu_message)(struct hl_device *hdev, u32 *msg,
> u16 len, u32 timeout, long *result);
> };
> @@ -496,6 +551,7 @@ void hl_wreg(struct hl_device *hdev, u32 reg, u32 val);
> * @rmmio: configuration area address on SRAM.
> * @cdev: related char device.
> * @dev: related kernel basic device structure.
> + * @work_freq: delayed work to lower device frequency if possible.
> * @asic_name: ASIC specific name.
> * @asic_type: ASIC specific type.
> * @completion_queue: array of hl_cq.
> @@ -517,13 +573,23 @@ void hl_wreg(struct hl_device *hdev, u32 reg, u32 val);
> * @asic_prop: ASIC specific immutable properties.
> * @asic_funcs: ASIC specific functions.
> * @asic_specific: ASIC specific information to use only from ASIC files.
> + * @hwmon_dev: H/W monitor device.
> + * @pm_mng_profile: current power management profile.
> + * @hl_chip_info: ASIC's sensors information.
> * @cb_pool: list of preallocated CBs.
> * @cb_pool_lock: protects the CB pool.
> * @user_ctx: current user context executing.
> + * @curr_pll_profile: current PLL profile.
> * @fd_open_cnt: number of open context executing.
> + * @max_power: the max power of the device, as configured by the sysadmin. This
> + * value is saved so in case of hard-reset, KMD will restore this
> + * value and update the F/W after the re-initialization
> * @major: habanalabs KMD major.
> + * @high_pll: high PLL profile frequency.
> * @id: device minor.
> * @disabled: is device disabled.
> + * @late_init_done: whether the late init stage was done during initialization.
> + * @hwmon_initialized: whether the H/W monitor sensors were initialized.
> */
> struct hl_device {
> struct pci_dev *pdev;
> @@ -531,6 +597,7 @@ struct hl_device {
> void __iomem *rmmio;
> struct cdev cdev;
> struct device *dev;
> + struct delayed_work work_freq;
> char asic_name[16];
> enum hl_asic_type asic_type;
> struct hl_cq *completion_queue;
> @@ -553,16 +620,25 @@ struct hl_device {
> struct asic_fixed_properties asic_prop;
> const struct hl_asic_funcs *asic_funcs;
> void *asic_specific;
> + struct device *hwmon_dev;
> + enum hl_pm_mng_profile pm_mng_profile;
> + struct hwmon_chip_info hl_chip_info;
>
> struct list_head cb_pool;
> spinlock_t cb_pool_lock;
>
> /* TODO: The following fields should be moved for multi-context */
> struct hl_ctx *user_ctx;
> +
> + atomic_t curr_pll_profile;
> atomic_t fd_open_cnt;
> + u64 max_power;
> u32 major;
> + u32 high_pll;
> u16 id;
> u8 disabled;
> + u8 late_init_done;
> + u8 hwmon_initialized;
>
> /* Parameters for bring-up */
> u8 cpu_enable;
> @@ -647,6 +723,15 @@ int hl_device_suspend(struct hl_device *hdev);
> int hl_device_resume(struct hl_device *hdev);
> void hl_hpriv_get(struct hl_fpriv *hpriv);
> void hl_hpriv_put(struct hl_fpriv *hpriv);
> +int hl_device_set_frequency(struct hl_device *hdev, enum hl_pll_frequency freq);
> +int hl_build_hwmon_channel_info(struct hl_device *hdev,
> + struct armcp_sensor *sensors_arr);
> +
> +int hl_sysfs_init(struct hl_device *hdev);
> +void hl_sysfs_fini(struct hl_device *hdev);
> +
> +int hl_hwmon_init(struct hl_device *hdev);
> +void hl_hwmon_fini(struct hl_device *hdev);
>
> int hl_cb_create(struct hl_device *hdev, struct hl_cb_mgr *mgr, u32 cb_size,
> u64 *handle, int ctx_id);
> @@ -663,6 +748,18 @@ int hl_cb_pool_fini(struct hl_device *hdev);
>
> void goya_set_asic_funcs(struct hl_device *hdev);
>
> +long hl_get_frequency(struct hl_device *hdev, u32 pll_index, bool curr);
> +void hl_set_frequency(struct hl_device *hdev, u32 pll_index, u64 freq);
> +long hl_get_temperature(struct hl_device *hdev, int sensor_index, u32 attr);
> +long hl_get_voltage(struct hl_device *hdev, int sensor_index, u32 attr);
> +long hl_get_current(struct hl_device *hdev, int sensor_index, u32 attr);
> +long hl_get_fan_speed(struct hl_device *hdev, int sensor_index, u32 attr);
> +long hl_get_pwm_info(struct hl_device *hdev, int sensor_index, u32 attr);
> +void hl_set_pwm_info(struct hl_device *hdev, int sensor_index, u32 attr,
> + long value);
> +u64 hl_get_max_power(struct hl_device *hdev);
> +void hl_set_max_power(struct hl_device *hdev, u64 value);
> +
> /* IOCTLs */
> long hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg);
> int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data);
> diff --git a/drivers/misc/habanalabs/habanalabs_drv.c b/drivers/misc/habanalabs/habanalabs_drv.c
> index b64f58ad0f5d..47a9ab458b43 100644
> --- a/drivers/misc/habanalabs/habanalabs_drv.c
> +++ b/drivers/misc/habanalabs/habanalabs_drv.c
> @@ -134,6 +134,13 @@ int hl_device_open(struct inode *inode, struct file *filp)
>
> hpriv->taskpid = find_get_pid(current->pid);
>
> + /*
> + * Device is IDLE at this point so it is legal to change PLLs. There
> + * is no need to check anything because if the PLL is already HIGH, the
> + * set function will return without doing anything
> + */
> + hl_device_set_frequency(hdev, PLL_HIGH);
> +
> return 0;
>
> out_err:
> diff --git a/drivers/misc/habanalabs/hwmon.c b/drivers/misc/habanalabs/hwmon.c
> new file mode 100644
> index 000000000000..6ca0decb7490
> --- /dev/null
> +++ b/drivers/misc/habanalabs/hwmon.c
> @@ -0,0 +1,449 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Copyright 2016-2018 HabanaLabs, Ltd.
> + * All Rights Reserved.
> + */
> +
> +#include "habanalabs.h"
> +
> +#define SENSORS_PKT_TIMEOUT 100000 /* 100ms */
> +#define HWMON_NR_SENSOR_TYPES (hwmon_pwm + 1)
> +
> +int hl_build_hwmon_channel_info(struct hl_device *hdev,
> + struct armcp_sensor *sensors_arr)
> +{
> + u32 counts[HWMON_NR_SENSOR_TYPES] = {0};
> + u32 *sensors_by_type[HWMON_NR_SENSOR_TYPES] = {0};
> + u32 sensors_by_type_next_index[HWMON_NR_SENSOR_TYPES] = {0};
> + struct hwmon_channel_info **channels_info;
> + u32 num_sensors_for_type, num_active_sensor_types = 0,
> + arr_size = 0, *curr_arr;
> + enum hwmon_sensor_types type;
> + int rc, i, j;
> +
> + for (i = 0 ; i < ARMCP_MAX_SENSORS ; i++) {
> + type = sensors_arr[i].type;
> +
> + if ((type == 0) && (sensors_arr[i].flags == 0))
> + break;
> +
> + if (type >= HWMON_NR_SENSOR_TYPES) {
> + dev_err(hdev->dev,
> + "Got wrong sensor type %d from device\n", type);
> + return -EINVAL;
> + }
> +
> + counts[type]++;
> + arr_size++;
> + }
> +
> + for (i = 0 ; i < HWMON_NR_SENSOR_TYPES ; i++) {
> + if (counts[i] == 0)
> + continue;
> +
> + num_sensors_for_type = counts[i] + 1;
> + curr_arr = kcalloc(num_sensors_for_type, sizeof(*curr_arr),
> + GFP_KERNEL);
> + if (!curr_arr) {
> + rc = -ENOMEM;
> + goto sensors_type_err;
> + }
> +
> + num_active_sensor_types++;
> + sensors_by_type[i] = curr_arr;
> + }
> +
> + for (i = 0 ; i < arr_size ; i++) {
> + type = sensors_arr[i].type;
> + curr_arr = sensors_by_type[type];
> + curr_arr[sensors_by_type_next_index[type]++] =
> + sensors_arr[i].flags;
> + }
> +
> + channels_info = kcalloc(num_active_sensor_types + 1,
> + sizeof(*channels_info), GFP_KERNEL);
> + if (!channels_info) {
> + rc = -ENOMEM;
> + goto channels_info_array_err;
> + }
> +
> + for (i = 0 ; i < num_active_sensor_types ; i++) {
> + channels_info[i] = kzalloc(sizeof(*channels_info[i]),
> + GFP_KERNEL);
> + if (!channels_info[i]) {
> + rc = -ENOMEM;
> + goto channel_info_err;
> + }
> + }
> +
> + for (i = 0, j = 0 ; i < HWMON_NR_SENSOR_TYPES ; i++) {
> + if (!sensors_by_type[i])
> + continue;
> +
> + channels_info[j]->type = i;
> + channels_info[j]->config = sensors_by_type[i];
> + j++;
> + }
> +
> + hdev->hl_chip_info.info =
> + (const struct hwmon_channel_info **)channels_info;
> +
> + return 0;
> +
> +channel_info_err:
> + for (i = 0 ; i < num_active_sensor_types ; i++)
> + if (channels_info[i]) {
> + kfree(channels_info[i]->config);
> + kfree(channels_info[i]);
> + }
> + kfree(channels_info);
> +channels_info_array_err:
> +sensors_type_err:
> + for (i = 0 ; i < HWMON_NR_SENSOR_TYPES ; i++)
> + kfree(sensors_by_type[i]);
> +
> + return rc;
> +}
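
The function above works in three passes: count the sensors per type,
allocate count + 1 entries per active type (the extra zero entry terminates
the config array), then fill each array with the sensor flags. A userspace
sketch of just that grouping step, assuming a hypothetical struct sensor and
NR_TYPES in place of struct armcp_sensor and HWMON_NR_SENSOR_TYPES:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_TYPES 4	/* hypothetical; stands in for HWMON_NR_SENSOR_TYPES */

struct sensor {		/* hypothetical stand-in for struct armcp_sensor */
	uint32_t type;
	uint32_t flags;
};

/*
 * Group sensor flags by type. On success out[t] is a zero-terminated array
 * for every type that has sensors and NULL otherwise; the caller frees each
 * out[t]. Returns 0, or -1 on an unknown type or a failed allocation.
 */
static int bucket_by_type(const struct sensor *s, size_t n,
			  uint32_t *out[NR_TYPES])
{
	size_t counts[NR_TYPES] = {0}, next[NR_TYPES] = {0};
	size_t i;

	assert(s != NULL && out != NULL);

	for (i = 0; i < NR_TYPES; i++)
		out[i] = NULL;

	/* Pass 1: count sensors per type, rejecting unknown types */
	for (i = 0; i < n; i++) {
		if (s[i].type >= NR_TYPES)
			return -1;
		counts[s[i].type]++;
	}

	/* Pass 2: allocate count + 1 zeroed entries per active type */
	for (i = 0; i < NR_TYPES; i++) {
		if (!counts[i])
			continue;
		out[i] = calloc(counts[i] + 1, sizeof(*out[i]));
		if (!out[i])
			goto err;
	}

	/* Pass 3: append each sensor's flags to the array of its type */
	for (i = 0; i < n; i++)
		out[s[i].type][next[s[i].type]++] = s[i].flags;

	return 0;

err:
	for (i = 0; i < NR_TYPES; i++) {
		free(out[i]);
		out[i] = NULL;
	}
	return -1;
}
```
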
> +
> +static int hl_read(struct device *dev, enum hwmon_sensor_types type,
> + u32 attr, int channel, long *val)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + switch (type) {
> + case hwmon_temp:
> + switch (attr) {
> + case hwmon_temp_input:
> + case hwmon_temp_max:
> + case hwmon_temp_crit:
> + case hwmon_temp_max_hyst:
> + case hwmon_temp_crit_hyst:
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + *val = hl_get_temperature(hdev, channel, attr);
> + break;
> + case hwmon_in:
> + switch (attr) {
> + case hwmon_in_input:
> + case hwmon_in_min:
> + case hwmon_in_max:
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + *val = hl_get_voltage(hdev, channel, attr);
> + break;
> + case hwmon_curr:
> + switch (attr) {
> + case hwmon_curr_input:
> + case hwmon_curr_min:
> + case hwmon_curr_max:
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + *val = hl_get_current(hdev, channel, attr);
> + break;
> + case hwmon_fan:
> + switch (attr) {
> + case hwmon_fan_input:
> + case hwmon_fan_min:
> + case hwmon_fan_max:
> + break;
> + default:
> + return -EINVAL;
> + }
> + *val = hl_get_fan_speed(hdev, channel, attr);
> + break;
> + case hwmon_pwm:
> + switch (attr) {
> + case hwmon_pwm_input:
> + case hwmon_pwm_enable:
> + break;
> + default:
> + return -EINVAL;
> + }
> + *val = hl_get_pwm_info(hdev, channel, attr);
> + break;
> + default:
> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> +static int hl_write(struct device *dev, enum hwmon_sensor_types type,
> + u32 attr, int channel, long val)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + switch (type) {
> + case hwmon_pwm:
> + switch (attr) {
> + case hwmon_pwm_input:
> + case hwmon_pwm_enable:
> + break;
> + default:
> + return -EINVAL;
> + }
> + hl_set_pwm_info(hdev, channel, attr, val);
> + break;
> + default:
> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> +static umode_t hl_is_visible(const void *data, enum hwmon_sensor_types type,
> + u32 attr, int channel)
> +{
> + switch (type) {
> + case hwmon_temp:
> + switch (attr) {
> + case hwmon_temp_input:
> + case hwmon_temp_max:
> + case hwmon_temp_max_hyst:
> + case hwmon_temp_crit:
> + case hwmon_temp_crit_hyst:
> + return 0444;
> + }
> + break;
> + case hwmon_in:
> + switch (attr) {
> + case hwmon_in_input:
> + case hwmon_in_min:
> + case hwmon_in_max:
> + return 0444;
> + }
> + break;
> + case hwmon_curr:
> + switch (attr) {
> + case hwmon_curr_input:
> + case hwmon_curr_min:
> + case hwmon_curr_max:
> + return 0444;
> + }
> + break;
> + case hwmon_fan:
> + switch (attr) {
> + case hwmon_fan_input:
> + case hwmon_fan_min:
> + case hwmon_fan_max:
> + return 0444;
> + }
> + break;
> + case hwmon_pwm:
> + switch (attr) {
> + case hwmon_pwm_input:
> + case hwmon_pwm_enable:
> + return 0644;
> + }
> + break;
> + default:
> + break;
> + }
> + return 0;
> +}
> +
> +static const struct hwmon_ops hl_hwmon_ops = {
> + .is_visible = hl_is_visible,
> + .read = hl_read,
> + .write = hl_write
> +};
> +
> +long hl_get_temperature(struct hl_device *hdev, int sensor_index, u32 attr)
> +{
> + struct armcp_packet pkt;
> + long result;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_TEMPERATURE_GET;
> + pkt.sensor_index = sensor_index;
> + pkt.type = attr;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SENSORS_PKT_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to get temperature from sensor %d, error %d\n",
> + sensor_index, rc);
> + result = 0;
> + }
> +
> + return result;
> +}
> +
> +long hl_get_voltage(struct hl_device *hdev, int sensor_index, u32 attr)
> +{
> + struct armcp_packet pkt;
> + long result;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_VOLTAGE_GET;
> + pkt.sensor_index = sensor_index;
> + pkt.type = attr;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SENSORS_PKT_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to get voltage from sensor %d, error %d\n",
> + sensor_index, rc);
> + result = 0;
> + }
> +
> + return result;
> +}
> +
> +long hl_get_current(struct hl_device *hdev, int sensor_index, u32 attr)
> +{
> + struct armcp_packet pkt;
> + long result;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_CURRENT_GET;
> + pkt.sensor_index = sensor_index;
> + pkt.type = attr;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SENSORS_PKT_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to get current from sensor %d, error %d\n",
> + sensor_index, rc);
> + result = 0;
> + }
> +
> + return result;
> +}
> +
> +long hl_get_fan_speed(struct hl_device *hdev, int sensor_index, u32 attr)
> +{
> + struct armcp_packet pkt;
> + long result;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_FAN_SPEED_GET;
> + pkt.sensor_index = sensor_index;
> + pkt.type = attr;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SENSORS_PKT_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to get fan speed from sensor %d, error %d\n",
> + sensor_index, rc);
> + result = 0;
> + }
> +
> + return result;
> +}
> +
> +long hl_get_pwm_info(struct hl_device *hdev, int sensor_index, u32 attr)
> +{
> + struct armcp_packet pkt;
> + long result;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_PWM_GET;
> + pkt.sensor_index = sensor_index;
> + pkt.type = attr;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SENSORS_PKT_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to get pwm info from sensor %d, error %d\n",
> + sensor_index, rc);
> + result = 0;
> + }
> +
> + return result;
> +}
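
The getters above set result to 0 when send_cpu_message() fails, so hl_read()
returns success and user space sees a reading of 0 instead of an error. One
possible shape, sketched in userspace, keeps the status in the return value
and the reading in an out-parameter. The send_fn type and both fake
transports are hypothetical and only make the sketch self-contained:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical transport: fills *result, returns 0 or a negative errno */
typedef int (*send_fn)(int sensor_index, long *result);

/* Read one sensor: status in the return value, reading in *val */
static int sensor_read(send_fn send, int sensor_index, long *val)
{
	long result = 0;
	int rc;

	assert(send != NULL && val != NULL);

	rc = send(sensor_index, &result);
	if (rc)
		return rc;	/* failure is visible; *val is left untouched */

	*val = result;
	return 0;
}

/* Fake transport that always succeeds */
static int send_ok(int sensor_index, long *result)
{
	*result = 42000 + sensor_index;
	return 0;
}

/* Fake transport that always times out */
static int send_timeout(int sensor_index, long *result)
{
	(void)sensor_index;
	(void)result;
	return -ETIMEDOUT;
}
```

A hwmon read callback built this way could return the error code directly,
so a timeout would not be reported as a valid value of 0.
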
> +
> +void hl_set_pwm_info(struct hl_device *hdev, int sensor_index, u32 attr,
> + long value)
> +{
> + struct armcp_packet pkt;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_PWM_SET;
> + pkt.sensor_index = sensor_index;
> + pkt.type = attr;
> + pkt.value = value;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SENSORS_PKT_TIMEOUT, NULL);
> +
> + if (rc)
> + dev_err(hdev->dev,
> + "Failed to set pwm info to sensor %d, error %d\n",
> + sensor_index, rc);
> +}
> +
> +int hl_hwmon_init(struct hl_device *hdev)
> +{
> + struct device *dev = hdev->pdev ? &hdev->pdev->dev : hdev->dev;
> + int rc;
> +
> + if ((hdev->hwmon_initialized) || !(hdev->fw_loading))
> + return 0;
> +
> + if (hdev->hl_chip_info.info) {
> + hdev->hl_chip_info.ops = &hl_hwmon_ops;
> +
> + hdev->hwmon_dev = hwmon_device_register_with_info(dev,
> + "habanalabs", hdev, &hdev->hl_chip_info, NULL);
> + if (IS_ERR(hdev->hwmon_dev)) {
> + rc = PTR_ERR(hdev->hwmon_dev);
> + dev_err(hdev->dev,
> + "Unable to register hwmon device: %d\n", rc);
> + return rc;
> + }
> +
> + dev_info(hdev->dev, "%s: add sensors information\n",
> + dev_name(hdev->hwmon_dev));
> +
> + hdev->hwmon_initialized = true;
> + } else {
> + dev_info(hdev->dev, "no available sensors\n");
> + }
> +
> + return 0;
> +}
> +
> +void hl_hwmon_fini(struct hl_device *hdev)
> +{
> + if (!hdev->hwmon_initialized)
> + return;
> +
> + hwmon_device_unregister(hdev->hwmon_dev);
> +}
> diff --git a/drivers/misc/habanalabs/sysfs.c b/drivers/misc/habanalabs/sysfs.c
> new file mode 100644
> index 000000000000..edd5f7159de0
> --- /dev/null
> +++ b/drivers/misc/habanalabs/sysfs.c
> @@ -0,0 +1,588 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Copyright 2016-2018 HabanaLabs, Ltd.
> + * All Rights Reserved.
> + */
> +
> +#include "habanalabs.h"
> +#include "include/habanalabs_device_if.h"
> +
> +#include <linux/hwmon-sysfs.h>
> +#include <linux/hwmon.h>
> +
> +#define SET_CLK_PKT_TIMEOUT 200000 /* 200ms */
> +#define SET_PWR_PKT_TIMEOUT 400000 /* 400ms */
> +
> +long hl_get_frequency(struct hl_device *hdev, u32 pll_index, bool curr)
> +{
> + struct armcp_packet pkt;
> + long result;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + if (curr)
> + pkt.opcode = ARMCP_PACKET_FREQUENCY_CURR_GET;
> + else
> + pkt.opcode = ARMCP_PACKET_FREQUENCY_GET;
> + pkt.pll_index = pll_index;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SET_CLK_PKT_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev,
> + "Failed to get frequency of PLL %d, error %d\n",
> + pll_index, rc);
> + result = rc;
> + }
> +
> + return result;
> +}
> +
> +void hl_set_frequency(struct hl_device *hdev, u32 pll_index, u64 freq)
> +{
> + struct armcp_packet pkt;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_FREQUENCY_SET;
> + pkt.pll_index = pll_index;
> + pkt.value = freq;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SET_CLK_PKT_TIMEOUT, NULL);
> +
> + if (rc)
> + dev_err(hdev->dev,
> + "Failed to set frequency to PLL %d, error %d\n",
> + pll_index, rc);
> +}
> +
> +u64 hl_get_max_power(struct hl_device *hdev)
> +{
> + struct armcp_packet pkt;
> + long result;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_MAX_POWER_GET;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SET_PWR_PKT_TIMEOUT, &result);
> +
> + if (rc) {
> + dev_err(hdev->dev, "Failed to get max power, error %d\n", rc);
> + result = rc;
> + }
> +
> + return result;
> +}
> +
> +void hl_set_max_power(struct hl_device *hdev, u64 value)
> +{
> + struct armcp_packet pkt;
> + int rc;
> +
> + memset(&pkt, 0, sizeof(pkt));
> +
> + pkt.opcode = ARMCP_PACKET_MAX_POWER_SET;
> + pkt.value = value;
> +
> + rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
> + SET_PWR_PKT_TIMEOUT, NULL);
> +
> + if (rc)
> + dev_err(hdev->dev, "Failed to set max power, error %d\n", rc);
> +}
> +
> +static ssize_t pm_mng_profile_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + return snprintf(buf, PAGE_SIZE, "%s\n",
> + (hdev->pm_mng_profile == PM_AUTO) ? "auto" :
> + (hdev->pm_mng_profile == PM_MANUAL) ? "manual" :
> + "unknown");
> +}
> +
> +static ssize_t pm_mng_profile_store(struct device *dev,
> + struct device_attribute *attr, const char *buf, size_t count)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + if (hdev->disabled) {
> + count = -ENODEV;
> + goto out;
> + }
> +
> + mutex_lock(&hdev->device_open);
> +
> + if (atomic_read(&hdev->fd_open_cnt) > 0) {
> + dev_err(hdev->dev,
> + "Can't change PM profile while user process is opened on the device\n");
> + count = -EPERM;
> + goto unlock_mutex;
> + }
> +
> + if (strncmp("auto", buf, strlen("auto")) == 0) {
> + /* Make sure we are in LOW PLL when changing modes */
> + if (hdev->pm_mng_profile == PM_MANUAL) {
> + atomic_set(&hdev->curr_pll_profile, PLL_HIGH);
> + hl_device_set_frequency(hdev, PLL_LOW);
> + hdev->pm_mng_profile = PM_AUTO;
> + }
> + } else if (strncmp("manual", buf, strlen("manual")) == 0) {
> + /* Make sure we are in LOW PLL when changing modes */
> + if (hdev->pm_mng_profile == PM_AUTO) {
> + flush_delayed_work(&hdev->work_freq);
> + hdev->pm_mng_profile = PM_MANUAL;
> + }
> + } else {
> + dev_err(hdev->dev, "value should be auto or manual\n");
> + count = -EINVAL;
> + goto unlock_mutex;
> + }
> +
> +unlock_mutex:
> + mutex_unlock(&hdev->device_open);
> +out:
> + return count;
> +}
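
strncmp("auto", buf, strlen("auto")) is a prefix match, so input such as
"automatic" is accepted as "auto". The kernel has sysfs_streq() for comparing
sysfs input while ignoring a trailing newline. The userspace sketch below
shows the intended exact-match behaviour; keyword_eq() is a hypothetical
helper written for this example, not a kernel function:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if input equals keyword exactly, ignoring one
 * trailing newline in input (as produced by "echo auto > file").
 */
static bool keyword_eq(const char *input, const char *keyword)
{
	size_t len;

	assert(input != NULL && keyword != NULL);

	len = strlen(input);
	if (len > 0 && input[len - 1] == '\n')
		len--;

	/* lengths must match, otherwise a longer input would pass as a prefix */
	return len == strlen(keyword) && strncmp(input, keyword, len) == 0;
}
```
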
> +
> +static ssize_t high_pll_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + return snprintf(buf, PAGE_SIZE, "%u\n", hdev->high_pll);
> +}
> +
> +static ssize_t high_pll_store(struct device *dev, struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long value;
> + int rc;
> +
> + if (hdev->disabled) {
> + count = -ENODEV;
> + goto out;
> + }
> +
> + rc = kstrtoul(buf, 0, &value);
> +
> + if (rc) {
> + count = -EINVAL;
> + goto out;
> + }
> +
> + hdev->high_pll = value;
> +
> +out:
> + return count;
> +}
> +
> +static ssize_t uboot_ver_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "%s\n", hdev->asic_prop.uboot_ver);
> +}
> +
> +static ssize_t armcp_kernel_ver_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "%s",
> + hdev->asic_prop.armcp_info.kernel_version);
> +}
> +
> +static ssize_t armcp_ver_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "%s\n",
> + hdev->asic_prop.armcp_info.armcp_version);
> +}
> +
> +static ssize_t cpld_ver_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "0x%08x\n",
> + hdev->asic_prop.armcp_info.cpld_version);
> +}
> +
> +static ssize_t infineon_ver_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "0x%04x\n",
> + hdev->asic_prop.armcp_info.infineon_version);
> +}
> +
> +static ssize_t fuse_ver_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "%s\n",
> + hdev->asic_prop.armcp_info.fuse_version);
> +}
> +
> +static ssize_t thermal_ver_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "%s",
> + hdev->asic_prop.armcp_info.thermal_version);
> +}
> +
> +static ssize_t preboot_btl_ver_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "%s\n", hdev->asic_prop.preboot_ver);
> +}
> +
> +static ssize_t device_type_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + char *str;
> +
> + switch (hdev->asic_type) {
> + case ASIC_GOYA:
> + str = "GOYA";
> + break;
> + default:
> + dev_err(hdev->dev, "Unrecognized ASIC type %d\n",
> + hdev->asic_type);
> + return -EINVAL;
> + }
> +
> + return snprintf(buf, PAGE_SIZE, "%s\n", str);
> +}
> +
> +static ssize_t pci_addr_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + /* Use dummy, fixed address for simulator */
> + if (!hdev->pdev)
> + return snprintf(buf, PAGE_SIZE, "0000:%02d:00.0\n", hdev->id);
> +
> + return snprintf(buf, PAGE_SIZE, "%04x:%02x:%02x.%x\n",
> + pci_domain_nr(hdev->pdev->bus),
> + hdev->pdev->bus->number,
> + PCI_SLOT(hdev->pdev->devfn),
> + PCI_FUNC(hdev->pdev->devfn));
> +}
> +
> +static ssize_t status_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + char *str;
> +
> + if (hdev->disabled)
> + str = "Malfunction";
> + else
> + str = "Operational";
> +
> + return snprintf(buf, PAGE_SIZE, "%s\n", str);
> +}
> +
> +static ssize_t write_open_cnt_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> +
> + return snprintf(buf, PAGE_SIZE, "%d\n", hdev->user_ctx ? 1 : 0);
> +}
> +
> +static ssize_t max_power_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + long val;
> +
> + if (hdev->disabled)
> + return -ENODEV;
> +
> + val = hl_get_max_power(hdev);
> +
> + return snprintf(buf, PAGE_SIZE, "%lu\n", val);
> +}
> +
> +static ssize_t max_power_store(struct device *dev,
> + struct device_attribute *attr, const char *buf, size_t count)
> +{
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + unsigned long value;
> + int rc;
> +
> + if (hdev->disabled) {
> + count = -ENODEV;
> + goto out;
> + }
> +
> + rc = kstrtoul(buf, 0, &value);
> +
> + if (rc) {
> + count = -EINVAL;
> + goto out;
> + }
> +
> + hdev->max_power = value;
> + hl_set_max_power(hdev, value);
> +
> +out:
> + return count;
> +}
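
The store path assigns a negative errno to the size_t 'count' and maps every
kstrtoul() failure to -EINVAL. Below is a minimal userspace sketch of the same
flow that keeps the error in a signed value and passes the parser's code
through. It only approximates kstrtoul(); parse_ulong() and
max_power_store_model() are hypothetical names used for illustration, not
driver functions.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical userspace approximation of kstrtoul(buf, 0, &value). */
static int parse_ulong(const char *buf, unsigned long *value)
{
	char *end;
	unsigned long v;

	/* Reject empty input, signs and leading whitespace up front. */
	if (*buf < '0' || *buf > '9')
		return -EINVAL;

	errno = 0;
	v = strtoul(buf, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;

	/* sysfs writes usually carry one trailing newline. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*value = v;
	return 0;
}

/* Store-style handler: returns count on success, negative errno otherwise. */
static ssize_t max_power_store_model(const char *buf, size_t count,
				     unsigned long *max_power)
{
	unsigned long value;
	int rc;

	rc = parse_ulong(buf, &value);
	if (rc)
		return rc;	/* *max_power is left untouched on error */

	*max_power = value;
	return (ssize_t)count;
}
```

The stored value is updated only after parsing succeeds, so a malformed write
leaves the previous limit in place.
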
> +
> +static ssize_t eeprom_read_handler(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *attr, char *buf, loff_t offset,
> + size_t max_size)
> +{
> + struct device *dev = container_of(kobj, struct device, kobj);
> + struct hl_device *hdev = dev_get_drvdata(dev);
> + char *data;
> + int rc;
> +
> + if (!max_size)
> + return -EINVAL;
> +
> + data = kzalloc(max_size, GFP_KERNEL);
> + if (!data)
> + return -ENOMEM;
> +
> + rc = hdev->asic_funcs->get_eeprom_data(hdev, data, max_size);
> + if (rc)
> + goto out;
> +
> + memcpy(buf, data, max_size);
> +
> +out:
> + kfree(data);
> +
> + return max_size;
> +}
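
As quoted, eeprom_read_handler() returns max_size even when
get_eeprom_data() fails, so user space would see a successful read of a
buffer that was never filled. A minimal sketch of propagating the error
follows, modelled in plain userspace C. fake_get_eeprom_data() and
eeprom_read() are hypothetical stand-ins for the ASIC callback and the
handler, and calloc()/free() stand in for kzalloc()/kfree().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for asic_funcs->get_eeprom_data(); not driver code. */
static int fake_get_eeprom_data(int fail, char *data, size_t size)
{
	if (fail)
		return -EIO;

	memset(data, 0xA5, size);
	return 0;
}

/* Returns the number of bytes copied, or a negative errno on failure. */
static ssize_t eeprom_read(int fail, char *buf, size_t max_size)
{
	char *data;
	ssize_t rc;

	if (!max_size)
		return -EINVAL;

	data = calloc(1, max_size);
	if (!data)
		return -ENOMEM;

	rc = fake_get_eeprom_data(fail, data, max_size);
	if (rc)
		goto out;	/* report the failure instead of max_size */

	memcpy(buf, data, max_size);
	rc = (ssize_t)max_size;

out:
	free(data);
	return rc;
}
```

The temporary buffer is still freed on both paths; only the return value
differs between success and failure.
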
> +
> +static DEVICE_ATTR_RW(pm_mng_profile);
> +static DEVICE_ATTR_RW(high_pll);
> +static DEVICE_ATTR_RO(uboot_ver);
> +static DEVICE_ATTR_RO(armcp_kernel_ver);
> +static DEVICE_ATTR_RO(armcp_ver);
> +static DEVICE_ATTR_RO(cpld_ver);
> +static DEVICE_ATTR_RO(infineon_ver);
> +static DEVICE_ATTR_RO(fuse_ver);
> +static DEVICE_ATTR_RO(thermal_ver);
> +static DEVICE_ATTR_RO(preboot_btl_ver);
> +static DEVICE_ATTR_RO(device_type);
> +static DEVICE_ATTR_RO(pci_addr);
> +static DEVICE_ATTR_RO(status);
> +static DEVICE_ATTR_RO(write_open_cnt);
> +static DEVICE_ATTR_RW(max_power);
> +
> +static const struct bin_attribute bin_attr_eeprom = {
> + .attr = {.name = "eeprom", .mode = (0444)},
> + .size = PAGE_SIZE,
> + .read = eeprom_read_handler
> +};
> +
> +int hl_sysfs_init(struct hl_device *hdev)
> +{
> + int rc;
> +
> + rc = hdev->asic_funcs->add_device_attr(hdev);
> + if (rc) {
> + dev_err(hdev->dev, "failed to add device attributes\n");
> + return rc;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_pm_mng_profile);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file pm_mng_profile\n");
> + goto remove_device_attr;
> + }
> +
> + hdev->pm_mng_profile = PM_AUTO;
> +
> + rc = device_create_file(hdev->dev, &dev_attr_high_pll);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file high_pll\n");
> + goto remove_pm_mng_profile;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_uboot_ver);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file uboot_ver\n");
> + goto remove_pll_profile;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_armcp_kernel_ver);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file armcp_kernel_ver\n");
> + goto remove_uboot_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_armcp_ver);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file armcp_ver\n");
> + goto remove_armcp_kernel_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_cpld_ver);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file cpld_ver\n");
> + goto remove_armcp_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_infineon_ver);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file infineon_ver\n");
> + goto remove_cpld_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_fuse_ver);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file fuse_ver\n");
> + goto remove_infineon_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_thermal_ver);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file thermal_ver\n");
> + goto remove_fuse_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_preboot_btl_ver);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file preboot_btl_ver\n");
> + goto remove_thermal_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_device_type);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file device_type\n");
> + goto remove_preboot_ver;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_pci_addr);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file pci_addr\n");
> + goto remove_device_type;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_status);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create device file status\n");
> + goto remove_pci_addr;
> + }
> +
> + rc = device_create_file(hdev->dev, &dev_attr_write_open_cnt);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file write_open_cnt\n");
> + goto remove_status;
> + }
> +
> + hdev->max_power = hdev->asic_prop.max_power_default;
> +
> + rc = device_create_file(hdev->dev, &dev_attr_max_power);
> + if (rc) {
> + dev_err(hdev->dev,
> + "failed to create device file max_power\n");
> + goto remove_write_open_cnt;
> + }
> +
> + rc = sysfs_create_bin_file(&hdev->dev->kobj, &bin_attr_eeprom);
> + if (rc) {
> + dev_err(hdev->dev, "failed to create EEPROM sysfs entry\n");
> + goto remove_attr_max_power;
> + }
> +
> + return 0;
> +
> +remove_attr_max_power:
> + device_remove_file(hdev->dev, &dev_attr_max_power);
> +remove_write_open_cnt:
> + device_remove_file(hdev->dev, &dev_attr_write_open_cnt);
> +remove_status:
> + device_remove_file(hdev->dev, &dev_attr_status);
> +remove_pci_addr:
> + device_remove_file(hdev->dev, &dev_attr_pci_addr);
> +remove_device_type:
> + device_remove_file(hdev->dev, &dev_attr_device_type);
> +remove_preboot_ver:
> + device_remove_file(hdev->dev, &dev_attr_preboot_btl_ver);
> +remove_thermal_ver:
> + device_remove_file(hdev->dev, &dev_attr_thermal_ver);
> +remove_fuse_ver:
> + device_remove_file(hdev->dev, &dev_attr_fuse_ver);
> +remove_infineon_ver:
> + device_remove_file(hdev->dev, &dev_attr_infineon_ver);
> +remove_cpld_ver:
> + device_remove_file(hdev->dev, &dev_attr_cpld_ver);
> +remove_armcp_ver:
> + device_remove_file(hdev->dev, &dev_attr_armcp_ver);
> +remove_armcp_kernel_ver:
> + device_remove_file(hdev->dev, &dev_attr_armcp_kernel_ver);
> +remove_uboot_ver:
> + device_remove_file(hdev->dev, &dev_attr_uboot_ver);
> +remove_pll_profile:
> + device_remove_file(hdev->dev, &dev_attr_high_pll);
> +remove_pm_mng_profile:
> + device_remove_file(hdev->dev, &dev_attr_pm_mng_profile);
> +remove_device_attr:
> + hdev->asic_funcs->remove_device_attr(hdev);
> +
> + return rc;
> +}
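
The create/unwind ladder above repeats the same step fifteen times, with one
label per attribute. In the kernel, a set of attributes can be registered as
a single unit with an attribute group (struct attribute_group together with
sysfs_create_group(), or the ATTRIBUTE_GROUPS() helper), which removes the
need for a hand-written unwind path. The underlying idea, a table walked
forward on creation and backward on failure, can be sketched in userspace C
as below. fake_create(), fake_remove(), create_all() and count_created() are
hypothetical names, and the table lists only a few of the attributes.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* A few attribute names from the DEVICE_ATTR list above, for illustration. */
static const char * const attr_names[] = {
	"pm_mng_profile", "high_pll", "uboot_ver", "status", "max_power",
};

/* created[i] is 1 while entry i exists. */
static int created[ARRAY_SIZE(attr_names)];

/*
 * Hypothetical stand-ins for device_create_file()/device_remove_file().
 * fail_at injects a failure at that index; pass -1 for no failure.
 */
static int fake_create(size_t i, int fail_at)
{
	if ((int)i == fail_at)
		return -1;

	created[i] = 1;
	return 0;
}

static void fake_remove(size_t i)
{
	created[i] = 0;
}

/* Create every entry in order; on failure remove those already created. */
static int create_all(int fail_at)
{
	size_t i;
	int rc;

	for (i = 0; i < ARRAY_SIZE(attr_names); i++) {
		rc = fake_create(i, fail_at);
		if (rc)
			goto unwind;
	}

	return 0;

unwind:
	while (i--)
		fake_remove(i);

	return rc;
}

/* Number of entries that currently exist. */
static int count_created(void)
{
	size_t i;
	int n = 0;

	for (i = 0; i < ARRAY_SIZE(attr_names); i++)
		n += created[i];

	return n;
}
```

Adding an attribute then means adding one table entry, and the teardown
order cannot drift out of sync with the creation order.
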
> +
> +void hl_sysfs_fini(struct hl_device *hdev)
> +{
> + sysfs_remove_bin_file(&hdev->dev->kobj, &bin_attr_eeprom);
> + device_remove_file(hdev->dev, &dev_attr_max_power);
> + device_remove_file(hdev->dev, &dev_attr_write_open_cnt);
> + device_remove_file(hdev->dev, &dev_attr_status);
> + device_remove_file(hdev->dev, &dev_attr_pci_addr);
> + device_remove_file(hdev->dev, &dev_attr_device_type);
> + device_remove_file(hdev->dev, &dev_attr_preboot_btl_ver);
> + device_remove_file(hdev->dev, &dev_attr_thermal_ver);
> + device_remove_file(hdev->dev, &dev_attr_fuse_ver);
> + device_remove_file(hdev->dev, &dev_attr_infineon_ver);
> + device_remove_file(hdev->dev, &dev_attr_cpld_ver);
> + device_remove_file(hdev->dev, &dev_attr_armcp_ver);
> + device_remove_file(hdev->dev, &dev_attr_armcp_kernel_ver);
> + device_remove_file(hdev->dev, &dev_attr_uboot_ver);
> + device_remove_file(hdev->dev, &dev_attr_high_pll);
> + device_remove_file(hdev->dev, &dev_attr_pm_mng_profile);
> + hdev->asic_funcs->remove_device_attr(hdev);
> +}
> --
> 2.17.1
>

--
Sincerely yours,
Mike.