Re: [PATCH v5 12/15] misc: bcm-vk: add sysfs interface

From: Florian Fainelli
Date: Wed Sep 30 2020 - 22:30:51 EST

On 9/30/2020 6:28 PM, Scott Branden wrote:
Add sysfs support to query the VK card status and monitor sense points.
The vk-card-status and vk-card-mon details are provided in the README
file in the bcm-vk driver directory.

Co-developed-by: Desmond Yan <desmond.yan@xxxxxxxxxxxx>
Signed-off-by: Desmond Yan <desmond.yan@xxxxxxxxxxxx>
Signed-off-by: Scott Branden <scott.branden@xxxxxxxxxxxx>
---

[snip]

+vk-card-status/
+ bus ---> device PCIe bus
+ card_state ---> summary of current card states
+ chip_id
+ firmware_status ---> summary of all firmware status
+ firmware_version ---> summary of all firmware versions
+ freq_core_mhz ---> running frequency in MHz
+ freq_mem_mhz ---> memory frequency in MHz
+ mem_size_mb ---> memory size in MByte
+ os_state ---> current running state
+ reset_reason ---> last reset reason
+ rev_boot1 ---> boot1 firmware revision
+ rev_boot2 ---> boot2 firmware revision
+ rev_driver ---> host driver revision
+ rev_flash_rom ---> Flash ROM revision
+ sotp_boot1_rev_id ---> minimum boot1 revision required
+ sotp_boot2_rev_id ---> minimum boot2 revision required
+ sotp_dauth_1 ---> authentication key hash
+ sotp_dauth_1_valid ---> authentication key validity
+ sotp_dauth_1_active_status -> authentication key active or not
+ sotp_dauth_2
+ sotp_dauth_2_valid
+ sotp_dauth_2_active_status
+ sotp_dauth_3
+ sotp_dauth_3_valid
+ sotp_dauth_3_active_status
+ sotp_dauth_4
+ sotp_dauth_4_valid
+ sotp_dauth_4_active_status
+ temp_threshold_lower_c ---> thermal low threshold in Celsius
+ temp_threshold_upper_c ---> thermal high threshold in Celsius
+ uptime_s ---> os up time in seconds
+
+vk-card-mon/
+ alert_afbc_busy ---> AFBC block stuck
+ alert_ecc ---> uncorrectable ECC error(s) occurred
+ alert_ecc_warn ---> correctable ECC error(s) occurred
+ alert_heartbeat_fail ---> host detects heartbeat discontinuation
+ from card
+ alert_high_temp ---> high temperature threshold crossed
+ alert_intf_ver_fail ---> interface not compatible based on version
+ alert_low_temp_warn ---> low temperature threshold crossed
+ alert_malloc_fail_warn ---> mem allocation failure(s) occurred
+ alert_pcie_down ---> host detects PCIe interface going down
+ alert_ssim_busy ---> ssim block busy
+ alert_sys_fault ---> system fault
+ alert_wdog_timeout ---> watchdog timeout

Almost all of these should be supported using the HWMON framework instead of custom attributes that do not follow the HWMON naming conventions.
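
To make the naming point concrete, here is a rough sketch of how a few of the attributes quoted above could line up with standard HWMON names. The pairing and the use of channel 1 are assumptions made for illustration, not something the patch or the driver defines; only the hwmon-side names (tempN_min, tempN_max, tempN_min_alarm, tempN_max_alarm) and the millidegree Celsius unit come from the kernel's hwmon sysfs documentation. It also assumes the alert attributes read back as 0 or 1.

```python
from typing import Tuple

# Hypothetical pairing of bcm-vk attribute names with hwmon-style names.
# The left-hand names are from the quoted README; the choice of channel 1
# and the pairing itself are assumptions for illustration only.
HWMON_NAME = {
    "temp_threshold_lower_c": "temp1_min",
    "temp_threshold_upper_c": "temp1_max",
    "alert_low_temp_warn": "temp1_min_alarm",
    "alert_high_temp": "temp1_max_alarm",
}


def celsius_to_hwmon(value_c: int) -> int:
    """hwmon temperature values are expressed in millidegrees Celsius."""
    return value_c * 1000


def translate(name: str, value: int) -> Tuple[str, int]:
    """Return the hwmon-style name and value for one custom attribute."""
    hwmon_name = HWMON_NAME[name]
    # Only the Celsius-valued thresholds need rescaling; alarms pass through.
    if name.endswith("_c"):
        value = celsius_to_hwmon(value)
    return hwmon_name, value
```

With this mapping, translate("temp_threshold_upper_c", 95) gives ("temp1_max", 95000), which is the form standard monitoring tools expect to find.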

+ firmware_status_reg
+ mem_ecc ---> correctable ECC error count
+ mem_uecc ---> uncorrectable ECC error count

Implementing an EDAC driver would probably be overkill, unless the card has a way to signal ECC errors to the host. Does it?

+ boot_status_reg ---> boot status of card
+ pwr_state ---> power state, 1-full, 2-reduced, 3-lowest
+ temperature_sensor_1_c ---> CPU die temperature in Celsius
+ temperature_sensor_2_c ---> DDR0 temperature in Celsius
+ temperature_sensor_3_c ---> DDR1 temperature in Celsius

Likewise, the temperature sensors look like HWMON material.

+ utilization ---> runtime video transcoding consumption summary
+ utilization_pix ---> percentage of pixel processing used
+ utilization_pix_used ---> pixel processing used
+ utilization_pix_max ---> max pixel processing value which maps 100% load
+ utilization_codec ---> percentage of codec sessions used
+ utilization_codec_used ---> codec sessions currently used
+ utilization_codec_max ---> max codec sessions allowed
+ voltage_18_mv ---> 1.8V voltage rail in mV
+ voltage_33_mv ---> 3.3V voltage rail in mV

Likewise for the voltage rails.
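
As an illustration of what the temperature and voltage readings quoted above might look like under HWMON naming, here is a small sketch. The channel numbers and label strings are assumptions; the tempN_input/tempN_label and inN_input/inN_label names, and the units (millidegrees Celsius and millivolts), follow the kernel's hwmon sysfs documentation.

```python
from typing import Dict, Union

# Hypothetical channel assignment: (hwmon channel, label, scale factor).
# Temperatures are reported in Celsius by the card, so they are scaled to
# millidegrees; the voltage rails are already in millivolts.
SENSORS = {
    "temperature_sensor_1_c": ("temp1", "CPU die", 1000),
    "temperature_sensor_2_c": ("temp2", "DDR0", 1000),
    "temperature_sensor_3_c": ("temp3", "DDR1", 1000),
    "voltage_18_mv": ("in0", "1.8V rail", 1),
    "voltage_33_mv": ("in1", "3.3V rail", 1),
}


def to_hwmon(name: str, value: int) -> Dict[str, Union[int, str]]:
    """Return the hwmon-style attribute pair for one custom reading."""
    channel, label, scale = SENSORS[name]
    return {
        f"{channel}_input": value * scale,
        f"{channel}_label": label,
    }
```

The labels carry the information that the README currently keeps in its descriptions, so the attribute names themselves can stay generic.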

+
+The sysfs entry supports only the read operation.

s/entry supports/entries support/
--
Florian