Releases: Mellanox/hw-mgmt
V.7.0040.1003
update /asic1 in addition to asic and asic2
V.7.0040.1002
Modify kconfig from downstream to upstream
V.7.0040.1000
================================================================================
- V.7.0040.1000
- Sun, 30 June 2024
-
New features
o Add support for QM3400 Blackmamba - ES level quality
o Add support for SN4280 SmartSwitch Bobcat - ES level quality
o Add support for N5110_LD Juliet Scaleout PO + TTM - ES level quality
o Add support in VPD parser for System VPD vendor specific SSD SED PSID block
-
Bug fixes
#3649551 SN4700 : [Independent Module] | on r-leopard-41 with IM enabled, there was a thermal overload.
#3878328 SN4700 : Switch rebooted with "Thermal Overload" because ASIC thermal is not available
#3885405 TC: [Thermal Algorithm] | Blacklist malfunctions
#3883147 TC: [Thermal Algorithm] | Counts errors even when paused by blacklist
#3879220 SN3420 : Thermal control: increase PWM minimum speed (20%->25%) to work around fan state issue reported by smond
#3895891 SPC1: [systemctl is-system-running] | SPC1 stuck in starting state after config reload - System was not started – lmsensor dependency issue
#3900159 QM3400: [Kernel 6.1] thermal/module#temp_crit: Input/output error
#3900138 QM3400: [Kernel 6.1] Can't get value of subfeature temp input for front panel
#3882472 QM3000 | QM3400: Mismatched system names in TC config (qm3400 instead of q3200)
#3948113 Switch freezes after generating hw-mgmt dump a few times in a row
#3852236 ARM: Kernel oops symptoms after boot: Unable to handle kernel paging address xxx when BSP Drivers are used
NA msn5400 | msn5600 | sn4280 :TC: fix asic sensor mask in sensor_parameters
NA vpd parser: Sanity check is done only for 'MLNX' fru types
NA Multi ASIC system: kernel config CONFIG_HOTPLUG_PCI_PCIE (kernel 6.1) is required to be disabled for the sw_reset on multi asic systems
NA TC: missing support of correct PWM calculation for systems with amb{X} sensor count != 2
NA MSN4700 | MQM9520: Some PSU1 labels are incorrectly marked as PSU2.
NA QM3000 | QM3400: voltmon1 and voltmon4 symlinks pointing to curr2 sensors instead of curr3 sensors.
NA QM3000: ASIC PCIE mapping was wrong
NA Deployment tool : Missing support for Kconfig per Kernel major version
NA vpd-parser: In case the ONIE "Base MAC Address" field ends with a zero byte, vpd-parser cuts the last byte in output.
o For detailed patch list, please view: https://github.com/Mellanox/hw-mgmt/blob/V.7.0040.1000_BR/recipes-kernel/linux/Patch_Status_Table.txt
-
Known issues and limitations:
o Systems like sn2700 which contain delta 460 PSU may have "Error getting sensor data: dps460/#25: Can't read"
which is a temporary inaccessibility of certain alarm attributes read from the PSU.
o Systems may show a message of "WARNING kernel: … supply vcc not found, using dummy regulator"
o Systems SN2010, SN2100, SN2410, SN2700 and SN2740 (and their "-B" variants) require the following flags in kernel cmdline:
"acpi_enforce_resources=lax acpi=noirq".
================================================================================
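The kernel cmdline requirement listed under the known issues above can be checked on a running system. The following is a minimal sketch, not part of hw-mgmt: the function name `missing_cmdline_flags` and the constant `REQUIRED_FLAGS` are hypothetical, and the only assumption about the platform is that `/proc/cmdline` is readable, as on standard Linux.

```python
from typing import List, Sequence

# Flags quoted in the known issues for SN2010, SN2100, SN2410, SN2700 and SN2740.
REQUIRED_FLAGS = ("acpi_enforce_resources=lax", "acpi=noirq")


def missing_cmdline_flags(cmdline: str,
                          required: Sequence[str] = REQUIRED_FLAGS) -> List[str]:
    """Return the required flags that do not appear in a kernel command line."""
    present = set(cmdline.split())
    return [flag for flag in required if flag not in present]


def read_proc_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line; return '' if it is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""
```

Calling `missing_cmdline_flags(read_proc_cmdline())` returns an empty list when both flags are present. The check compares whole whitespace-separated tokens, so a differently valued option such as `acpi=off` is not mistaken for `acpi=noirq`.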
V.7.0040.0033
Issue Title
#3649551 SN4700 : [Independent Module] | on r-leopard-41 with IM enabled, there was a thermal overload.
#3878328 SN4700 : Switch rebooted with "Thermal Overload" because ASIC thermal is not available
#3885405 TC: [Thermal Algorithm] | Blacklist malfunctions
#3883147 TC: [Thermal Algorithm] | Counts errors even when paused by blacklist
#3879220 SN3420 : Thermal control: increase PWM minimum speed (20%->25%) to work around fan state issue reported by smond
#3895891 SPC1: [systemctl is-system-running] | SPC1 stuck in starting state after config reload - System was not started – lmsensor dependency issue
#3900159 QM3400: [Kernel 6.1] thermal/module#temp_crit: Input/output error
#3900138 QM3400: [Kernel 6.1] Can't get value of subfeature temp input for front panel
#3882472 QM3000 | QM3400: Mismatched system names in TC config (qm3400 instead of q3200)
#3948113 Switch freezes after generating hw-mgmt dump a few times in a row
NA msn5400 | msn5600 | sn4280 :TC: fix asic sensor mask in sensor_parameters
NA Fix vpd parser sanity check is done only for 'MLNX' fru types
NA Multi ASIC system: kernel config CONFIG_HOTPLUG_PCI_PCIE (kernel 6.1) is required to be disabled for the sw_reset on multi asic systems
NA TC: missing support of correct PWM calculation for systems with amb{X} sensor count != 2
NA MSN4700 | MQM9520: Some PSU1 labels are incorrectly marked as PSU2.
NA QM3000 | QM3400: voltmon1 and voltmon4 symlinks pointing to curr2 sensors instead of curr3 sensors.
NA QM3000: ASIC PCIE mapping was wrong
NA Missing support for Kconfig per Kernel major version
V.7.0040.0031
Update changelog V.7.0040.0031
V.7.0030.2300
================================================================================
- V.7.0030.2300
- Tue, 28 May 2023
-
Bug fixes
Issue Title
#3706151 [MSN4600-VS2RC] : Fans running at high speed when PN of switch is SSG7B27990
#3649551 [SPC2|SPC3|QM1|QM2|QM3] TC - dynamic minimum table RPM values for "sensor_read_error" were too low, causing the system to heat up
#3651819 [SN2410] in systems with customer-adjusted PN, the fan direction is not recognized; it appears wrong and causes faulty TC behaviour
#3723906 systems occasionally report errors for FAN2 & FAN3 although only PSU FAN1 exists
#3733632 [MSN2410|MSN2100|MSN2010] sensors.conf "Chassis Fan Drawer x fan y" labels were mistakenly defined as "fan z"
#3726901 [MSN27002] voltmon6 mistakenly appearing in sensor_list at tc_config.json file causing ERROR of reading file voltmon6_temp
#3748535 [SP1|SPC2] chipup timeout is too short on legacy systems, sometimes causing failure
NA When I2C device’s 1st probe fails there is no retry performed.
NA Hw mgmt started handling udev events before basic hw mgmt initialization is done
NA udev events were handled randomly by hw mgmt since the udev settle command was missing in hw mgmt init
NA [MSN5400|MSN5600|QM3400] TC – systems with a single Fan direction always considered in TC to be P2C
NA [MSN5600|MSN5400] sensors: mistakes in labels of sensor conf.
NA [MSN5600|MSN5400] sensors: missing sensors config rule for 2nd PSU
NA [MSN3420] symlink mistakenly pointing to MSN3700 sensors config file
o For detailed patch list, please view: https://github.com/Mellanox/hw-mgmt/blob/V.7.0030.2300_BR/recipes-kernel/linux/Patch_Status_Table.txt
-
Known issues and limitations:
o Systems like sn2700 which contain delta 460 PSU may have "Error getting sensor data: dps460/#25: Can't read"
which is a temporary inaccessibility of certain alarm attributes read from the PSU
o Patch 0181-Revert-Fix-out-of-bounds-memory-accesses-in-thermal.patch should be applied
for kernel >= 5.10.74 only, to avoid thermal control interface issues
o This version disables system reset in thermal algorithm
o Kernel patch 4.9 #60 is available upstream from kernel 4.9.207 and
Kernel patch 4.19 #28 is available upstream from kernel 4.19.89.
- No need to apply these patches when working with these kernel versions
or above
o ethtool for QSFP-DD is working only in raw mode.
o SN4700 PSU (Murata) sensors PSU2 and PSU3 might not be available after insertion/removal.
o PSUs inventory read via PMBus require the following packages:
- i2c-tools_4.1-1_amd64.deb
- libi2c0_4.1-1_amd64.deb
o I2C ASIC driver takes up to 5 seconds to complete initialization. When
sending an ADD event, make sure to wait at least 5 seconds before
resetting the ASIC.
o Systems SN2010, SN2100, SN2410, SN2700 and SN2740 (and their
"-B" variants) require the following flags in kernel cmdline:
"acpi_enforce_resources=lax acpi=noirq".
o A few bug fixes were introduced in upstream kernel 4.19; whoever uses a
v4.19 kernel older than v4.19.58 should cherry-pick the following commits:
- Fix wrong order in probing routine initialization:
d2d8f64012543898a0158b1fc5c07af3d41c89d8 (available in v4.19.49)
- Fix parent device in i2c-mux-reg device registration
c241f3fbfa1af86f572a92f2e4d708358e163806 (available in v4.19.58)
o Kernel patch 4.9 #37 is available upstream from kernel 4.9.197 and
Kernel patch 4.19 #9 is available upstream from kernel 4.19.79.
- No need to apply these patches when working with these kernel versions
or above
o This version requires FW version 29.2000.1886 or higher for spectrum-2
and 13.2000.1886 or higher for spectrum-1.
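The kernel-version thresholds in the known issues above lend themselves to a small check. The sketch below is illustrative only and is not part of hw-mgmt: the names `parse_kernel_version`, `commits_to_cherry_pick` and `needs_patch_0181` are hypothetical, and the thresholds are copied from the notes above (commit availability in v4.19.49 and v4.19.58, and patch 0181 applying to kernel >= 5.10.74 only).

```python
import re
from typing import List, Tuple

# Upstream 4.19 fixes and the stable release each one is available in,
# as listed in the known issues above.
UPSTREAM_4_19_FIXES = [
    ("d2d8f64012543898a0158b1fc5c07af3d41c89d8", (4, 19, 49)),
    ("c241f3fbfa1af86f572a92f2e4d708358e163806", (4, 19, 58)),
]


def parse_kernel_version(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '4.19.56-amd64'."""
    m = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2)), int(m.group(3) or 0)


def commits_to_cherry_pick(release: str) -> List[str]:
    """Return the listed 4.19 commits that a given 4.19 kernel still lacks."""
    version = parse_kernel_version(release)
    if version[:2] != (4, 19):
        return []  # the cherry-pick note only concerns v4.19 kernels
    return [sha for sha, fixed_in in UPSTREAM_4_19_FIXES if version < fixed_in]


def needs_patch_0181(release: str) -> bool:
    """Patch 0181 is stated to apply for kernel >= 5.10.74 only."""
    return parse_kernel_version(release) >= (5, 10, 74)
```

On a live system the release string could come from `platform.release()` in the Python standard library. Versions are compared as integer tuples rather than strings, so 4.19.9 sorts below 4.19.49 as intended.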
V.7.0030.3992
Update changelog V.7.0030.3992
V.7.0030.4001
Update hw mgmt. deployment tool to include flags parameter option for each sub-Sonic NOS (NVOS and DVS) in addition to existing Sonic flag
V.7.0030.3985.bobcat
V.7.0030.4050