Commit eb45e4d

docs: firmware upgrades on off2
1 parent 9129f7b commit eb45e4d

3 files changed: +70 -0 lines changed

@@ -0,0 +1,70 @@
# 2024-12-11 Firmware upgrade on off2

We noticed that on off2, the latency was very high for two of the hard drives,
namely sdc and sdd.

After some research, Christian found that it could be linked to outdated firmware on the disks.

## Preparation

We can check the firmware version with `smartctl` (here for sdd):
```bash
smartctl -a /dev/sdd | grep -i firmware
```
It was `GB01` on the two culprits, while it was `GB03` on the others.

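To compare every disk at once rather than one at a time, the check above can be wrapped in a small loop. This is only a sketch: the helper names `firmware_version` and `list_firmware` are hypothetical and not part of the original procedure. It relies on the `Firmware Version:` line that `smartctl -i` prints for ATA disks.

```bash
# Hypothetical helper: extract the firmware version from smartctl output read on stdin.
firmware_version() {
    awk -F': *' 'tolower($1) == "firmware version" { print $2; exit }'
}

# Hypothetical helper: print "<device> <firmware>" for each disk passed as an argument.
list_firmware() {
    for dev in "$@"; do
        printf '%s %s\n' "$dev" "$(smartctl -i "$dev" | firmware_version)"
    done
}
```

Run as root, something like `list_firmware /dev/sd?` should make a disk with a lagging firmware version stand out (the `/dev/sd?` glob is an assumption about how the disks are named).
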
Christian downloaded the firmware update for the disks.
It comes as an installer: `Serial-ATA_Firmware_0942Y_LN_GB03_A00.BIN`

You can find it by taking the "Device Model" entry from `smartctl -a` (here `TOSHIBA MG07ACA14TEY`)
and searching for its firmware. You land on `https://www.dell.com/support/home/fr-fr/drivers/driversdetails?driverid=0942y`


## Changing the firmware

The firmware update is applied to all disks on the server.

We just ran the installer as root and it went smoothly.

But we wanted to be sure that the firmware upgrade was taken into account
(we didn't have the time to wait for a check,
and were not sure of the figures seen with `zpool iostat`).

So we decided to remove and re-add the disks to the server.
Because our ZFS pools use raidz, we can remove one disk at a time without losing data,
and because we only remove it for a short period, the resilver will be fast.

So we first took sdd offline in the ZFS pools:
```bash
zpool offline zfs-hdd sdd4
# for rpool the name is different, but we can recognize it thanks to the serial number (smartctl -a gives it)
zpool offline rpool ata-TOSHIBA_MG07ACA14TEY_X8F0A0H8F9RG-part3
```

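The by-id name used for rpool can also be looked up from the kernel device name instead of matching the serial number by eye. A minimal sketch, assuming the usual udev symlinks under `/dev/disk/by-id`; the helper name `by_id_names` and its optional directory argument are hypothetical:

```bash
# Hypothetical helper: list the ata-* by-id symlinks that resolve to a given device.
# $1: device path (e.g. a partition), $2: by-id directory (defaults to /dev/disk/by-id).
by_id_names() {
    target=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/ata-*; do
        # Skip the unexpanded glob or dangling links.
        [ -e "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            basename "$link"
        fi
    done
}
```

Presumably `by_id_names /dev/sdd3` would print the `ata-...-part3` name passed to `zpool offline rpool` above, assuming the rpool member is the third partition of sdd.
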
We unplugged the disk and plugged it back in.

Then we re-added it to the ZFS pools:
```bash
zpool online zfs-hdd sdd4
zpool online rpool ata-TOSHIBA_MG07ACA14TEY_X8F0A0H8F9RG-part3
```

We then waited for the resilver to complete. `zpool status` shows whether a resilver is in progress or done:
```bash
zpool status
```
It was almost instantaneous.

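For a longer resilver, the wait could be scripted rather than re-running `zpool status` by hand. The sketch below polls for the `resilver in progress` text that `zpool status` prints on its `scan:` line; the helper names and the 10-second default interval are assumptions, not part of what we actually ran.

```bash
# Hypothetical helper: succeed if the zpool status text on stdin reports a running resilver.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Hypothetical helper: block until the given pool is no longer resilvering.
# $1: pool name, $2: polling interval in seconds (defaults to 10).
wait_for_resilver() {
    while zpool status "$1" | resilver_in_progress; do
        sleep "${2:-10}"
    done
}
```

For example, `wait_for_resilver zfs-hdd && wait_for_resilver rpool` before moving on to the next disk. Recent OpenZFS releases also provide `zpool wait -t resilver <pool>`, which may be simpler where available.
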
We did the same operation for the other disk, `sdc`.

Note: we did not offline the disk from rpool because we forgot to do it. It still went well,
because ZFS was resilient enough and did the resilver automatically.


## Result

The result is easy to see in munin:

![disk latency graph](./media/2024-12-11-munin-off2-latency-after-firmware-update.png)

![disk usage graph](./media/2024-12-11-munin-off2-usage-after-firmware-update.png)