|
51 | 51 | #include <sys/trace_zfs.h> |
52 | 52 |
|
53 | 53 | /* |
54 | | - * This file contains the necessary logic to remove vdevs from a |
55 | | - * storage pool. Currently, the only devices that can be removed |
56 | | - * are log, cache, and spare devices; and top level vdevs from a pool |
57 | | - * w/o raidz or mirrors. (Note that members of a mirror can be removed |
58 | | - * by the detach operation.) |
| 54 | + * This file contains the necessary logic to remove vdevs from a storage |
| 55 | + * pool. Note that members of a mirror can be removed by the detach |
| 56 | + * operation. Currently, the only devices that can be removed are: |
59 | 57 | * |
60 | | - * Log vdevs are removed by evacuating them and then turning the vdev |
61 | | - * into a hole vdev while holding spa config locks. |
| 58 | + * 1) Traditional hot spare and cache vdevs. Note that draid distributed |
| 59 | + * spares are fixed at creation time and cannot be removed. |
62 | 60 | * |
63 | | - * Top level vdevs are removed and converted into an indirect vdev via |
64 | | - * a multi-step process: |
| 61 | + * 2) Log vdevs are removed by evacuating them and then turning the vdev |
| 62 | + * into a hole vdev while holding spa config locks. |
65 | 63 | * |
66 | | - * - Disable allocations from this device (spa_vdev_remove_top). |
| 64 | + * 3) Top-level singleton and mirror vdevs, including dedup and special |
| 65 | + * vdevs, are removed and converted into an indirect vdev via a |
| 66 | + * multi-step process: |
67 | 67 | * |
68 | | - * - From a new thread (spa_vdev_remove_thread), copy data from |
69 | | - * the removing vdev to a different vdev. The copy happens in open |
70 | | - * context (spa_vdev_copy_impl) and issues a sync task |
71 | | - * (vdev_mapping_sync) so the sync thread can update the partial |
72 | | - * indirect mappings in core and on disk. |
| 68 | + * - Disable allocations from this device (spa_vdev_remove_top). |
73 | 69 | * |
74 | | - * - If a free happens during a removal, it is freed from the |
75 | | - * removing vdev, and if it has already been copied, from the new |
76 | | - * location as well (free_from_removing_vdev). |
| 70 | + * - From a new thread (spa_vdev_remove_thread), copy data from the |
| 71 | + * removing vdev to a different vdev. The copy happens in open context |
| 72 | + * (spa_vdev_copy_impl) and issues a sync task (vdev_mapping_sync) so |
| 73 | + * the sync thread can update the partial indirect mappings in core |
| 74 | + * and on disk. |
77 | 75 | * |
78 | | - * - After the removal is completed, the copy thread converts the vdev |
79 | | - * into an indirect vdev (vdev_remove_complete) before instructing |
80 | | - * the sync thread to destroy the space maps and finish the removal |
81 | | - * (spa_finish_removal). |
| 76 | + * - If a free happens during a removal, it is freed from the removing |
| 77 | + * vdev, and if it has already been copied, from the new location as |
| 78 | + * well (free_from_removing_vdev). |
| 79 | + * |
| 80 | + * - After the removal is completed, the copy thread converts the vdev |
| 81 | + * into an indirect vdev (vdev_remove_complete) before instructing |
| 82 | + * the sync thread to destroy the space maps and finish the removal |
| 83 | + * (spa_finish_removal). |
| 84 | + * |
| 85 | + * The following constraints currently apply to primary device removal:
| 86 | + * |
| 87 | + * - All vdevs must be online, healthy, and not be missing any data |
| 88 | + * according to the DTLs. |
| 89 | + * |
| 90 | + * - When removing a singleton or mirror vdev, regardless of whether
| 91 | + * it's a special, dedup, or primary device, it must have the same
| 92 | + * ashift as the devices in the normal allocation class. Furthermore,
| 93 | + * all vdevs in the normal allocation class must have the same ashift
| 94 | + * to ensure that new allocations never include additional padding.
| 95 | + * |
| 96 | + * - The normal allocation class cannot contain any raidz or draid |
| 97 | + * top-level vdevs since segments are copied without regard for block |
| 98 | + * boundaries. This makes it impossible to calculate the required |
| 99 | + * parity columns when using these vdev types as the destination. |
| 100 | + * |
| 101 | + * - The encryption keys must be loaded so the ZIL logs can be reset |
| 102 | + * in order to prevent writing to the device being removed. |
| 103 | + * |
| 104 | + * N.B. ashift and raidz/draid constraints for primary top-level device |
| 105 | + * removal could be slightly relaxed if it were possible to request that |
| 106 | + * DVAs from a mirror or singleton in the specified allocation class be |
| 107 | + * used (metaslab_alloc_dva). |
| 108 | + * |
| 109 | + * This flexibility would be particularly useful for raidz/draid pools which |
| 110 | + * often include a mirrored special device. If a top-level singleton were
| 111 | + * mistakenly added, it could then still be removed at the cost of some
| 112 | + * special device capacity. This may be a worthwhile tradeoff depending on |
| 113 | + * the pool capacity and expense (cost, complexity, time) of creating a new |
| 114 | + * pool and copying all of the data to correct the configuration. |
| 115 | + * |
| 116 | + * Furthermore, while not currently supported, it should be possible to allow
| 117 | + * vdevs of any type to be removed as long as they've never been written to.
82 | 118 | */ |
83 | 119 |
|
84 | 120 | typedef struct vdev_copy_arg { |
|
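The constraints listed in the new header comment can be read as a single eligibility check. The sketch below is only an illustration of that reading, not the code in this file: every type and function name (`sk_vdev_t`, `sk_can_remove_top`, the `SK_*` enums) is hypothetical, the pool is reduced to a flat array of top-level vdevs, and the order in which the checks run is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the real ZFS types. */
typedef enum {
	SK_VDEV_SINGLETON, SK_VDEV_MIRROR, SK_VDEV_RAIDZ, SK_VDEV_DRAID
} sk_vdev_type_t;

typedef enum {
	SK_CLASS_NORMAL, SK_CLASS_SPECIAL, SK_CLASS_DEDUP
} sk_alloc_class_t;

typedef struct sk_vdev {
	sk_vdev_type_t type;
	sk_alloc_class_t alloc_class;
	unsigned ashift;
	bool healthy;	/* online, healthy, no missing data per the DTLs */
} sk_vdev_t;

typedef enum {
	SK_OK, SK_ERR_INDEX, SK_ERR_TYPE, SK_ERR_KEYS,
	SK_ERR_UNHEALTHY, SK_ERR_PARITY_DEST, SK_ERR_ASHIFT
} sk_remove_err_t;

/*
 * Decide whether top-level vdev `target` may be removed, following the
 * constraints described in the comment above. The check order is an
 * assumption made for this sketch.
 */
sk_remove_err_t
sk_can_remove_top(const sk_vdev_t *vdevs, size_t n, size_t target,
    bool keys_loaded)
{
	if (target >= n)
		return (SK_ERR_INDEX);

	const sk_vdev_t *rm = &vdevs[target];

	/* Only singleton and mirror top-level vdevs can be removed. */
	if (rm->type != SK_VDEV_SINGLETON && rm->type != SK_VDEV_MIRROR)
		return (SK_ERR_TYPE);

	/* Keys must be loaded so the ZIL logs can be reset. */
	if (!keys_loaded)
		return (SK_ERR_KEYS);

	for (size_t i = 0; i < n; i++) {
		/* All vdevs must be healthy, not just the target. */
		if (!vdevs[i].healthy)
			return (SK_ERR_UNHEALTHY);

		if (vdevs[i].alloc_class != SK_CLASS_NORMAL)
			continue;

		/* No raidz/draid destinations in the normal class. */
		if (vdevs[i].type == SK_VDEV_RAIDZ ||
		    vdevs[i].type == SK_VDEV_DRAID)
			return (SK_ERR_PARITY_DEST);

		/*
		 * Matching the target against every normal-class vdev
		 * also forces the normal class to share one ashift.
		 */
		if (vdevs[i].ashift != rm->ashift)
			return (SK_ERR_ASHIFT);
	}
	return (SK_OK);
}
```

Under these assumptions, a raidz pool with a mirrored special vdev is rejected with `SK_ERR_PARITY_DEST` even when the target is the special mirror. That is the case the N.B. paragraph suggests could be relaxed if allocations could be steered to a mirror or singleton in a chosen allocation class.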