Commit 8d9d5ed

Add volume snapshot workflow documentation
Co-authored-by: sandeeplocharla <85344604+sandeeplocharla@users.noreply.github.com>
1 parent 51ed838 commit 8d9d5ed

1 file changed

+390
-0
lines changed

docs/volume-snapshot-workflow.md

Lines changed: 390 additions & 0 deletions
@@ -0,0 +1,390 @@
# Volume Snapshot Workflow in CloudStack

This document describes the end-to-end workflow for taking volume-level snapshots from the CloudStack Management Server, in the order in which CloudStack orchestrates the operation.

---

## Overview

A volume snapshot in CloudStack captures the state of a disk at a point in time. The snapshot can be stored on primary storage, secondary storage, or replicated across zones and storage pools. The workflow involves multiple layers: API, orchestration, storage engine, and storage-specific strategy plugins.

---

## Step-by-Step Workflow

### Step 1 — API Entry Point: `CreateSnapshotCmd.execute()`

**File:** `api/src/main/java/org/apache/cloudstack/api/command/user/snapshot/CreateSnapshotCmd.java`

The user (or scheduler) calls the `createSnapshot` API. The command is a `BaseAsyncCreateCmd`, meaning snapshot *allocation* and *execution* happen in two separate phases (create and execute).

In the `execute()` phase:

```java
// Excerpt from CreateSnapshotCmd.execute(); getEntityId() is the snapshot ID allocated in create()
snapshot = _volumeService.takeSnapshot(
        getVolumeId(), getPolicyId(), getEntityId(),
        getAccount(), getQuiescevm(), getLocationType(),
        getAsyncBackup(), getTags(), getZoneIds(),
        getStoragePoolIds(), useStorageReplication());
```

Key parameters available to the caller:

- `volumeId` – the volume to snapshot
- `policyId` – optional snapshot policy to apply
- `locationType` – `PRIMARY` or `SECONDARY`
- `asyncBackup` – whether to back up to secondary asynchronously
- `zoneIds` – destination zones to copy the snapshot to
- `storagePoolIds` – specific primary storage pools to copy the snapshot to
- `useStorageReplication` – use native cross-zone storage replication (StorPool)

---

### Step 2 — Allocation Phase: `VolumeApiServiceImpl.allocSnapshot()`

**File:** `server/src/main/java/com/cloud/storage/VolumeApiServiceImpl.java`

Before `execute()` is called, `create()` runs `allocSnapshot()`, which:

1. Verifies the caller has access to the volume.
2. Validates resource limits (snapshot count, secondary storage quota).
3. Generates a snapshot name in the format `<vmName>_<volumeName>_<timestamp>`.
4. Creates a `SnapshotVO` record in the database in state `Allocated`.
5. Increments resource counters for the account (snapshot count, storage size).

The allocation returns the snapshot ID, which is then used by the `execute()` phase.
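
The allocation steps above can be sketched as a small, self-contained model. This is not CloudStack code: the class `SnapshotAllocationSketch`, its in-memory counter, and the `yyyyMMddHHmmss` timestamp pattern are assumptions made for illustration. Only the `<vmName>_<volumeName>_<timestamp>` shape, the limit check, and the `Allocated` state come from the description above.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical, simplified model of the allocation phase (not CloudStack code). */
final class SnapshotAllocationSketch {
    enum State { Allocated }

    /** Minimal stand-in for the persisted snapshot row. */
    static final class SnapshotRecord {
        final long id;
        final String name;
        final State state;

        SnapshotRecord(long id, String name, State state) {
            this.id = id;
            this.name = name;
            this.state = state;
        }
    }

    private final long maxSnapshots; // assumed per-account snapshot limit
    private long snapshotCount;      // assumed per-account resource counter
    private long nextId = 1;

    SnapshotAllocationSketch(long maxSnapshots) {
        this.maxSnapshots = maxSnapshots;
    }

    /** Builds a name shaped like <vmName>_<volumeName>_<timestamp>; the pattern is an assumption. */
    static String snapshotName(String vmName, String volumeName, LocalDateTime now) {
        String timestamp = now.format(DateTimeFormatter.ofPattern("yyyyMMddHHmmss"));
        return vmName + "_" + volumeName + "_" + timestamp;
    }

    /** Checks the limit, creates the record in Allocated state, then increments the counter. */
    SnapshotRecord allocate(String vmName, String volumeName, LocalDateTime now) {
        if (snapshotCount >= maxSnapshots) {
            throw new IllegalStateException("Snapshot limit reached for this account");
        }
        SnapshotRecord record =
                new SnapshotRecord(nextId++, snapshotName(vmName, volumeName, now), State.Allocated);
        snapshotCount++;
        return record;
    }

    long snapshotCount() {
        return snapshotCount;
    }
}
```

Incrementing the counter only after the record is built keeps the count unchanged when the limit check rejects the request.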

---

### Step 3 — Validation and Path Selection: `VolumeApiServiceImpl.takeSnapshotInternal()`

**File:** `server/src/main/java/com/cloud/storage/VolumeApiServiceImpl.java`

`takeSnapshotInternal()` performs pre-flight checks before dispatching work:

1. Re-validates that the volume exists and is in the `Ready` state.
2. Rejects snapshots of volumes on the `External` hypervisor type.
3. Resolves `zoneIds` and `poolIds` from snapshot policy details if a `policyId` is provided.
4. Validates that each destination zone exists.
5. Checks that the caller has access to both the volume and (if attached) the VM.
6. If the storage pool is managed and `locationType` is unset, defaults to `LocationType.PRIMARY`.
7. Calls `snapshotHelper.addStoragePoolsForCopyToPrimary()` to resolve storage pool IDs when `useStorageReplication` is enabled.

**Path selection based on VM attachment:**

```
Volume attached to running VM?
├── YES → Serialize via VM Work Job Queue
│         (Step 4a — job queue path)
└── NO  → Direct execution
          (Step 4b — direct path)
```
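
The two decisions made here (which path to take, and which location type to default to) can be restated as pure functions. This is a minimal sketch, not the code in `takeSnapshotInternal()`: `SnapshotPathSketch`, `Path`, and the boolean inputs are hypothetical names, and the local `LocationType` enum only mirrors the two values listed in Step 1.

```java
/** Hypothetical model of the pre-flight decisions (not CloudStack code). */
final class SnapshotPathSketch {
    enum Path { JOB_QUEUE, DIRECT }
    enum LocationType { PRIMARY, SECONDARY }

    /** Attached volumes go through the VM work job queue; detached volumes run directly. */
    static Path selectPath(boolean attachedToVm) {
        return attachedToVm ? Path.JOB_QUEUE : Path.DIRECT;
    }

    /**
     * Defaults an unset location type to PRIMARY on managed storage.
     * Otherwise the caller's value is returned unchanged (it may still be null).
     */
    static LocationType resolveLocationType(LocationType requested, boolean managedPool) {
        if (requested == null && managedPool) {
            return LocationType.PRIMARY;
        }
        return requested;
    }
}
```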

---

### Step 4a — Serialized Execution via VM Work Job Queue

**File:** `server/src/main/java/com/cloud/storage/VolumeApiServiceImpl.java`

When the volume is attached to a VM, CloudStack serializes the operation using the VM Work Job queue. This prevents concurrent conflicting operations on the same VM.

```java
Outcome<Snapshot> outcome = takeVolumeSnapshotThroughJobQueue(
        vm.getId(), volumeId, policyId, snapshotId,
        account.getId(), quiesceVm, locationType,
        asyncBackup, zoneIds, poolIds);
```

A `VmWorkTakeVolumeSnapshot` work item is created and dispatched. The job framework eventually calls `orchestrateTakeVolumeSnapshot(VmWorkTakeVolumeSnapshot work)` from within the VM work job dispatcher.

If the current thread is *already* running inside the job dispatcher (re-entrant case), a placeholder work record is created and `orchestrateTakeVolumeSnapshot()` is called directly to avoid deadlock.

**`VmWorkTakeVolumeSnapshot` carries:**

```java
// engine/components-api/src/main/java/com/cloud/vm/VmWorkTakeVolumeSnapshot.java
new VmWorkTakeVolumeSnapshot(userId, accountId, vmId, handlerName,
        volumeId, policyId, snapshotId, quiesceVm,
        locationType, asyncBackup, zoneIds, poolIds);
```
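
To illustrate why a per-VM queue prevents conflicting operations, here is a minimal sketch that gives each VM its own single-threaded executor. It is an analogy only: `VmWorkQueueSketch` and its methods are hypothetical and do not reflect how CloudStack's job framework is implemented, and the re-entrant case is not modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical per-VM work queue; illustrates serialization only. */
final class VmWorkQueueSketch {
    // One single-threaded executor per VM: work for the same VM runs in submission order.
    private final Map<Long, ExecutorService> queues = new ConcurrentHashMap<>();

    <T> Future<T> submit(long vmId, Callable<T> work) {
        return queues.computeIfAbsent(vmId, id -> Executors.newSingleThreadExecutor()).submit(work);
    }

    void shutdown() {
        queues.values().forEach(ExecutorService::shutdown);
    }

    /** Submits two snapshot jobs for the same VM and returns the order in which they ran. */
    static List<String> demo() {
        VmWorkQueueSketch queue = new VmWorkQueueSketch();
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        try {
            Future<Boolean> first = queue.submit(42L, () -> order.add("snapshot-1"));
            Future<Boolean> second = queue.submit(42L, () -> order.add("snapshot-2"));
            first.get();
            second.get();
        } catch (Exception e) {
            throw new IllegalStateException("work item failed", e);
        } finally {
            queue.shutdown();
        }
        return new ArrayList<>(order);
    }
}
```

Work items for different VMs land on different executors and can run in parallel, while two items for the same VM never overlap.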

---

### Step 4b — Direct Execution (Volume Not Attached to VM)

**File:** `server/src/main/java/com/cloud/storage/VolumeApiServiceImpl.java`

When the volume is not attached to a VM, a `CreateSnapshotPayload` is built and attached directly to the volume:

```java
CreateSnapshotPayload payload = new CreateSnapshotPayload();
payload.setSnapshotId(snapshotId);
payload.setSnapshotPolicyId(policyId);
payload.setAccount(account);
payload.setQuiescevm(quiescevm);
payload.setLocationType(locationType);
payload.setAsyncBackup(asyncBackup);
payload.setZoneIds(zoneIds);
payload.setStoragePoolIds(poolIds);

volume.addPayload(payload);
return volService.takeSnapshot(volume);
```

---

### Step 5 — Orchestration: `orchestrateTakeVolumeSnapshot()`

**File:** `server/src/main/java/com/cloud/storage/VolumeApiServiceImpl.java`

`orchestrateTakeVolumeSnapshot()` handles the final preparation for attached volumes, whether it is invoked by the job dispatcher or called directly in the re-entrant case (Step 4a). The detached path in Step 4b performs the payload steps (3 to 5) inline.

1. Re-validates that the volume is still `Ready`.
2. Detects whether the volume is encrypted and on a running VM; rejects such snapshots unless the storage is StorPool (which supports live encrypted volume snapshots).
3. Builds the `CreateSnapshotPayload` with all execution parameters.
4. Attaches the payload to the volume.
5. Calls `volService.takeSnapshot(volume)`, which delegates to `SnapshotManagerImpl`.

**StorPool encrypted volume exception:**

```java
boolean isSnapshotOnStorPoolOnly =
        volume.getStoragePoolType() == StoragePoolType.StorPool &&
        SnapshotInfo.BackupSnapshotAfterTakingSnapshot.value();
// Allow live snapshot of encrypted volumes on StorPool primary storage
```
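
The rule in item 2 reduces to a single boolean expression. The sketch below restates it under the assumption that the three inputs are already known; `EncryptedSnapshotRuleSketch` and `isAllowed` are hypothetical names, and `storPoolOnly` stands in for the `isSnapshotOnStorPoolOnly` flag shown in the excerpt.

```java
/** Hypothetical restatement of the encrypted-volume rule (not CloudStack code). */
final class EncryptedSnapshotRuleSketch {
    /**
     * A snapshot of an encrypted volume on a running VM is rejected
     * unless the StorPool exception applies.
     */
    static boolean isAllowed(boolean encrypted, boolean vmRunning, boolean storPoolOnly) {
        boolean liveEncrypted = encrypted && vmRunning;
        return !liveEncrypted || storPoolOnly;
    }
}
```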

---

### Step 6 — Strategy Selection and Snapshot Execution: `SnapshotManagerImpl.takeSnapshot()`

**File:** `server/src/main/java/com/cloud/storage/snapshot/SnapshotManagerImpl.java`

This is the core snapshot execution method:

1. Extracts the `CreateSnapshotPayload` from the volume.
2. Determines whether to use the KVM file-based storage path.
3. Checks whether backup to secondary storage is needed for this zone.
4. For KVM file-based storage with secondary backup, allocates an image store.
5. Selects the appropriate `SnapshotStrategy` via `StorageStrategyFactory.getSnapshotStrategy(snapshot, TAKE)`.

**Strategy selection priority (highest wins):**

| Strategy | Priority | Handles |
|---|---|---|
| `StorPoolSnapshotStrategy` | HIGHEST (for DELETE/COPY) | DELETE, COPY on StorPool storage |
| `StorageSystemSnapshotStrategy` | HIGH | Managed storage (TAKE, DELETE) |
| `DefaultSnapshotStrategy` | DEFAULT | File-based hypervisor snapshots |
| `CephSnapshotStrategy` | HIGH | Ceph RBD snapshots |
| `ScaleIOSnapshotStrategy` | HIGH | ScaleIO/PowerFlex snapshots |

6. Calls `snapshotStrategy.takeSnapshot(snapshot)`, which returns a `SnapshotInfo` on primary storage.
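
"Highest wins" selection can be sketched as a loop that asks every strategy how well it can handle the operation and keeps the best answer. The types below are hypothetical stand-ins written for this example; they are not the real `StorageStrategyFactory` or `SnapshotStrategy` interfaces, and the `Priority` values only mirror the labels used in the table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical model of priority-based strategy selection. */
final class StrategySelectionSketch {
    // Declaration order is ascending priority, so compareTo() ranks the values.
    enum Priority { CANT_HANDLE, DEFAULT, HIGH, HIGHEST }
    enum Operation { TAKE, DELETE, COPY }

    interface Strategy {
        String name();
        Priority canHandle(Operation op, String storageType);
    }

    /** Builds a strategy that reports the given priority for one storage type (null = any) and the listed operations. */
    static Strategy strategy(String name, String storageType, Priority priority, Operation... ops) {
        List<Operation> handled = Arrays.asList(ops);
        return new Strategy() {
            @Override public String name() { return name; }
            @Override public Priority canHandle(Operation op, String type) {
                boolean storageMatches = storageType == null || storageType.equals(type);
                return storageMatches && handled.contains(op) ? priority : Priority.CANT_HANDLE;
            }
        };
    }

    /** Returns the strategy reporting the highest priority, or empty if none can handle the operation. */
    static Optional<Strategy> select(List<Strategy> strategies, Operation op, String storageType) {
        Strategy best = null;
        Priority bestPriority = Priority.CANT_HANDLE;
        for (Strategy candidate : strategies) {
            Priority p = candidate.canHandle(op, storageType);
            if (p.compareTo(bestPriority) > 0) {
                best = candidate;
                bestPriority = p;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With a catch-all strategy at `DEFAULT` and a storage-specific one at `HIGH`, the specific strategy wins on its own storage type and the default is used everywhere else.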

---

### Step 7 — Primary Storage Snapshot Creation: `SnapshotServiceImpl.takeSnapshot()`

**File:** `engine/storage/snapshot/src/main/java/org/apache/cloudstack/storage/snapshot/SnapshotServiceImpl.java`

The storage engine creates the snapshot on primary storage:

1. Creates a snapshot state object on the primary data store.
2. Transitions snapshot state: `CreateRequested`.
3. Transitions volume state: `Volume.Event.SnapshotRequested`.
4. Issues an asynchronous command to the primary data store driver (`PrimaryDataStoreDriver.takeSnapshot()`).
5. Waits for the async callback via `AsyncCallFuture<SnapshotResult>`.
6. On success:
   - Updates physical size from the driver response.
   - Publishes `EVENT_SNAPSHOT_ON_PRIMARY` usage event.
   - Transitions volume: `Volume.Event.OperationSucceeded`.
7. On failure:
   - Transitions snapshot to `OperationFailed`.
   - Transitions volume: `Volume.Event.OperationFailed`.
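
The request, wait, and success/failure branches above follow a common asynchronous pattern. The sketch below models it with `CompletableFuture` from the Java standard library as a stand-in for `AsyncCallFuture`. `PrimarySnapshotSketch`, the `driverCall` parameter, and the recorded event strings are assumptions for illustration; the real code fires state-machine events rather than appending to a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical model of the primary-storage snapshot flow (not CloudStack code). */
final class PrimarySnapshotSketch {
    /** Events fired so far, in order, so the flow can be inspected. */
    final List<String> events = new ArrayList<>();

    /**
     * driverCall stands in for PrimaryDataStoreDriver.takeSnapshot() and
     * returns the physical size reported by the driver.
     */
    long takeSnapshot(Supplier<Long> driverCall) {
        events.add("Snapshot:CreateRequested");
        events.add("Volume:SnapshotRequested");
        CompletableFuture<Long> future = CompletableFuture.supplyAsync(driverCall);
        try {
            long physicalSize = future.join(); // block until the async call completes
            events.add("Usage:EVENT_SNAPSHOT_ON_PRIMARY");
            events.add("Volume:OperationSucceeded");
            return physicalSize;
        } catch (RuntimeException e) { // join() wraps driver failures in CompletionException
            events.add("Snapshot:OperationFailed");
            events.add("Volume:OperationFailed");
            throw e;
        }
    }
}
```

Both branches end with a volume event, which is what lets the volume leave its in-progress state whether or not the driver call succeeded.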
202+
203+
---
204+
205+
### Step 8 — Secondary Storage Backup Decision
206+
207+
**File:** `server/src/main/java/com/cloud/storage/snapshot/SnapshotManagerImpl.java`
208+
209+
After the snapshot is created on primary, CloudStack decides whether to back it up:
210+
211+
```
212+
BackupSnapshotAfterTakingSnapshot == true?
213+
├── YES
214+
│ ├── KVM file-based → postSnapshotDirectlyToSecondary()
215+
│ │ (snapshot already on secondary — update DB reference only)
216+
│ └── Otherwise → backupSnapshotToSecondary()
217+
│ ├── asyncBackup == true → schedule BackupSnapshotTask
218+
│ └── asyncBackup == false → synchronous backupSnapshot() + postSnapshotCreation()
219+
└── NO
220+
├── storagePoolIds provided AND asyncBackup → schedule BackupSnapshotTask for pool copy
221+
└── Otherwise → markBackedUp() (snapshot stays on primary only)
222+
```
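
The decision tree can be written as one pure function, which makes each branch easy to check. This is a minimal sketch: `BackupDecisionSketch`, the `Action` values, and the four boolean inputs are hypothetical names that label the branches of the tree, not identifiers from `SnapshotManagerImpl`.

```java
/** Hypothetical encoding of the backup decision tree (not CloudStack code). */
final class BackupDecisionSketch {
    enum Action {
        POST_SNAPSHOT_DIRECTLY_TO_SECONDARY, // KVM file-based: only the DB reference is updated
        SCHEDULE_BACKUP_TASK,                // asynchronous backup to secondary
        SYNCHRONOUS_BACKUP,                  // backupSnapshot() + postSnapshotCreation()
        SCHEDULE_POOL_COPY,                  // asynchronous copy to other primary pools
        MARK_BACKED_UP                       // snapshot stays on primary only
    }

    static Action decide(boolean backupAfterSnapshot, boolean kvmFileBased,
                         boolean asyncBackup, boolean hasStoragePoolIds) {
        if (backupAfterSnapshot) {
            if (kvmFileBased) {
                return Action.POST_SNAPSHOT_DIRECTLY_TO_SECONDARY;
            }
            return asyncBackup ? Action.SCHEDULE_BACKUP_TASK : Action.SYNCHRONOUS_BACKUP;
        }
        if (hasStoragePoolIds && asyncBackup) {
            return Action.SCHEDULE_POOL_COPY;
        }
        return Action.MARK_BACKED_UP;
    }
}
```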

**`BackupSnapshotTask`** (async retry runner):

- Retries the backup up to the configured retry limit (see Key Configuration Parameters).
- Once retries are exhausted, calls `snapshotSrv.cleanupOnSnapshotBackupFailure()` to remove the snapshot record.
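
The retry-then-cleanup behaviour can be sketched as a bounded loop. This example assumes that "retries" means one initial attempt plus up to `maxRetries` further attempts, and it runs the attempts back to back instead of waiting between them. `BackupRetrySketch` and `backupWithRetries` are hypothetical names.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical model of a bounded retry with cleanup on exhaustion (not CloudStack code). */
final class BackupRetrySketch {
    /**
     * Runs one initial attempt plus up to maxRetries retries (an assumed reading of "retries").
     * Returns true on the first success; otherwise runs cleanup once and returns false.
     */
    static boolean backupWithRetries(BooleanSupplier attempt, int maxRetries, Runnable cleanup) {
        for (int tried = 0; tried <= maxRetries; tried++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            // A real runner would wait for the retry interval here; omitted in this sketch.
        }
        cleanup.run();
        return false;
    }
}
```

Running the cleanup exactly once, and only after the final failure, avoids deleting a snapshot record that a later attempt could still have backed up.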

---

### Step 9 — StorPool Cross-Zone Snapshot Copy: `StorPoolSnapshotStrategy.copySnapshot()`

**File:** `plugins/storage/volume/storpool/src/main/java/org/apache/cloudstack/storage/snapshot/StorPoolSnapshotStrategy.java`

When `storagePoolIds` are provided and the storage is StorPool, the snapshot is replicated natively between clusters:

1. **Export** the snapshot from the local StorPool cluster to the remote location using `snapshotExport()`.
2. **Persist recovery information** in the `snapshot_details` table with the exported name and location, so that partial cross-zone copies can be recovered.
3. **Copy from remote** on the destination StorPool cluster using `snapshotFromRemote()`.
4. **Reconcile** the snapshot on the remote cluster using `snapshotReconcile()`.
5. **Update** the `snapshot_store_ref.install_path` in the database to reflect the destination path.
6. Invoke the async callback with success or failure.

**Recovery detail saved:**

```java
// Stored so incomplete exports can be cleaned up later
String detail = "~" + snapshotName + ";" + location;
new SnapshotDetailsVO(snapshot.getId(), SP_RECOVERED_SNAPSHOT, detail, true);
```

---

### Step 10 — Post-Snapshot Processing: `postCreateSnapshot()` and Zone/Pool Copies

**File:** `server/src/main/java/com/cloud/storage/snapshot/SnapshotManagerImpl.java`

After snapshot creation (and optional backup):

1. **`postCreateSnapshot()`**: Updates snapshot policy retention — removes the oldest snapshot if the retention count is exceeded.
2. **`snapshotZoneDao.addSnapshotToZone()`**: Associates the snapshot with its origin zone.
3. **Usage event**: Publishes `EVENT_SNAPSHOT_CREATE` with the physical size of the snapshot.
4. **Resource limit correction**: For delta (incremental) snapshots, decrements the pre-allocated resource count by `(volumeSize − snapshotPhysicalSize)` since the actual snapshot is smaller than the volume.
5. **`copyNewSnapshotToZones()`** *(synchronous backup path only)*: Copies the snapshot to secondary storage in additional destination zones.
6. **`copyNewSnapshotToZonesOnPrimary()`** *(synchronous backup path only)*: Copies the snapshot to additional primary storage pools.
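
Two of these steps are simple arithmetic and are easy to check in isolation. As a worked example of the resource limit correction: a 100 GiB volume whose delta snapshot occupies 12 GiB gives back 100 − 12 = 88 GiB of the pre-allocated count. The sketch below is illustrative only. `PostSnapshotSketch` and both method names are hypothetical, and clamping the correction at zero is an assumption.

```java
import java.util.Deque;

/** Hypothetical helpers for post-snapshot bookkeeping (not CloudStack code). */
final class PostSnapshotSketch {
    /**
     * Amount to give back for a delta snapshot: volumeSize - snapshotPhysicalSize.
     * Clamped at zero (assumption) so the count is never increased by the correction.
     */
    static long resourceCorrection(long volumeSizeBytes, long snapshotPhysicalSizeBytes) {
        return Math.max(0L, volumeSizeBytes - snapshotPhysicalSizeBytes);
    }

    /** Drops the oldest entries until at most maxRetained remain; returns how many were removed. */
    static int applyRetention(Deque<String> oldestFirst, int maxRetained) {
        int removed = 0;
        while (oldestFirst.size() > maxRetained) {
            oldestFirst.removeFirst();
            removed++;
        }
        return removed;
    }
}
```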

---

### Step 11 — Rollback on Failure

**File:** `server/src/main/java/com/cloud/storage/snapshot/SnapshotManagerImpl.java`

The outer `try/catch` in `takeSnapshot()` ensures resource cleanup on any failure:

```java
try {
    // ... strategy selection, snapshot creation, backup decision ...
} catch (CloudRuntimeException | UnsupportedOperationException cre) {
    // Undo the counters incremented during allocation, then rethrow unchanged
    ResourceType storeResourceType = getStoreResourceType(...);
    _resourceLimitMgr.decrementResourceCount(snapshotOwner.getId(), ResourceType.snapshot);
    _resourceLimitMgr.decrementResourceCount(snapshotOwner.getId(), storeResourceType, volumeSize);
    throw cre;
} catch (Exception e) {
    // Same resource rollback, then wrap the unexpected exception
    throw new CloudRuntimeException("Failed to create snapshot", e);
}
```

**Additional cleanup methods:**

| Method | Trigger | Action |
|---|---|---|
| `cleanupVolumeDuringSnapshotFailure()` | Snapshot creation fails completely | Removes `snapshot_store_ref` entries (non-Destroyed) and deletes the `SnapshotVO` record |
| `cleanupOnSnapshotBackupFailure()` | Async backup exhausts all retries | Transitions snapshot state, removes async job MS_ID, deletes snapshot record |
| `StorPoolSnapshotStrategy.deleteSnapshot()` | Snapshot DELETE operation on StorPool | Calls StorPool API `snapshotDelete`, transitions state, cleans up DB |
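
The counter rollback pattern can be isolated into a small wrapper: run the work, and if it throws, undo the increments made at allocation time before rethrowing. This is a sketch under assumptions. `ResourceRollbackSketch`, `Counters`, and `runWithRollback` are hypothetical, and plain fields stand in for the resource limit manager.

```java
import java.util.function.Supplier;

/** Hypothetical model of counter rollback on failure (not CloudStack code). */
final class ResourceRollbackSketch {
    /** Stand-in for per-account resource counts. */
    static final class Counters {
        long snapshots;
        long storageBytes;
    }

    /**
     * Runs snapshotWork; on any runtime failure, reverses the counts that were
     * incremented during allocation and rethrows the original exception.
     */
    static <T> T runWithRollback(Counters counters, long volumeSize, Supplier<T> snapshotWork) {
        try {
            return snapshotWork.get();
        } catch (RuntimeException e) {
            counters.snapshots--;
            counters.storageBytes -= volumeSize;
            throw e;
        }
    }
}
```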

---

## Sequence Diagram (Text Form)

```
User/Scheduler
    │
    ▼
CreateSnapshotCmd.create()
    │  allocSnapshot() → SnapshotVO persisted (Allocated state)
    ▼
CreateSnapshotCmd.execute()
    │
    ▼
VolumeApiServiceImpl.takeSnapshot()
    │
    ▼
takeSnapshotInternal()
    │  validate volume, account, zones, policies
    │
    ├── [Volume attached to VM]
    │       takeVolumeSnapshotThroughJobQueue()
    │       VmWorkTakeVolumeSnapshot dispatched
    │       ← job queue serializes VM operations →
    │       orchestrateTakeVolumeSnapshot()
    │
    ├── [Volume not attached]
    │       direct path, no job queue (Step 4b)
    │
    │  both paths: build CreateSnapshotPayload
    │              volume.addPayload(payload)
    ▼
SnapshotManagerImpl.takeSnapshot()
    │
    │  StorageStrategyFactory.getSnapshotStrategy(TAKE)
    ▼
snapshotStrategy.takeSnapshot(snapshot)
    │
    ▼
SnapshotServiceImpl.takeSnapshot()
    │  PrimaryDataStoreDriver.takeSnapshot() [async]
    │  ← waits on AsyncCallFuture →
    │  snapshot created on primary storage
    ▼
Backup decision
    ├── BackupSnapshotAfterTakingSnapshot=true
    │       backupSnapshotToSecondary() [sync or async]
    └── BackupSnapshotAfterTakingSnapshot=false
            markBackedUp() / schedule pool copy
    ▼
postCreateSnapshot()
snapshotZoneDao.addSnapshotToZone()
UsageEventUtils.publishUsageEvent()
_resourceLimitMgr.decrementResourceCount()
copyNewSnapshotToZones() [if zoneIds]
copyNewSnapshotToZonesOnPrimary() [if poolIds]
    ▼
Return SnapshotInfo to caller
```

---

## Key Classes and Their Roles

| Class | Location | Role |
|---|---|---|
| `CreateSnapshotCmd` | `api/.../command/user/snapshot` | API command entry point; two-phase create+execute |
| `VolumeApiServiceImpl` | `server/.../storage` | Validates, dispatches, and orchestrates snapshot requests |
| `VmWorkTakeVolumeSnapshot` | `engine/components-api/.../vm` | Work item for job queue; carries all snapshot parameters |
| `SnapshotManagerImpl` | `server/.../storage/snapshot` | Core business logic; strategy selection; resource accounting |
| `SnapshotHelper` | `server/.../snapshot` | Resolves storage pool IDs for cross-zone replication |
| `SnapshotServiceImpl` | `engine/storage/snapshot` | Interacts with primary data store driver asynchronously |
| `DefaultSnapshotStrategy` | `engine/storage/snapshot` | Hypervisor-based (file) snapshot implementation |
| `StorageSystemSnapshotStrategy` | `engine/storage/snapshot` | Managed storage native snapshot implementation |
| `StorPoolSnapshotStrategy` | `plugins/storage/volume/storpool` | StorPool native snapshot; handles DELETE and cross-zone COPY |
| `StorageStrategyFactory` | `engine/storage` | Selects the highest-priority strategy for each operation |

---

## Key Configuration Parameters

| Parameter | Default | Description |
|---|---|---|
| `backup.snapshot.after.taking.snapshot` (`BackupSnapshotAfterTakingSnapshot`) | `true` | Whether to back up snapshot to secondary storage after creation |
| `snapshot.backup.retries` | `3` | Number of retry attempts for asynchronous snapshot backup |
| `snapshot.backup.retry.interval` | `300` (seconds) | Interval between retry attempts for async backup |
| `use.storage.replication` | `false` | Use native storage replication (e.g., StorPool cross-zone copy) instead of secondary storage copy |
| `snapshot.copy.multiply.exp.backoff` | not specified | Exponential backoff configuration for snapshot copy retries |

---

## Rollback Summary

CloudStack implements rollback at multiple layers to maintain consistency:

1. **Resource limit rollback** — On any exception in `SnapshotManagerImpl.takeSnapshot()`, snapshot count and storage quotas are decremented back to their original values.
2. **Volume state rollback** — `Volume.Event.OperationFailed` is fired so the volume returns to `Ready` state.
3. **Snapshot state machine** — Snapshot transitions to `Error` or `Destroyed` so it can be cleaned up by the background expunge process.
4. **Async backup failure cleanup** — After exhausting all retries, `cleanupOnSnapshotBackupFailure()` runs in a transaction to delete the snapshot record and associated job metadata.
5. **StorPool cross-zone recovery** — The exported (but not yet imported) snapshot name is persisted in `snapshot_details` with the key `SP_RECOVERED_SNAPSHOT`, enabling manual or automated cleanup of partial cross-zone copies.
