Skip to content

Commit

Permalink
Update use_cases.md
Browse files Browse the repository at this point in the history
  • Loading branch information
terrywbrady authored May 21, 2024
1 parent cef72bb commit b62ca8f
Showing 1 changed file with 0 additions and 242 deletions.
242 changes: 0 additions & 242 deletions design/object-reset/use_cases.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,245 +68,3 @@ During the scope of a large ingest effort, a desire arises to normalize pathname
#### Issue: Purge unwanted files from old object versions


---

## Implmentation Options

### Scenario
```yaml
ark: ark:/111/222
versions:
- num: 1
files:
producer/foo: { size: 10, digest: 'AAA', path: "1/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
producer/cat: { size: 30, digest: 'AAC', path: "1/producer/cat" }
size: 60
- num: 2
files:
producer/foo: { size: 40, digest: 'FFF', path: "2/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
producer/dog: { size: 30, digest: 'AAC', path: "2/producer/dog" }
size: 70
- num: 3
files:
producer/foo: { size: 40, digest: 'EEE', path: "3/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
producer/dog: { size: 30, digest: 'AAC', path: "2/producer/dog" }
size: 40
size: 170
```
### Option: Mark Current Version as the Reset Version
```yaml
ark: ark:/111/222
versions:
- num: 1
files:
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
size: 20
- num: 2
files:
producer/dog: { size: 30, digest: 'AAC', path: "2/producer/dog" }
size: 30
- num: 3
files:
producer/foo: { size: 40, digest: 'EEE', path: "3/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
producer/dog: { size: 30, digest: 'AAC', path: "2/producer/dog" }
size: 40
size: 90
```
#### Process
- User must fix the current version of the object using existing tools (ingest updates & mrt-delete)
- New transaction is run to mark the state of the current object as definitive
- Is this sent through ingest?
- Is this an administrative action sent to inventory?
- Any files not found in the current version object will be removed from the storage manifest
- Inventory will remove these files
- These files will be eligible for deletion from cloud storage
- Using the storage scan process?
- Using new file delete operations
#### Pros
- No files are cloned/copied
- Version number stays intact
- Minimizes processing
#### Cons
- Relies heavily on the existing mrt-delete process to reconcile the current version
- No mechanism exists to normalize or correct pathnames
- No evidence is left to indicate that this process took place
---
### Option: Pull forward desired files using an ingest manifest exported from storage
#### Step 1: Pull Forward
```yaml
ark: ark:/111/222
versions:
- num: 1
files:
producer/foo: { size: 10, digest: 'AAA', path: "1/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
producer/cat: { size: 30, digest: 'AAC', path: "1/producer/cat" }
size: 60
- num: 2
files:
producer/foo: { size: 40, digest: 'FFF', path: "2/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
producer/dog: { size: 30, digest: 'AAC', path: "2/producer/dog" }
size: 70
- num: 3
files:
producer/foo: { size: 40, digest: 'EEE', path: "3/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "1/producer/bar" }
producer/dog: { size: 30, digest: 'AAC', path: "2/producer/dog" }
size: 40
- num: 4
files:
producer/foo: { size: 40, digest: 'EEE', path: "4/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "4/producer/bar" }
producer/dog: { size: 30, digest: 'AAC', path: "4/producer/dog" }
size: 90
reload: true
size: 260
```
#### Step 2: Purge Old Versions
```yaml
ark: ark:/111/222
versions:
- num: 4
files:
producer/foo: { size: 40, digest: 'EEE', path: "4/producer/foo" }
producer/bar: { size: 20, digest: 'AAB', path: "4/producer/bar" }
producer/dog: { size: 30, digest: 'AAC', path: "4/producer/dog" }
size: 90
reload: true
size: 90
```
#### Process
- Ingest manifest is generated from Merritt Storage
- Desired files are selected and renamed as needed
- Manifest is sent to Ingest as a new request type of `replace`
- A new version is generated containing a copy of all desired content
- Storage manifest should reflect that the replacement took place at a specific version
- Files from prior versions should be treated as if they no longer exist
- Older version files will be purge-able by inventory
- Older version files will be purge-able from cloud storage
- A purge process should exist to purge the older version content

#### Pros
- Use can review the new state of the object before electing to purge the old versions
- The storage manifest and the version numbering will convey that this maintenance activity took place

#### Cons
- Temporary duplication of object content
- Duplication could persist if the user does not enable / approve the running of the purge process
- Replication logic will need to change to not assume that versioning starts at 1


---

### Option: Modify Merritt file paths to use content hash

```yaml
ark: ark:/111/222
versions:
- num: 1
files:
producer/foo: { size: 10, digest: 'AAA', path: "AAA/10" }
producer/bar: { size: 20, digest: 'AAB', path: "AAB/20" }
producer/cat: { size: 30, digest: 'AAC', path: "AAC/30" }
size: 60
- num: 2
files:
producer/foo: { size: 40, digest: 'FFF', path: "FFF/40" }
producer/bar: { size: 20, digest: 'AAB', path: "AAB/20" }
producer/dog: { size: 30, digest: 'AAC', path: "AAC/30" }
size: 40
- num: 3
files:
producer/foo: { size: 40, digest: 'EEE', path: "EEE/40" }
producer/bar: { size: 20, digest: 'AAB', path: "AAB/20" }
producer/dog: { size: 30, digest: 'AAC', path: "AAC/30" }
size: 40
size: 140
```

#### Process
- Storage key should be `ark | hash | length` regardless of pathname/version
- Comment dloy - this only becomes needed if we allow a reset to preserve more than one version - preserving only current not needed

#### Pros
- Storage optimized solution - duplicate files are only stored once per object

#### Cons
- Loss of semantically meaningful key names (dloy-serious problem)

---

### Option: Reset Version should always become version 1

- _dloy : see below._

#### Process

#### Pros

#### Cons


---

### Option: Reset object using a new add version

#### Process
- a version add with reset request is received by ingest
- storage deletes the content for the specific ark
- all add version content is treated as new using the original ark
- inventory receives reset request
- inventory deletes all existing content on the ark
- inventory treats the added content as new

#### Pros
- same key architecture
- delete and add functions do not change for storage and inventory
- resulting storage sets all content as 1 so specific match of content to saved keys
- all changes in new add populate merritt inv without requesting hybrid processing
- preserving the ark alloww outside references to continue working

#### Cons
- all content will be written as version 1 so no earlier content is saved - redundant processing
- recovery of failed process may be issue

---

### Option: Reset object using existing current version
- Note this is similar to _Option: Reset Version should always become version 1_ above

#### Process
- a reset request is received by ingest - without action
- storage passed reset
- storage uses content S3 keys to copy (or leave alone) to a version 1 key. If current file uses previous version 10 then version 10 copied to version 1. Current architecture will not have collisions if only current is preserved
- storage delete all unused content for specific ark (key version > 1)
- inventory deletes all existing content on the ark
- inventory treats the added content as new

#### Pros
- same key architecture
- resulting storage sets all content as 1 so specific match of content to saved keys
- all changes in new add populate merritt inv without requesting hybrid processing
- preserving the ark alloww outside references to continue working

#### Cons
- new build copy logic required in storage
- new delete logic required for this special case
- Much content will be rewritten except for matching
- recovery of failed process may be issue

0 comments on commit b62ca8f

Please sign in to comment.