You may run into situations where some of the assets fail to migrate (the tool reports these).
To help you identify and address them, the tool produces a migration report CSV file on each execution.
The migration report file includes all the columns from the initial input file, plus a few additional columns that let you filter the records for the assets that failed to migrate.
Because the report file contains all the same columns as the initial migration input, the filtered migration report can be used as input for a subsequent "recovery" run of the migration script.
The migration script adds the following columns to the migration report, in addition to the columns from the input CSV file:
- Cld_Status: set to MIGRATED for successfully migrated assets
- Cld_Operation: identifies if the asset was created or overwritten
  - Overwritten - if the Cloudinary asset already existed and was overwritten
    - This may indicate undesired behavior, for example if several assets in the migration input file were assigned the same public_id
  - Uploaded - if a new Cloudinary asset was created
  - SkippedAlreadyExists - indicates that the upload operation was not performed because:
    - the overwrite upload API parameter was set to false
    - AND an asset with the public_id value specified for the upload already exists
- Cld_Error: the error details for troubleshooting (if an asset failed to migrate)
- Cld_PublicId: public_id reported back by Cloudinary after uploading an asset
  - Should be used as "source of truth" when addressing migrated assets via Cloudinary API (as Cloudinary may have to replace some of the characters)
- Cld_Etag: an MD5 digest of the binary content, useful for identifying identical assets
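As a rough illustration of how these columns can be used, the sketch below looks for public_id values that appear on more than one report record, which is one possible cause of the Overwritten operation described above. It assumes the report is a UTF-8 CSV file with a header row; the function name is illustrative and not part of the migration tool. It will not detect assets that already existed in Cloudinary before the migration started.

```python
import csv
from collections import defaultdict


def find_public_id_collisions(report_path):
    """Return {public_id: [record numbers]} for ids reported on more than one record.

    Record numbers are 1-based and do not count the header row.
    """
    records_by_id = defaultdict(list)
    with open(report_path, newline="", encoding="utf-8") as src:
        for record_no, row in enumerate(csv.DictReader(src), start=1):
            public_id = (row.get("Cld_PublicId") or "").strip()
            if public_id:  # failed records may have no public_id reported
                records_by_id[public_id].append(record_no)
    return {pid: nums for pid, nums in records_by_id.items() if len(nums) > 1}
```

Any public_id returned here is worth checking against the input file before the next run.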
Using a toolset of your choice (for example Excel, PowerShell, Python, etc.), select the CSV records whose Cld_Status column value is different from MIGRATED.
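If you choose Python, the filtering step might look like the following minimal sketch. It relies only on the Cld_Status column and the MIGRATED value described above; the function name and file paths are placeholders, and the report is assumed to be a UTF-8 CSV file with a header row.

```python
import csv


def filter_failed_records(report_path, recovery_input_path):
    """Write every report record whose Cld_Status is not MIGRATED to a new CSV file.

    Returns the number of records written.
    """
    with open(report_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        failed = [
            row for row in reader
            if (row.get("Cld_Status") or "").strip() != "MIGRATED"
        ]

    with open(recovery_input_path, "w", newline="", encoding="utf-8") as dst:
        # extrasaction="ignore" tolerates malformed rows with surplus fields
        writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(failed)
    return len(failed)
```

The output keeps every column of the report, so it has the same shape as the file it was filtered from.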
Review the information in the Cld_Error column.
This is the creative part of the process. Typically you'll spot certain patterns in the messages that hint at what went wrong and what needs to be adjusted before re-attempting migration for these assets.
The following errors, and similar ones, indicate a failure to retrieve an asset due to network issues:
- Error in loading <asset_url> - Timed out reading data from server
- Error in loading <asset_url> - Server broke connection
- Error in loading <asset_url> - partial download

They may occur due to network "hiccups" between the system of origin and Cloudinary back-end systems at the time of the migration.
These are typically resolved by simply re-attempting the migration.
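To get a quick sense of how many failures fall into the network category, a sketch along these lines may help. The marker strings are taken from the messages listed above; the "network" and "other" labels, the function names, and the assumption of a UTF-8 CSV report with a header row are illustrative only. Real reports will likely contain other messages that you will want to add.

```python
import csv
from collections import Counter

# Fragments of the network-related messages listed above.
NETWORK_ERROR_MARKERS = (
    "timed out reading data from server",
    "server broke connection",
    "partial download",
)


def classify_error(message):
    """Label an error message as 'network' or 'other' (case-insensitive match)."""
    text = (message or "").lower()
    if any(marker in text for marker in NETWORK_ERROR_MARKERS):
        return "network"
    return "other"


def summarize_errors(report_path):
    """Count non-migrated records per error label."""
    counts = Counter()
    with open(report_path, newline="", encoding="utf-8") as src:
        for row in csv.DictReader(src):
            if (row.get("Cld_Status") or "").strip() != "MIGRATED":
                counts[classify_error(row.get("Cld_Error"))] += 1
    return counts
```

Records labelled "network" are candidates for a plain re-attempt, while the "other" group needs a closer look at the Cld_Error text.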
Run the migration using the "filtered" CSV file you've produced.
It is up to you how you want to organize these re-attempts (by filtering out one problem at a time for each "recovery" run or attempting to address all problems in the same "recovery" run).
A "recovery" run can be identified by the folder passed as the value for the --output-folder parameter of the migration script.
For example:
- initial-migration folder - for the first run
- recovery/fixing-public-ids/first-attempt folder - for the "recovery" batch that fixes issues with public_id values
- recovery/fixing-public-ids/second-attempt folder - for the "recovery" batch that fixes issues with public_id values that may have been omitted on the first run
- recovery/reattempting-network-issues/first-attempt folder - for the "recovery" batch that re-attempts assets failing to migrate due to network issues
A well-structured folder naming strategy makes it easier to consolidate all the reports into a final migration report.
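One possible way to consolidate the reports is sketched below. It is not part of the migration tool and rests on several assumptions: you pass the report file paths yourself, in the order the runs were executed; every report is a UTF-8 CSV file with a header row; and key_column names a column from your own input file that uniquely identifies an asset (this column name is hypothetical and depends on your data). For each asset, the record from the most recent run wins.

```python
import csv


def consolidate_reports(report_paths, key_column, output_path):
    """Merge run reports so each asset keeps the record from the latest run.

    report_paths must be ordered from the first run to the last.
    Returns the number of records in the consolidated report.
    """
    latest = {}      # asset key -> record from the most recent report mentioning it
    fieldnames = []  # union of all columns, in first-seen order
    for path in report_paths:
        with open(path, newline="", encoding="utf-8") as src:
            reader = csv.DictReader(src)
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                latest[row[key_column]] = row  # later runs replace earlier ones

    with open(output_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(
            dst, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(latest.values())
    return len(latest)
```

Records keep the position of their first appearance, so the consolidated file follows the order of the initial run.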