Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid data corruption due to writing concurrently in the user data file #366

Merged
merged 6 commits into from
Jul 30, 2024

Conversation

fluiddot
Copy link
Contributor

@fluiddot fluiddot commented Jul 17, 2024

Related to 6197-gh-Automattic/dotcom-forge and 6887-gh-Automattic/dotcom-forge.

Proposed Changes

  • Add atomically library to the project.
  • Use the atomic write file function when saving user data. This will ensure that the file's content is not corrupted due to writing simultaneously.

Testing Instructions

Test corruption before this PR changes:

  1. Apply the following patch:
diff --git forkSrcPrefix/src/storage/user-data.ts forkDstPrefix/src/storage/user-data.ts
index 0b9de151217e11f8ae451dc9295d71424409f309..16eb8c675991d164cc118d14d2956c819316ed41 100644
--- forkSrcPrefix/src/storage/user-data.ts
+++ forkDstPrefix/src/storage/user-data.ts
@@ -56,7 +56,7 @@ export async function loadUserData(): Promise< UserData > {
 			const data = fromDiskFormat( parsed );
 			sortSites( data.sites );
 			populatePhpVersion( data.sites );
-			console.log( `Loaded user data from ${ sanitizeUserpath( filePath ) }` );
+			// console.log( `Loaded user data from ${ sanitizeUserpath( filePath ) }` );
 			return data;
 		} catch ( err ) {
 			// Awkward double try-catch needed to have access to the file contents
@@ -82,12 +82,56 @@ export async function loadUserData(): Promise< UserData > {
 	}
 }
 
+let queueWriteOps: { filePath: string; asString: string; format: BufferEncoding }[] = [];
+let timeoutId: NodeJS.Timeout;
+
+const write = async ( filePath: string, asString: string, format: BufferEncoding ) => {
+	console.log( 'queueing write op ', queueWriteOps.length );
+	queueWriteOps.push( { filePath, asString, format } );
+	clearTimeout( timeoutId );
+	const totalOps = queueWriteOps.length;
+	timeoutId = setTimeout( () => {
+		console.log( 'flushing writing ops' );
+		queueWriteOps.forEach( ( op, index ) => {
+			setTimeout( () => {
+				console.log( 'writing file', new Date() );
+				fs.promises.writeFile( op.filePath, op.asString, op.format );
+			}, totalOps - index );
+		} );
+		// Clean up queue
+		queueWriteOps = [];
+	}, 1000 );
+};
+( function testConcurrentWrites() {
+	console.log( '=== Testing Concurrent Write Ops ===' );
+	const filePath = getUserDataFilePath();
+	const generateData = () => ( {
+		a: Math.random() * 10000,
+		b: Math.random() * 10000,
+		c: Math.random() * 10000,
+		d: Math.random() * 10000,
+		e: Math.random() * 10000,
+	} );
+	loadUserData().then( ( data ) => {
+		write( filePath, JSON.stringify( { ...data, test: generateData() }, null, 2 ) + '\n', 'utf-8' );
+		write(
+			filePath,
+			JSON.stringify(
+				{ ...data, test: [ generateData(), generateData(), generateData(), generateData() ] },
+				null,
+				2
+			) + '\n',
+			'utf-8'
+		);
+	} );
+} )();
+
 export async function saveUserData( data: UserData ): Promise< void > {
 	const filePath = getUserDataFilePath();
 
 	const asString = JSON.stringify( toDiskFormat( data ), null, 2 ) + '\n';
-	await fs.promises.writeFile( filePath, asString, 'utf-8' );
-	console.log( `Saved user data to ${ sanitizeUserpath( filePath ) }` );
+	// await fs.promises.writeFile( filePath, asString, 'utf-8' );
+	// console.log( `Saved user data to ${ sanitizeUserpath( filePath ) }` );
 }
 
 function toDiskFormat( { sites, ...rest }: UserData ): PersistedUserData {
  1. Check out the trunk branch (i.e. test the app before this change).
  2. Run the command npm start.
  3. Check the app's logs and wait until the entries writing file are shown.
  4. Check if the content of the appdata-v1.json file is corrupted. The common error is that the JSON structure is broken.
  5. If the data is not corrupted, type rs and press enter in the terminal to reload the app. Repeat steps 4, 5, and 6 until you encounter the issue.

Corrupted data example:

...
  "test": {
    "a": 3592.503827042464,
    "b": 3707.5094131989817,
    "c": 3279.462265383579,
    "d": 1654.6072404597733,
    "e": 100.2686994394808
  }
}
3609597
    },
    {
      "a": 6627.617631995424,
      "b": 5073.959268038914,
      "c": 9778.40044590038,
      "d": 6536.643341564523,
      "e": 8928.491741332873
    },
    {
...

Test this PR changes:

  1. Apply the following patch:
diff --git forkSrcPrefix/src/storage/user-data.ts forkDstPrefix/src/storage/user-data.ts
index c25007d112a0339368d284475c1a7d9eb1cf37d2..d17aa337e0099a3b9d2e0a3c47973ee1d10bec4f 100644
--- forkSrcPrefix/src/storage/user-data.ts
+++ forkDstPrefix/src/storage/user-data.ts
@@ -3,6 +3,7 @@ import fs from 'fs';
 import nodePath from 'path';
 import { SupportedPHPVersion, SupportedPHPVersions } from '@php-wasm/universal';
 import * as Sentry from '@sentry/electron/main';
+import * as atomically from 'atomically';
 import { isErrnoException } from '../lib/is-errno-exception';
 import { sanitizeUnstructuredData, sanitizeUserpath } from '../lib/sanitize-for-logging';
 import { sortSites } from '../lib/sort-sites';
@@ -57,7 +58,7 @@ export async function loadUserData(): Promise< UserData > {
 			const data = fromDiskFormat( parsed );
 			sortSites( data.sites );
 			populatePhpVersion( data.sites );
-			console.log( `Loaded user data from ${ sanitizeUserpath( filePath ) }` );
+			// console.log( `Loaded user data from ${ sanitizeUserpath( filePath ) }` );
 			return data;
 		} catch ( err ) {
 			// Awkward double try-catch needed to have access to the file contents
@@ -83,22 +84,68 @@ export async function loadUserData(): Promise< UserData > {
 	}
 }
 
+let queueWriteOps: { filePath: string; asString: string; format: BufferEncoding }[] = [];
+let timeoutId: NodeJS.Timeout;
+
+const write = async ( filePath: string, asString: string, format: BufferEncoding ) => {
+	console.log( 'queueing write op ', queueWriteOps.length );
+	queueWriteOps.push( { filePath, asString, format } );
+	clearTimeout( timeoutId );
+	const totalOps = queueWriteOps.length;
+	timeoutId = setTimeout( () => {
+		console.log( 'flushing writing ops' );
+		queueWriteOps.forEach( ( op, index ) => {
+			setTimeout( async () => {
+				console.log( 'writing file', new Date() );
+				atomically.writeFile( op.filePath, op.asString, 'utf-8' );
+			}, totalOps - index );
+		} );
+		// Clean up queue
+		queueWriteOps = [];
+	}, 1000 );
+};
+
+( function testConcurrentWrites() {
+	console.log( '=== Testing Concurrent Write Ops ===' );
+	const filePath = getUserDataFilePath();
+	const generateData = () => ( {
+		a: Math.random() * 10000,
+		b: Math.random() * 10000,
+		c: Math.random() * 10000,
+		d: Math.random() * 10000,
+		e: Math.random() * 10000,
+	} );
+	loadUserData().then( ( data ) => {
+		// write( filePath, JSON.stringify( data, null, 2 ) + '\n', 'utf-8' );
+		write( filePath, JSON.stringify( { ...data, test: generateData() }, null, 2 ) + '\n', 'utf-8' );
+		write(
+			filePath,
+			JSON.stringify(
+				{ ...data, test: [ generateData(), generateData(), generateData(), generateData() ] },
+				null,
+				2
+			) + '\n',
+			'utf-8'
+		);
+	} );
+} )();
+
 export async function saveUserData( data: UserData ): Promise< void > {
 	const filePath = getUserDataFilePath();
 
 	const asString = JSON.stringify( toDiskFormat( data ), null, 2 ) + '\n';
-	try {
-		await atomically.writeFile( filePath, asString, 'utf-8' );
-	} catch ( error ) {
-		// Fall back to FS function in case the writing fails with EXDEV error.
-		// This issue might happen on Windows when renaming a file.
-		// Reference: https://github.com/sindresorhus/electron-store/issues/106
-		// eslint-disable-next-line @typescript-eslint/no-explicit-any
-		if ( ( error as any )?.code === 'EXDEV' ) {
-			await fs.promises.writeFile( filePath, asString, 'utf-8' );
-		}
-	}
-	console.log( `Saved user data to ${ sanitizeUserpath( filePath ) }` );
+	// try {
+	// 	await atomically.writeFile( filePath, asString, 'utf-8' );
+	// } catch ( error ) {
+	// 	// Fall back to FS function in case the writing fails with EXDEV error.
+	// 	// This issue might happen on Windows when renaming a file.
+	// 	// Reference: https://github.com/sindresorhus/electron-store/issues/106
+	// 	// eslint-disable-next-line @typescript-eslint/no-explicit-any
+	// 	if ( ( error as any )?.code === 'EXDEV' ) {
+	// 		await fs.promises.writeFile( filePath, asString, 'utf-8' );
+	// 	}
+	// }
+	// console.log( `Saved user data to ${ sanitizeUserpath( filePath ) }` );
 }
 
 function toDiskFormat( { sites, ...rest }: UserData ): PersistedUserData {
  1. Check out this PR.
  2. Run the command npm start.
  3. Check the app's logs and wait until the entries writing file are shown.
  4. Check if the content of the appdata-v1.json file is corrupted. The common error is that the JSON structure is broken.
  5. If the data is not corrupted, type rs and press enter in the terminal to reload the app. Repeat steps 4, 5 and 6 multiple times to ensure that content is not corrupted.
  6. Observe that the content of the appdata-v1.json file doesn't get corrupted.

Note

The property test will be appended to the user data file. It shouldn't impact the app, but it's recommended to remove it from the file after finishing testing.

Pre-merge Checklist

  • Have you checked for TypeScript, React or other console errors?

@fluiddot fluiddot self-assigned this Jul 17, 2024
@fluiddot fluiddot requested a review from a team July 17, 2024 16:25
Comment on lines +90 to +100
try {
await atomically.writeFile( filePath, asString, 'utf-8' );
} catch ( error ) {
// Fall back to FS function in case the writing fails with EXDEV error.
// This issue might happen on Windows when renaming a file.
// Reference: https://github.com/sindresorhus/electron-store/issues/106
// eslint-disable-next-line @typescript-eslint/no-explicit-any
if ( ( error as any )?.code === 'EXDEV' ) {
await fs.promises.writeFile( filePath, asString, 'utf-8' );
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fluiddot
Copy link
Contributor Author

@wojtekn I've just noticed you added the label Not Ready for Merge, not sure if it was related to unit tests not passing. If so, I fixed them in the latest commits.

@wojtekn
Copy link
Contributor

wojtekn commented Jul 18, 2024

@fluiddot unrelated. I added that to ensure we won't merge this significant change until Monday's release so we can have more time to test it before the next release.

@fluiddot fluiddot changed the title Avoid user data corruption due to writing concurrently in the file Avoid data corruption due to writing concurrently in the user data file Jul 18, 2024
@fluiddot
Copy link
Contributor Author

Removing [Status] Not Ready for Merge] label now that version 1.0.6 has been created.

Copy link
Member

@sejas sejas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that adding the test functions doesn't corrupt the appdata-v1.json file.

Screenshot 2024-07-29 at 13 29 14

After removing the temporal code, I've also tested creating, removing sites and creating demo sites. The app works as expected.

// Reference: https://github.com/sindresorhus/electron-store/issues/106
// eslint-disable-next-line @typescript-eslint/no-explicit-any
if ( ( error as any )?.code === 'EXDEV' ) {
await fs.promises.writeFile( filePath, asString, 'utf-8' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should move this await fallback to a separate try/catch. If the regular writeFile fails it will crash the app. I guess that will be captured by Sentry anyway, but maybe we can fail gracefully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the failure should be handled by the caller. For instance, if there's a write error when creating a site, we should let the IPC handler catch the error and fail gracefully. WDYT?

@fluiddot fluiddot merged commit d10773f into trunk Jul 30, 2024
12 checks passed
@fluiddot fluiddot deleted the fix/write-user-data-atomically branch July 30, 2024 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants