feat(server): generate all thumbnails for an asset in one job #13012

mertalev · 2024-09-28T06:28:01Z

Description

This PR changes image thumbnail generation to first decode and resize the input image to a preview-sized raw buffer, then reuse that buffer as the input for the preview, thumbnail and thumbhash. This has several advantages:

As reading from disk and decoding the image are the slowest steps of thumbnail generation, this makes generating thumbnails significantly faster
It is more cache-efficient due to handling everything to do with a file at once instead of spreading this out into jobs that run far apart from one another
It is easier to understand thumbnail generation progress when there is a 1:1 relationship between jobs and assets. It's currently a common source of confusion that the number of jobs keeps growing upward, and it's difficult to understand how much progress has been made
Other steps during thumbnail generation only need to run once, such as extracting embedded images or running FFprobe

A concern I had was around RAM usage. Remarkably, this approach uses less RAM than the current one despite the use of an in-memory buffer. Peak memory usage dropped from 2.47GiB to 1.22GiB during testing, and the average is also lower compared to preview generation.

Video thumbnails are not optimized in this PR to keep it smaller and more focused. A later PR will use a similar approach through FFmpeg. In the meantime, I refactored the code such that it will be easy to make this change. I intentionally made it generate video thumbnails sequentially instead of using Promise.all to avoid excessive resource usage.

Testing

I ran thumbnail generation on all assets and confirmed that jobs were processed successfully. Setting different image settings and enabling the "extract embedded preview" setting all functioned as expected. I additionally ran an image-only comparison on all assets:

In order from left to right:

Thumbnail generation start on main
Thumbnail generation previews completed on main (starting thumbnails)
Thumbnail generation finished on main
Thumbnail generation started in PR
Thumbnail generation finished in PR

On main, generating the previews took 43 minutes and completing all jobs took 76 minutes , while this PR completed all thumbnail generation tasks in 44 minutes. For this particular library with default settings, this is a ~73% speedup for images. The gap will widen with more RAW or high megapixel images and when the "extract embedded preview" setting is enabled.

cleanup add success logs, rename method do thumbhash too fixes fix tests handle `notify` wip refactor refactor

jrasm91

Looks great!

jrasm91 · 2024-09-28T12:02:43Z

server/src/repositories/media.repository.ts

-    if (options.crop) {
-      pipeline.extract(options.crop);
-    }
+  async decodeImage(input: string, options: DecodeToBufferOptions) {


Unnecessary async, right?

jrasm91 · 2024-09-28T12:09:00Z

server/src/services/media.service.spec.ts

+      'should generate a %s %s for an image when specified',
+      async ([format, type]) => {
+        systemMock.get.mockResolvedValue({ image: { [type]: { format: format as ImageFormat } } });
+        assetMock.getById.mockResolvedValue(assetStub.image);
+        const thumbhashBuffer = Buffer.from('a thumbhash', 'utf8');
+        mediaMock.generateThumbhash.mockResolvedValue(thumbhashBuffer);

-    it('should skip invisible assets', async () => {
-      assetMock.getByIds.mockResolvedValue([assetStub.livePhotoMotionAsset]);
+        const previewFormat = type === AssetFileType.PREVIEW ? format : ImageFormat.JPEG;
+        const thumbnailFormat = type === AssetFileType.THUMBNAIL ? format : ImageFormat.WEBP;
+        const previewPath = `upload/thumbs/user-id/as/se/asset-id-preview.${previewFormat}`;
+        const thumbnailPath = `upload/thumbs/user-id/as/se/asset-id-thumbnail.${thumbnailFormat}`;



This test has little value imo. Looking at it, it is not clear what it is doing. It takes reading and understanding the test code and checking the different branching going on here. Tests should not have branching and should be very simple. I would prefer 4 individual, trivial tests that are easy to read.

jrasm91 · 2024-09-28T12:12:04Z

server/src/services/media.service.ts

-    await this.assetRepository.upsertFile({ assetId: asset.id, type: AssetFileType.THUMBNAIL, path: thumbnailPath });
-    await this.assetRepository.update({ id: asset.id, updatedAt: new Date() });
-    await this.assetRepository.upsertJobStatus({ assetId: asset.id, thumbnailAt: new Date() });
+    await this.assetRepository.upsertJobStatus({ assetId: asset.id, previewAt: new Date(), thumbnailAt: new Date() });


In the future should we combine these into one column?

wip

f860629

cleanup add success logs, rename method do thumbhash too fixes fix tests handle `notify` wip refactor refactor

mertalev requested a review from danieldietzler as a code owner September 28, 2024 06:28

mertalev added the changelog:enhancement label Sep 28, 2024

github-actions bot added the 🗄️server label Sep 28, 2024

update tests

ad1b572

mertalev force-pushed the perf/server-preview-thumbnail-together branch from a233104 to ad1b572 Compare September 28, 2024 07:28

update sql

09fb41a

jrasm91 approved these changes Sep 28, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(server): generate all thumbnails for an asset in one job #13012

feat(server): generate all thumbnails for an asset in one job #13012

mertalev commented Sep 28, 2024

jrasm91 left a comment

jrasm91 Sep 28, 2024

jrasm91 Sep 28, 2024

jrasm91 Sep 28, 2024

feat(server): generate all thumbnails for an asset in one job #13012

Are you sure you want to change the base?

feat(server): generate all thumbnails for an asset in one job #13012

Conversation

mertalev commented Sep 28, 2024

Description

Testing

jrasm91 left a comment

Choose a reason for hiding this comment

jrasm91 Sep 28, 2024

Choose a reason for hiding this comment

jrasm91 Sep 28, 2024

Choose a reason for hiding this comment

jrasm91 Sep 28, 2024

Choose a reason for hiding this comment