New media processing pipeline #680

Draft
wants to merge 53 commits into
base: stable
53 commits
bfb875b
Detect media files info using ImageMagick and Ffmpeg
davidmz Dec 17, 2024
ab673f9
Install ImageMagick and Ffmpeg in "checks" workflow machine
davidmz Dec 17, 2024
df2d082
Made filehandle.read call compatible with Node18
davidmz Dec 17, 2024
a0adead
Add files of type "general"
davidmz Dec 17, 2024
95304f2
Use ImageMagick instead of GraphicsMagick
davidmz Dec 23, 2024
d00a7d3
Detect/suggest the original file extension
davidmz Dec 24, 2024
3369e00
Detect rotation for images and video
davidmz Dec 25, 2024
0003efa
Add an algorithm to calculate the size of the preview images
davidmz Dec 26, 2024
05e1c6f
Fix rotated fixture file
davidmz Dec 29, 2024
0b11578
Add types for 'mime-types' package
davidmz Dec 29, 2024
2d19cf8
Use new Attachment fabric, update some tests
davidmz Dec 30, 2024
5723aea
Check the original JPEG for use as a largest preview
davidmz Dec 31, 2024
9bcc83a
Present old 'image_sizes' data in a new format
davidmz Dec 31, 2024
14fa3e4
Update the orientation tests
davidmz Jan 1, 2025
effcafa
Use Attachment.create in user deletion test
davidmz Jan 1, 2025
c0f6924
Refactor the S3 emulation in tests
davidmz Jan 2, 2025
75b92b4
Use Attachment.create everywhere
davidmz Jan 2, 2025
c78578b
Update geometry tests
davidmz Jan 2, 2025
ece30c4
Refactor some path-related work
davidmz Jan 2, 2025
641b0fc
Emit the files list for 'general' file type
davidmz Jan 2, 2025
a914e17
Fix Blob construction
davidmz Jan 2, 2025
ce19809
Remove obsolete test of WebP attachment
davidmz Jan 2, 2025
4bd2ae4
Rewrite the legacy attachment serializer
davidmz Jan 3, 2025
9b65853
Update the createMockAttachmentAsync method
davidmz Jan 3, 2025
0da0f8b
Update OpenGraph image handling
davidmz Jan 3, 2025
858f971
Use the maxSizedVariant helper
davidmz Jan 3, 2025
8be54ab
Check the required attachment subdirectories on server start
davidmz Jan 3, 2025
04a3f87
Remove unused methods of Attachment
davidmz Jan 3, 2025
185353b
Use ImageMagick CLI instead of 'gm' package
davidmz Jan 3, 2025
a52326b
Remove graphicsmagick from the "checks" image
davidmz Jan 3, 2025
f3813e9
Add 'meta' column to the attachments table
davidmz Jan 3, 2025
6fa36ba
Handle multi-frame images properly
davidmz Jan 4, 2025
9041ca0
Extract additional info from AVC files
davidmz Jan 7, 2025
07ebb2b
Add geometry calculations for video
davidmz Jan 8, 2025
480427f
Create previews subdirectories in runtime
davidmz Jan 8, 2025
88d768b
Change the spawnAsync args type, allow to use array of arrays
davidmz Jan 9, 2025
5d6975c
Use smaller video fixtures
davidmz Jan 10, 2025
6427db1
Add video processing (synchronous, for now)
davidmz Jan 10, 2025
04e9b92
Update dockerfile instructions
davidmz Jan 10, 2025
103c9af
Add "width", "height" and "duration" fields to the attachments table
davidmz Jan 10, 2025
c89b41d
Serialize animated images
davidmz Jan 10, 2025
ad99156
Better detect ffprobe errors
davidmz Jan 11, 2025
0283e92
Refactor some types
davidmz Jan 11, 2025
10a5fc5
Early exit on error in ffmpeg
davidmz Jan 11, 2025
cd756fa
Add 'silent' field to attachment metadata
davidmz Jan 11, 2025
0cff487
Allow to limit the number of simultaneous executions for some job types
davidmz Nov 28, 2024
69e40dd
Process video files using deferred job
davidmz Jan 12, 2025
b87ee73
Add image/avif to the inline mime types
davidmz Jan 13, 2025
f17eb2d
Add tests for the file extensions and Content-Disposition's
davidmz Jan 13, 2025
a83133a
Add realtime notification on attachment update
davidmz Jan 14, 2025
41afd8d
Add new (v4) attachment serializer and 'GET /attachments/:attId' method
davidmz Jan 17, 2025
dccca74
Don't check every attachments subdirs on start
davidmz Jan 17, 2025
c0e18bb
Update changelog and api_versions file
davidmz Jan 17, 2025
8 changes: 3 additions & 5 deletions .github/workflows/checks.yml
@@ -29,10 +29,11 @@ jobs:
  with:
  redis-version: ${{ matrix.redis-version }}

- - name: install GraphicsMagick
+ - name: Install media tools
  run: |
  sudo apt-get update
- sudo apt-get install graphicsmagick
+ sudo apt-get install imagemagick
+ sudo apt-get install ffmpeg

  - uses: actions/checkout@v3

@@ -56,9 +57,6 @@ jobs:
  - name: create directories for attachments
  run: |
  mkdir -p /tmp/pepyatka-media/attachments
- mkdir /tmp/pepyatka-media/attachments/thumbnails
- mkdir /tmp/pepyatka-media/attachments/thumbnails2
- mkdir /tmp/pepyatka-media/attachments/anotherTestSize

  - name: Install dependencies
  run: yarn
34 changes: 34 additions & 0 deletions API_VERSIONS.md
@@ -5,6 +5,40 @@ All backward-incompatible FreeFeed API changes will be documented in this file.
See the [About API versions](#about-api-versions) section in the end of this
file for the general versioning information.

## [4] - 2025-02-01
### Changed
- The attachment serialization has changed. The new format contains the
  following fields:
  - _id_ (string) - the UUID of the attachment
  - _mediaType_ (string) - the media type of the attachment, one of 'image',
    'video', 'audio', 'general'
  - _fileName_ (string) - the original filename of the attachment
  - _fileSize_ (number) - the size of the attachment's original file in bytes
  - _previewTypes_ (array of strings) - the available preview types of the
    attachment; can be empty or contain the following values: 'image',
    'video', 'audio'
  - _meta_ (object) - optional field with temporary or non-essential media
    metadata (all fields are optional):
    - _dc:title_: the audio/video title
    - _dc:creator_: the audio/video author name
    - _animatedImage_: true if the video was created from an animated image
    - _silent_: true if the video has no audio track
    - _inProgress_: true if the media file is currently being processed
  - _width_ and _height_ (number) - the size of the original image/video file
    in pixels, present only for 'image' and 'video' attachments, and only when
    the processing is done
  - _duration_ (number) - the duration of the audio/video file in seconds,
    present only for 'audio' and 'video' attachments, and only when the
    processing is done
  - _previewWidth_ and _previewHeight_ (number) - the size of the largest
    available image/video preview in pixels, present only when it differs from
    _width_ and _height_
  - _postId_ (string|null) - the UUID of the post to which the attachment is
    attached
  - _createdBy_ (string) - the UUID of the user who uploaded the attachment
  - _createdAt_ (string) - the ISO 8601 datetime when the attachment was created
  - _updatedAt_ (string) - the ISO 8601 datetime when the attachment was updated
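
As a reading aid, the field list above can be written down as a TypeScript type. This is only a sketch derived from that list: the PR's actual `SerializedAttachmentV4` type lives in app/serializers/v2/attachment.ts, which is not shown here, so the names `AttachmentV4Sketch` and `largestPreviewSize` are hypothetical and the real type may differ.

```typescript
// Sketch only: field names follow the API_VERSIONS.md list, not the real source type.
type MediaType = 'image' | 'video' | 'audio' | 'general';
type PreviewType = 'image' | 'video' | 'audio';

interface AttachmentV4Sketch {
  id: string;
  mediaType: MediaType;
  fileName: string;
  fileSize: number;
  previewTypes: PreviewType[];
  meta?: {
    'dc:title'?: string;
    'dc:creator'?: string;
    animatedImage?: boolean;
    silent?: boolean;
    inProgress?: boolean;
  };
  // Present only for visual files, and only once processing is done
  width?: number;
  height?: number;
  // Present only for playable files, and only once processing is done
  duration?: number;
  // Present only when different from width/height
  previewWidth?: number;
  previewHeight?: number;
  postId: string | null;
  createdBy: string;
  createdAt: string;
  updatedAt: string;
}

// Hypothetical helper: the size of the largest preview, falling back to the
// original size when previewWidth/previewHeight are omitted.
function largestPreviewSize(
  att: AttachmentV4Sketch,
): { width: number; height: number } | null {
  if (att.width === undefined || att.height === undefined) {
    return null; // non-visual file, or still being processed
  }

  return {
    width: att.previewWidth ?? att.width,
    height: att.previewHeight ?? att.height,
  };
}
```

A client would typically check `meta?.inProgress` first and treat a `null` result as "no dimensions known yet".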

## [3] - 2024-06-21

### Changed
49 changes: 49 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,55 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.23.0] - Not released
### Changed
- The media file (attachment) handling algorithm has been changed. There are
  now four media types: 'image', 'video', 'audio' and 'general'. Images are
  accepted in JPEG, PNG, WebP, GIF, HEIC/HEIF and AVIF formats. We also now
  accept and process arbitrary video and audio formats (detected with
  ffmpeg).

  For visual files (images and videos), multiple preview sizes are created in
  addition to the legacy 'thumbnail' and 'thumbnail2'.

  Animated GIF images are now treated as video files, and video previews are
  created for them.

  We don't keep the originals of true video files (those not created from
  animated images). After the previews are created, the largest preview is
  kept as the 'original'.

  Some media files (only true video files for now) are processed
  asynchronously. Right after they are uploaded to the server, an asynchronous
  job is scheduled, and after the job finishes, the 'attachment:update'
  realtime event is sent to the 'user:{ownerId}', 'attachment:{attachmentId}'
  and 'post:{postId}' (if the file is attached to a post) channels.
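
To make the channel rule above concrete, here is a minimal sketch of how the target channel names could be computed. The function name `attachmentUpdateChannels` is hypothetical and not taken from the PR; only the channel name patterns come from the changelog text.

```typescript
// Sketch only: builds the channel names listed in the changelog entry.
function attachmentUpdateChannels(
  ownerId: string,
  attachmentId: string,
  postId: string | null,
): string[] {
  const channels = [`user:${ownerId}`, `attachment:${attachmentId}`];

  // The post channel is used only when the file is attached to a post
  if (postId !== null) {
    channels.push(`post:${postId}`);
  }

  return channels;
}
```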

  The `attachments` table now has a few new columns:
  - `width` and `height`: size of the original image or video file in pixels
    (null for non-visual files)
  - `duration`: duration of the video or audio in seconds (null for
    non-playable files)
  - `previews`: JSON object with preview types and sizes, see the
    _MediaPreviews_ type in the
    [app/support/media-files/types.ts](app/support/media-files/types.ts) file.
  - `meta`: JSON object with temporary or non-essential media metadata. It can
    contain the audio/video title and author name (in the 'dc:title' and
    'dc:creator' fields, respectively) and some special flags:
    - `animatedImage`: true if the video was created from an animated image
    - `silent`: true if the video has no audio track
    - `inProgress`: true if the media file is currently being processed
### Added
- A new API version, V4, is introduced to support the new attachment features.
  See the new serialized attachment type `SerializedAttachmentV4` in the
  [app/serializers/v2/attachment.ts](app/serializers/v2/attachment.ts) file.
- The new `GET /vN/attachments/:attId` API endpoint returns the attachment by
  its ID.
- The number of simultaneous executions can now be limited for some job types.

  The JobManager now has a `limitedJobs` parameter of type `Record<string,
  number>` that defines the maximum number of simultaneous executions for jobs
  of each listed type (name). Jobs that are not listed in `limitedJobs` are
  executed without limits.
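
The `limitedJobs` idea can be illustrated with a small in-memory counter. This is a sketch under assumptions, not the JobManager's implementation (which is not part of the visible diff): the `JobGate` class and its `tryAcquire`/`release` methods are hypothetical names.

```typescript
// Sketch only: a per-job-name concurrency gate driven by a limits record.
class JobGate {
  private readonly limits: Record<string, number>;
  private readonly running = new Map<string, number>();

  constructor(limits: Record<string, number>) {
    this.limits = limits;
  }

  // Returns true and counts the job as running if its limit is not reached.
  // Job names missing from the limits record are never refused.
  tryAcquire(name: string): boolean {
    const limit = name in this.limits ? this.limits[name] : undefined;
    const current = this.running.get(name) ?? 0;

    if (limit !== undefined && current >= limit) {
      return false;
    }

    this.running.set(name, current + 1);
    return true;
  }

  // Marks one execution of the given job name as finished.
  release(name: string): void {
    const current = this.running.get(name) ?? 0;
    this.running.set(name, Math.max(0, current - 1));
  }
}
```

With `new JobGate({ ATTACHMENT_PREPARE_VIDEO: 1 })`, a second video job is refused until the first one calls `release`, while any unlisted job name is always accepted.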

## [2.22.5] - 2025-01-05
### Added
6 changes: 3 additions & 3 deletions Dockerfile
@@ -2,7 +2,8 @@ FROM node:18-bookworm

  RUN apt-get update && \
  apt-get install -y \
- graphicsmagick \
+ imagemagick \
+ ffmpeg \
  g++ \
  git \
  make
@@ -12,8 +13,7 @@ WORKDIR /server

  RUN rm -rf node_modules && \
  rm -f log/*.log && \
- mkdir -p ./public/files/attachments/thumbnails && \
- mkdir -p ./public/files/attachments/thumbnails2 && \
+ mkdir -p ./public/files/attachments && \
  yarn install

  ENV NODE_ENV production
6 changes: 5 additions & 1 deletion README.md
@@ -48,7 +48,11 @@ mkdir ./public/files/attachments/thumbnails/ && mkdir ./public/files/attachments
  ```
  mkdir -p /tmp/pepyatka-media/attachments/thumbnails
  mkdir -p /tmp/pepyatka-media/attachments/thumbnails2
- mkdir -p /tmp/pepyatka-media/attachments/anotherTestSize
+ mkdir -p /tmp/pepyatka-media/attachments/p1
+ mkdir -p /tmp/pepyatka-media/attachments/p2
+ mkdir -p /tmp/pepyatka-media/attachments/p3
+ mkdir -p /tmp/pepyatka-media/attachments/p4
+ mkdir -p /tmp/pepyatka-media/attachments/a1
  ```

  3. Create config `config/local.json` with some random secret string: `{ "secret": "myverysecretstring" }`.
3 changes: 2 additions & 1 deletion app/api-versions.ts
@@ -1,5 +1,6 @@
  export const API_VERSION_2 = 2;
  export const API_VERSION_3 = 3;
+ export const API_VERSION_4 = 4;

- export const API_VERSION_ACTUAL = API_VERSION_3;
+ export const API_VERSION_ACTUAL = API_VERSION_4;
  export const API_VERSION_MINIMAL = API_VERSION_2;
43 changes: 32 additions & 11 deletions app/controllers/api/v1/AttachmentsController.js
@@ -2,11 +2,16 @@ import createDebug from 'debug';
  import compose from 'koa-compose';
  import { isInt } from 'validator';

- import { reportError, BadRequestException, ValidationException } from '../../../support/exceptions';
- import { serializeAttachment } from '../../../serializers/v2/post';
+ import {
+ reportError,
+ BadRequestException,
+ ValidationException,
+ NotFoundException,
+ } from '../../../support/exceptions';
+ import { serializeAttachment } from '../../../serializers/v2/attachment';
  import { serializeUsersByIds } from '../../../serializers/v2/user';
  import { authRequired } from '../../middlewares';
- import { dbAdapter } from '../../../models';
+ import { dbAdapter, Attachment } from '../../../models';
  import { startAttachmentsSanitizeJob } from '../../../jobs/attachments-sanitize';

  export default class AttachmentsController {
@@ -23,20 +28,17 @@ export default class AttachmentsController {
  async (ctx) => {
  // Accept one file-type field with any name
  const [file] = Object.values(ctx.request.files || []);
- const { user } = ctx.state;
+ const { user, apiVersion } = ctx.state;

  if (!file) {
  throw new BadRequestException('No file provided');
  }

  try {
- const newAttachment = await user.newAttachment({
- file: { ...file, path: file.filepath, name: file.originalFilename },
- });
- await newAttachment.create();
+ const newAttachment = await Attachment.create(file.filepath, file.originalFilename, user);

  ctx.body = {
- attachments: serializeAttachment(newAttachment),
+ attachments: serializeAttachment(newAttachment, apiVersion),
  users: await serializeUsersByIds([newAttachment.userId], user.id),
  };
  } catch (e) {
@@ -64,7 +66,7 @@ export default class AttachmentsController {
  my = compose([
  authRequired(),
  async (ctx) => {
- const { user } = ctx.state;
+ const { user, apiVersion } = ctx.state;
  const { limit: qLimit, page: qPage } = ctx.request.query;

  const DEFAULT_LIMIT = 30;
@@ -106,7 +108,7 @@ export default class AttachmentsController {
  }

  ctx.body = {
- attachments: attachments.map(serializeAttachment),
+ attachments: attachments.map((a) => serializeAttachment(a, apiVersion)),
  users: await serializeUsersByIds([user.id], user.id),
  hasMore,
  };
@@ -138,4 +140,23 @@ export default class AttachmentsController {
  };
  },
  ]);
+
+ async getById(ctx) {
+ const { attId } = ctx.params;
+ const { user, apiVersion } = ctx.state;
+
+ const attachment = await dbAdapter.getAttachmentById(attId);
+
+ if (!attachment) {
+ throw new NotFoundException('Attachment not found');
+ }
+
+ const serAttachment = serializeAttachment(attachment, apiVersion);
+ const users = await serializeUsersByIds([attachment.userId], user?.id);
+
+ ctx.body = {
+ attachments: serAttachment,
+ users,
+ };
+ }
  }
6 changes: 2 additions & 4 deletions app/controllers/api/v1/BookmarkletController.js
@@ -1,6 +1,6 @@
  import compose from 'koa-compose';

- import { Post, Comment, AppTokenV1 } from '../../../models';
+ import { Post, Comment, AppTokenV1, Attachment } from '../../../models';
  import { ForbiddenException } from '../../../support/exceptions';
  import { authRequired, monitored, inputSchemaRequired } from '../../middlewares';
  import { show as showPost } from '../v2/PostsController';
@@ -79,9 +79,7 @@ async function createAttachment(author, imageURL) {
  throw new Error(`Unsupported content type: '${file.type}'`);
  }

- const newAttachment = author.newAttachment({ file });
- await newAttachment.create();
-
+ const newAttachment = await Attachment.create(file.path, file.name, author);
  return newAttachment.id;
  } catch (e) {
  await file.unlink();
32 changes: 17 additions & 15 deletions app/controllers/api/v2/PostsController.js
@@ -64,26 +64,28 @@ export const opengraph = compose([

  if (attachments.length > 0) {
  for (const item of attachments) {
- if (item.mediaType === 'image') {
- let image_size;
-
+ if (item.previews.image) {
  // Image fallback: thumbnail 2 (t2) => thumbnail (t) => original (o) => none
  // Posts created in older versions of FreeFeed had only one thumbnail (t)
- if (`t2` in item.imageSizes) {
- image_size = `t2`; // Use high-res thumbnail
- image = item.imageSizes[image_size].url;
- } else if (`t` in item.imageSizes) {
- image_size = `t`; // Use thumbnail
- image = item.thumbnailUrl;
- } else if (`o` in item.imageSizes) {
- image_size = `o`; // Use original image if there are no thumbnails present
- image = item.url;
+ let variant = null;
+
+ if ('thumbnails2' in item.previews.image) {
+ variant = 'thumbnails2';
+ } else if ('thumbnails' in item.previews.image) {
+ variant = 'thumbnails';
  } else {
- break;
+ // Looking for maximum size
+ variant = item.maxSizedVariant('image');
  }
+
+ if (!variant) {
+ continue;
+ }

- image_h = item.imageSizes[image_size].h;
- image_w = item.imageSizes[image_size].w;
+ const p = item.previews.image[variant];
+ image = item.getFileUrl(variant);
+ image_h = p.h;
+ image_w = p.w;
  break;
  }
  }
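
The new fallback order in this hunk (thumbnails2, then thumbnails, then the largest available variant) can be sketched as a standalone function. The body of `maxSizedVariant` is not shown in the diff, so picking the variant with the greatest width below is an assumption; `ImagePreviews` and `pickOpenGraphVariant` are hypothetical names.

```typescript
// Sketch only: mirrors the variant selection order of the opengraph handler.
type PreviewInfo = { w: number; h: number };
type ImagePreviews = Record<string, PreviewInfo>;

function pickOpenGraphVariant(previews: ImagePreviews): string | null {
  // Prefer the legacy thumbnails, high-resolution one first
  if ('thumbnails2' in previews) {
    return 'thumbnails2';
  }

  if ('thumbnails' in previews) {
    return 'thumbnails';
  }

  // Assumed stand-in for maxSizedVariant: the variant with the largest width
  let best: string | null = null;
  let bestWidth = -1;

  for (const [name, { w }] of Object.entries(previews)) {
    if (w > bestWidth) {
      best = name;
      bestWidth = w;
    }
  }

  return best; // null when there are no image previews at all
}
```

Returning `null` corresponds to the `continue` branch in the handler, which moves on to the next attachment.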
4 changes: 3 additions & 1 deletion app/freefeed-app.ts
@@ -97,7 +97,9 @@ class FreefeedApp extends Application<DefaultState, AppContext> {
  this.use(responseTime());
  this.use(koaServerTiming());

- this.use(koaStatic(`${__dirname}/../${config.attachments.storage.rootDir}`));
+ if (config.attachments.storage.type === 'fs') {
+ this.use(koaStatic(`${__dirname}/../${config.attachments.storage.rootDir}`));
+ }

  this.use(maintenanceCheck);

60 changes: 60 additions & 0 deletions app/jobs/attachment-prepare-video.ts
@@ -0,0 +1,60 @@
import { setInterval } from 'timers/promises';

import createDebug from 'debug';

import { dbAdapter, Job, JobManager } from '../models';
import { UUID } from '../support/types';

type Payload = { filePath: string; attId: UUID };

const debug = createDebug('freefeed:model:attachment');

export const ATTACHMENT_PREPARE_VIDEO = 'ATTACHMENT_PREPARE_VIDEO';

export async function createPrepareVideoJob(payload: Payload): Promise<void> {
await Job.create(ATTACHMENT_PREPARE_VIDEO, payload, { uniqKey: payload.attId });
}

const refreshInterval = 60; // sec

export function initHandlers(jobManager: JobManager) {
// Allow only one job at a time
jobManager.limitedJobs[ATTACHMENT_PREPARE_VIDEO] = 1;

jobManager.on(ATTACHMENT_PREPARE_VIDEO, async (job: Job<Payload>) => {
const { filePath, attId } = job.payload;
const att = await dbAdapter.getAttachmentById(attId);

if (!att) {
debug(`${ATTACHMENT_PREPARE_VIDEO}: the attachment ${attId} does not exist`);
return;
}

if (!att.meta.inProgress) {
debug(`${ATTACHMENT_PREPARE_VIDEO}: the attachment ${attId} is already processed`);
return;
}

const abortController = new AbortController();

try {
await Promise.race([
att.finalizeCreation(filePath),
// The _finalizeCreation_ can take a long time, so keep the job locked
// and re-lock it every _refreshInterval_
keepJobLocked(job, refreshInterval, abortController.signal),
]);
} finally {
abortController.abort(); // Stop the refresh timer
}
});
}

async function keepJobLocked(job: Job, interval: number, abortSignal: AbortSignal): Promise<void> {
// _interval_ is in seconds; lock for 1.5 intervals so the lock outlives each refresh
await job.setUnlockAt(interval * 1.5);

// setInterval from 'timers/promises' takes its delay in milliseconds
// eslint-disable-next-line @typescript-eslint/no-unused-vars
for await (const _ of setInterval(interval * 1000, null, { signal: abortSignal })) {
await job.setUnlockAt(interval * 1.5);
}
}
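
The "race the work against a lock-refresh loop" pattern used by this job handler can be tried in isolation. The sketch below is not the PR's code: `Lockable`, `keepLocked` and `runWhileLocked` are hypothetical names, and the job is reduced to a single `setUnlockAt(seconds)` method, which is assumed to take seconds as in the handler above.

```typescript
import { setInterval } from 'timers/promises';

// Hypothetical stand-in for the part of the Job API used here
interface Lockable {
  setUnlockAt(seconds: number): Promise<void>;
}

// Re-locks the job every intervalSec seconds until the signal is aborted.
// It never resolves on its own; it rejects with an AbortError on abort.
async function keepLocked(
  job: Lockable,
  intervalSec: number,
  signal: AbortSignal,
): Promise<void> {
  await job.setUnlockAt(intervalSec * 1.5);

  // The promise-based setInterval takes its delay in milliseconds
  for await (const _ of setInterval(intervalSec * 1000, null, { signal })) {
    await job.setUnlockAt(intervalSec * 1.5);
  }
}

// Runs the work while keeping the job locked, then stops the refresh timer.
async function runWhileLocked<T>(
  job: Lockable,
  intervalSec: number,
  work: () => Promise<T>,
): Promise<T> {
  const abortController = new AbortController();

  try {
    const result = await Promise.race([
      work(),
      keepLocked(job, intervalSec, abortController.signal),
    ]);
    return result as T; // keepLocked never wins the race with a value
  } finally {
    abortController.abort(); // stop the refresh loop in all cases
  }
}
```

Because `Promise.race` attaches handlers to both promises, the `AbortError` raised by the refresh loop after `abort()` does not surface as an unhandled rejection.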