Improve popup performance #1487

djahandarie · 2024-10-13T12:06:13Z

Quite a bit of popup CPU usage comes from the following things:

Constant relayouts due to (intentional) progressive loading of entries within the popup
Media loading

The first is resolved easily by introducing content-visibility: auto, at the downside of the scrollbar being less accurate but I think it's the right tradeoff.

The second is resolved by the rest of the PR, which is insane and probably impossible to understand.

Changes to media loading

Previous to this PR, media requests were sent one-by-one from the popup to the backend, which would generate a data URL that would be sent back to the popup and inserted in an image tag.

I considered another approach, which batches all the image requests for a given entry, prepares canvas elements, and sends OffscreenCanvases to the backend, which then fetches the unique set of images for that entry and draws the appropriate ones onto the appropriate canvases.

This helps quite a bit since we can avoid creating Blobs and Data URIs, and also avoid duplicate requests to IndexedDB. IndexedDB requests are very slow here.

However, Firefox can't do cross-process transfers of OffscreenCanvases, so I settled on an architecture of having a worker within the popup which does drawing onto the OffscreenCanvases, and the backend worker transfers ArrayBuffers to the popup worker for drawing (because Firefox can do cross-process transfers of those).

Changes to communications

Unfortunately, implementing the above was not so straightforward, as we primarily use chrome.runtime.sendMessage which does not allow for transferring objects and serializes everything, making the postMessage transfers impossible.

So, I introduced a couple new communication channels:

The popup now does a serviceWorker postMessage to the active SW (on Chrome), or establishes a MessageChannel using a temporary SharedWorker and postMessages over that (on Firefox)
(On chrome) The offscreen now registers a MessageChannel with the SW, so the SW can postMessage over that channel to the offscreen. The offscreen watches for SW changes and reregisters so the SW will always have a channel for it.

This builds a full postMessage channel all the way from the popup to the backend (offscreen in case of Chrome), which allows for building further MessageChannels between popup and various backend workers (dictionary worker in the case of this PR). And with postMessage comes the ability to transfer things, which is critical for performance.

Potential improvements

Right now the way dictionary-database builds its worker is a bit ugly / overspecific to drawMedia. In the future we should probably consider some more generic architecture, where we have pools of dictionary workers, which can handle whatever dictionary-related work we want, and nothing DB-related is done in the main backend thread

MarvNC · 2024-10-13T18:01:55Z

I tried it and it seems a little better with 96 dictionaries enabled.

MarvNC · 2024-10-25T17:52:29Z

Tried the latest changes and it feels like a dramatic improvement. Great work!

koiyakiya · 2024-10-25T18:59:13Z

when i try to build this for chrome and import a dictionary collection file, it completely breaks yomitan and spits out a ton of errors, animations are broken etc

not sure if I'm just doing something wrong or there's a genuine problem

4890A · 2024-10-26T05:34:26Z

Same issue with loading dict collections or importing single dicts on the PR. (No response/animations after selecting a file). Loading recommended dicts still works however. I was able to test the PR by loading my dict collection with the main dev branch first and then copying over the files from this PR. The performance boost is significant for me with 35 imported dictionaries.

djahandarie · 2024-10-26T06:14:02Z

Yeah there's almost certainly a problem, I forget that the importer also runs some parts of this DB code in a worker and I changed some of the DB code to behave differently based on whether it's executing in a worker or not and neglected to check the behavior of the import code. I'll take a look. Thanks for testing!

MarvNC · 2024-10-27T02:02:51Z

Some images seem to be broken (pixiv png and kanji png here)

djahandarie · 2024-10-27T02:06:38Z

Thanks, is that the entry for 矜持？

MarvNC · 2024-10-27T02:14:33Z

yep

MarvNC · 2024-10-27T02:15:51Z

actually all images are broken so maybe it's just on my end.

i just loaded this PR on top of my existing yomitan install.

4890A · 2024-10-27T02:24:07Z

Seems like images break after a certain point. After restarting my browser pictures work, scan a bit and they dissapear again.

djahandarie · 2024-10-27T02:26:20Z

Can you guys open the service worker, as well as offscreen inspector tabs from the extension details page, and let me know if there are errors in either of those consoles?

MarvNC · 2024-10-27T02:28:05Z

no console errors present in either.

djahandarie · 2024-10-27T02:33:58Z

How about immediately after restarting the extension and doing the first lookup? It must be breaking somewhere 🤔

4890A · 2024-10-27T02:35:09Z

Didn't see any console errors while it was broken, and now after restarting I can't reproduce the issue.

MarvNC · 2024-11-24T06:40:53Z

Just to leave a record, I've been using d3f1317 on Edge for the past few weeks and haven't had any issues. Although I haven't tried dictionary collection import which 4890 reported was broken.

…erization, because that needs to be on the main thread...)

…d more logging

* Safe changes * Fix some types * Fix some types * Remove api-map * Remove deps * remove wasm-unsafe-eval * Revert manifest variants * Add thing * Types * Revert db change

codspeed-hq · 2024-12-17T02:39:02Z

CodSpeed Performance Report

Merging #1487 will improve performances by 42.01%

_{Comparing performance-canvas (3622197) with master (b39bd51)}

Summary

⚡ 1 improvements
✅ 2 untouched benchmarks

Benchmarks breakdown

	Benchmark	`master`	`performance-canvas`	Change
⚡	`Translator.prototype.findKanji - (n=3)`	3.9 ms	2.8 ms	+42.01%

jamesmaa · 2024-12-17T07:17:31Z

Dumb questions about this PR because I haven't really looked at it until today

Does "media loading" refer to when dictionaries have images? It seems like the performance enhancements improve general performance, or is that an edge case?
Can you explain at a high level what the CSS changes are and how they're related to the performance work? I'm not understanding the connection here
Do you have a recommended order of files to read in order to better understand this PR?

jamesmaa

Btw you also have some leftover console.log and commented out code that needs cleaning up. Very cool stuff I'm seeing here and excited to try to get this out the door.

I'm not done with my review I'm just doing a hard cutoff for bed without being completely blocked on me

ext/js/dictionary/dictionary-database.js

jamesmaa · 2024-12-17T07:36:01Z

ext/js/dictionary/dictionary-database.js

        );
+
+        // when we are not a worker ourselves, create a worker which is basically just a wrapper around this class, which we can use to offload some functions to


Is "not a worker" equivalent to saying "main thread" or "main process"?

Unfortunately, not exactly. On Chrome, we expect this to be running in the offscreen context, and in Firefox we expect it to be in the background page context. In theory it could also run in the SW context in Chrome though we don't do that currently since SW is short-lived and we want to keep the DB open longer. These contexts are neither threads or processes, as threads and processes are implementation details of the browser. (e.g., in Chrome each extension runs in its own process, and I'm not sure if SW and offscreen are in different processes, while on Firefox all extensions run in a single process, but any extension resources which are loaded as a child of a non-extension resource seems to be in the normal webpage process).

So "not a worker" is the most technically accurate way to capture all the possible situations here.

I had to educate myself via ChatGPT. lmk if this explanation of the difference between offscreen, background, and sw context is wrong

https://chatgpt.com/share/67627dc8-294c-8002-a0a6-bb1344b88a74

ext/js/dictionary/dictionary-database.js

jamesmaa · 2024-12-17T07:40:49Z

ext/js/dictionary/dictionary-database.js

+        // performance.mark('drawMedia:findMultiBulk:end');
+        // performance.measure('drawMedia:findMultiBulk', 'drawMedia:findMultiBulk:start', 'drawMedia:findMultiBulk:end');
+
+        // move all svgs to front to have a hotter loop


What does "hotter loop" mean?

CPUs do stuff like branch prediction and caching, so if you can do the same sort of processing all at the same time instead of interleaving different types of processing, you can get some performance benefits since the same branch gets executed over and over, and all necessary information is in your CPU caches. It's basically getting the benefits of batching at the micro scale.

ext/js/dictionary/dictionary-database.js

jamesmaa · 2024-12-17T07:57:41Z

ext/js/comm/shared-worker-bridge.js

+ *
+ * # On application startup
+ *  application: create a new MessageChannel, bind event listeners to one of the ports, and send the other port to the bridge
+ *    ↓↓<"connectToBackend1" via SharedWorker.port.postMessage>↓↓


connectToBackend1 and connectToBackend2 seem like two different logic

connectToBackend1 is an API call that responds to a message from the application
connectToBackend2 is logic on the backend to store the port as state

Am I understanding this correctly?

Mmm, that's not how I would describe it. Both of the messages serve just to pass a MessagePort from the application to the backend. The shared worker in the middle does nothing except just pass along the port. The backend stores the port.

jamesmaa · 2024-12-17T07:59:03Z

ext/js/comm/shared-worker-bridge.js

+    /** @type {import('shared-worker').ApiHandler<'connectToBackend1'>} */
+    _onConnectToBackend1(_params, _interlocutorPort, ports) {
+        if (this._backendPort !== null) {
+            this._backendPort.postMessage(void 0, [ports[0]]); // connectToBackend2


Instead of void 0, could we use an event type or some other data to indicate it's a connectToBackend2 message? Right now connectToBackend2's existence is only implied via comments

I originally was doing that, but basically either you choose the fully typesafe API style of doing things like I did everywhere else, which requires a ton of extra code (like maybe 50~100 lines), but it seemed like overkill when there is only a single type of message which gets sent over this channel which has no (normal) parameters, so I decided to simplify it, since I can't imagine this ever getting more message types in the future (since the whole purpose of it is to establish a channel that you can use to send properly typed messages of any kind that you want). If you think it's too confusing I guess we can introduce the full API boilerplate, but we have a lot of different APIs due to this PR and I think introducing yet another one might be bad unless we figure out some better way to name them to make it clearer what's going on.

djahandarie · 2024-12-17T11:51:21Z

@jamesmaa

Does "media loading" refer to when dictionaries have images? It seems like the performance enhancements improve general performance, or is that an edge case?

Yes, that's what it refers to. The performance enhancements improve general performance because almost all entries from recently created Yomitan dictionaries involve a high number of images, due to many dictionaries using tons of tiny SVGs for all sorts of different things (pitch accent markers, numbers, symbols, section tags, etc).

Can you explain at a high level what the CSS changes are and how they're related to the performance work? I'm not understanding the connection here

One CSS change (content-visibility: auto) directly improves the performance as I mentioned in the PR body, but the rest are just simplifications to match the new DOM which simply has a single canvas element now, instead of multiple layers deep of various DOM elements that we had previously.

Do you have a recommended order of files to read in order to better understand this PR?

Mmm, certainly ext/js/* changes are way more important than anything else. Within that folder, I guess something like this:

To understand the communication changes, application.js, comm/shared-worker-bridge.js, background/backend.js, background/offscreen.js
To understand the actual changes in media stuff, display/structured-content-generator.js, display/media-drawing-worker.js, dictionary/dictionary-database.js

and then the rest as-needed in whatever order you want.

…ance-canvas # Conflicts: # ext/js/display/display.js

Kuuuube · 2024-12-17T19:52:48Z

Added some conflicts to this PR in #1677, you can merge this into the PR: #1678.

# This targets the branch of #1487 Fixes merge conflicts from #1677 History here looks a little funky but when it's actually merged into the PR only 65ec09d and e80b3c3 should show since master already has the others.

jamesmaa

Cinderella moment ending my review at my local midnight. Mostly did a high level pass through.

jamesmaa · 2024-12-18T07:05:15Z

ext/js/application.js

+        // which we can use to postMessage with the backend.
+        // This can only be done in the extension context (aka iframe within popup),
+        // not in the content script context.
+        const backendPort = (!('serviceWorker' in navigator)) && window.location.protocol === new URL(import.meta.url).protocol ?


Suggested change

const backendPort = (!('serviceWorker' in navigator)) && window.location.protocol === new URL(import.meta.url).protocol ?

const isFirefox = (!('serviceWorker' in navigator)) && window.location.protocol === new URL(import.meta.url).protocol;

const backendPort = isFirefox ?

Maybe pull out the check into a labeled variable for readability? I'm not super following what the logic check is doing except by relying on the comment above. Is that what this check does?

jamesmaa · 2024-12-18T07:10:25Z

ext/js/application.js

@@ -190,10 +190,35 @@ export class Application extends EventDispatcher {
     * @param {(application: Application) => (Promise<void>)} mainFunction
     */
    static async main(waitForDom, mainFunction) {
+        /** @type {MessagePort | null} */
+        // If this is Firefox, we don't have a service worker and can't postMessage,


"we" here refers to the application a.k.a the iframe within the popup?

can't postMessage

How is it that we postMessage to the sharedWorkerBridge but not directly to the backend?

jamesmaa · 2024-12-18T07:12:47Z

ext/js/application.js

In terminology, is my understanding correct?

"application context" refers to the iframe that popups during scan and this file contains the relevant logic
"content script context" is the code that runs on each web page irrespective of whether a scan occurs.

jamesmaa · 2024-12-18T07:13:43Z

ext/js/application.js

+                const sharedWorkerBridge = new SharedWorker(new URL('comm/shared-worker-bridge.js', import.meta.url), {type: 'module'});
+                const backendChannel = new MessageChannel();
+                sharedWorkerBridge.port.postMessage({action: 'connectToBackend1'}, [backendChannel.port1]);
+                sharedWorkerBridge.port.close();


Is the understanding here that the application and backend have now established a connection and this sharedWorkerBridge is no longer needed?

jamesmaa · 2024-12-18T07:15:28Z

ext/js/application.js

-        const api = new API(webExtension);
+
+        const mediaDrawingWorkerToBackendChannel = new MessageChannel();
+        const mediaDrawingWorker = window.location.protocol === new URL(import.meta.url).protocol ? new Worker(new URL('display/media-drawing-worker.js', import.meta.url), {type: 'module'}) : null;


const mediaDrawingWorker = window.location.protocol === new URL(import.meta.url).protocol ? new Worker(new URL('display/media-drawing-worker.js', import.meta.url), {type: 'module'}) : null;

Similarly pull out window.location.protocol === new URL(import.meta.url).protocol as a variable to make it clear what this check is doing

jamesmaa · 2024-12-18T07:42:00Z

ext/js/dictionary/dictionary-database-worker-handler.js

+            case 'connectToDatabaseWorker':
+                void this._dictionaryDatabase?.connectToDatabaseWorker(event.ports[0]);
+                break;
+        }


Add a default case for logging?

jamesmaa · 2024-12-18T07:43:39Z

ext/js/dictionary/dictionary-database-worker-main.js

@@ -0,0 +1,32 @@
+/*
+ * Copyright (C) 2023-2024  Yomitan Authors


Suggested change

* Copyright (C) 2023-2024 Yomitan Authors

* Copyright (C) 2024 Yomitan Authors

jamesmaa · 2024-12-18T07:50:34Z

ext/js/dictionary/dictionary-database.js

        );
+
+        // when we are not a worker ourselves, create a worker which is basically just a wrapper around this class, which we can use to offload some functions to


I had to educate myself via ChatGPT. lmk if this explanation of the difference between offscreen, background, and sw context is wrong

https://chatgpt.com/share/67627dc8-294c-8002-a0a6-bb1344b88a74

jamesmaa · 2024-12-18T07:57:56Z

ext/js/display/media-drawing-worker.js

+export class MediaDrawingWorker {
+    constructor() {
+        /** @type {number} */
+        this._generation = 0;


What does "generation" mean in the context of Yomitan?

jamesmaa · 2024-12-18T08:00:20Z

ext/js/display/structured-content-generator.js

+            if (sizeUnits === 'em' && (hasPreferredWidth || hasPreferredHeight)) {
+                image.style.width = `${usedWidth}em`;
+                image.style.height = `${usedWidth * invAspectRatio}em`;
+                image.width = usedWidth * 14 * 2 * window.devicePixelRatio;


14 * 2 magic numbers?

djahandarie force-pushed the performance-canvas branch from 343925d to bfae211 Compare October 13, 2024 12:11

djahandarie force-pushed the performance-canvas branch from bfae211 to 229163d Compare October 14, 2024 00:57

Kuuuube mentioned this pull request Nov 12, 2024

Switching profiles causes insane laggs and is basically unusable #1480

Open

djahandarie added 12 commits December 8, 2024 17:00

batch media lookups, eliminate base64ing

a48c49b

properly abort if lookup is interrupted

22bcad1

Begin to add performance marks

9e218ef

batch media lookups, eliminate base64ing

8f15bbc

[WIP] attempt to use OffscreenCanvas

34c5c15

Add worker to offload some stuff off the main thread (except SVG rast…

c493612

…erization, because that needs to be on the main thread...)

swap to rendering SVGs on worker thread using resvg-js/wasm

f0c469a

further tweaking, add more debug logging

a57d84d

remove some unnecessary asyncification done earlier in the PR, and ad…

33a087e

…d more logging

remove some dead code and add more logging

b740d53

make SW instruct offscreen to create port if non-existent

621d6f7

make SW instruct offscreen to create port when SW restarts

6d0d6e6

djahandarie marked this pull request as ready for review December 11, 2024 15:03

djahandarie requested a review from a team as a code owner December 11, 2024 15:03

remove unused alt constant

0956c05

djahandarie changed the title ~~[Draft] Improve popup performance~~ Improve popup performance Dec 11, 2024

djahandarie added 2 commits December 12, 2024 00:11

fix no-undef errors

a8fb1b8

fix ts errors

3c0cca9

djahandarie force-pushed the performance-canvas branch 2 times, most recently from f4a139c to 80e21c5 Compare December 13, 2024 10:59

clean up css, make tests pass

9b58a5e

djahandarie force-pushed the performance-canvas branch from 80e21c5 to 9b58a5e Compare December 13, 2024 12:24

fix benches

79527f6

djahandarie force-pushed the performance-canvas branch from 3753f95 to 704b6e9 Compare December 13, 2024 13:09

fix remaining test by doing some upgrades

b3de480

djahandarie force-pushed the performance-canvas branch from 704b6e9 to b3de480 Compare December 13, 2024 13:16

Kuuuube added kind/enhancement The issue or PR is a new feature or request area/performance The issue or PR is related to performance labels Dec 13, 2024

Merge remote-tracking branch 'origin/master' into performance-canvas

4ab4ad4

jamesmaa reviewed Dec 17, 2024

View reviewed changes

cleanup, improve clarity in some places

2c5f42d

djahandarie requested a review from jamesmaa December 17, 2024 12:15

Kuuuube added 2 commits December 17, 2024 14:43

Merge remote-tracking branch 'upstream/master' into safe-perf-perform…

65ec09d

…ance-canvas # Conflicts: # ext/js/display/display.js

Replace performance with safePerformance

e80b3c3

Kuuuube mentioned this pull request Dec 17, 2024

safePerformance for performance-canvas (#1487) #1678

Merged

safePerformance for performance-canvas (#1487) (#1678)

3622197

# This targets the branch of #1487 Fixes merge conflicts from #1677 History here looks a little funky but when it's actually merged into the PR only 65ec09d and e80b3c3 should show since master already has the others.

jamesmaa reviewed Dec 18, 2024

View reviewed changes

		);

		// when we are not a worker ourselves, create a worker which is basically just a wrapper around this class, which we can use to offload some functions to

	const backendPort = (!('serviceWorker' in navigator)) && window.location.protocol === new URL(import.meta.url).protocol ?
	const isFirefox = (!('serviceWorker' in navigator)) && window.location.protocol === new URL(import.meta.url).protocol;
	const backendPort = isFirefox ?

	* Copyright (C) 2023-2024 Yomitan Authors
	* Copyright (C) 2024 Yomitan Authors

Improve popup performance #1487

Are you sure you want to change the base?

Improve popup performance #1487

Conversation

djahandarie commented Oct 13, 2024 • edited Loading

Changes to media loading

Changes to communications

Potential improvements

MarvNC commented Oct 13, 2024

MarvNC commented Oct 25, 2024

koiyakiya commented Oct 25, 2024 • edited Loading

4890A commented Oct 26, 2024 • edited Loading

djahandarie commented Oct 26, 2024 • edited Loading

MarvNC commented Oct 27, 2024

djahandarie commented Oct 27, 2024

MarvNC commented Oct 27, 2024

MarvNC commented Oct 27, 2024

4890A commented Oct 27, 2024

djahandarie commented Oct 27, 2024

MarvNC commented Oct 27, 2024

djahandarie commented Oct 27, 2024

4890A commented Oct 27, 2024

MarvNC commented Nov 24, 2024

codspeed-hq bot commented Dec 17, 2024 • edited Loading

Merging #1487 will improve performances by 42.01%

Summary

Benchmarks breakdown

jamesmaa commented Dec 17, 2024 • edited Loading

jamesmaa left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

djahandarie commented Dec 17, 2024 • edited Loading

Kuuuube commented Dec 17, 2024 • edited Loading

jamesmaa left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

djahandarie commented Oct 13, 2024 •

edited

Loading

koiyakiya commented Oct 25, 2024 •

edited

Loading

4890A commented Oct 26, 2024 •

edited

Loading

djahandarie commented Oct 26, 2024 •

edited

Loading

codspeed-hq bot commented Dec 17, 2024 •

edited

Loading

jamesmaa commented Dec 17, 2024 •

edited

Loading

djahandarie commented Dec 17, 2024 •

edited

Loading

Kuuuube commented Dec 17, 2024 •

edited

Loading