@wisplite wisplite commented Sep 9, 2025

The current engine, especially on lower-end machines, is extremely slow. This is fine for networks on the smaller end, but it can be a big problem for large networks. One of my machines can only handle ~100-150 nodes before it becomes unusable, while the other can handle ~1000 nodes before it starts struggling.

This rewrite will introduce many optimizations that make large-scale networks more efficient. This will include:

  • Grid-based neighbor search (what I'm currently working on)
    • Currently every node checks every other node every 10ms to determine connection status. This is extremely inefficient and causes many performance issues with big networks. The grid-based system should reduce this from O(n^2) time to ~O(n*k) time, where k is the average number of nodes in the cells around a node, which is a huge improvement.
  • Event-driven connection updates, rather than checking at a fixed interval
  • Any potentially blocking tasks will use requestAnimationFrame to not block the rendering for seconds at a time
  • Update arrays in the nodeTable to use Set, for O(1) membership checks.
  • Get rid of the animate on edges engine and write a custom renderer that draws outside of vis-network.
  • Precompute LUTs for color changes so the computer doesn't have to calculate colors every single time they update.
  • Batch color updates where possible to reduce duplicate calls
  • Get rid of the terrible markNeighborsAsFailed method and replace it with one that relies on rAF.
  • Debounce some rendering updates to reduce performance impact of dragging and zooming in the sim
  • Better handling of packet parsing so I don't run .split repeatedly for every packet
  • Cache current color of node to prevent redundant vis updates.
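The LUT and color-cache items above could look roughly like the following. This is a minimal sketch, not the simulator's code: the green-to-red gradient, the 256-entry table size, and the names `buildColorLut`, `colorFor`, and `setNodeColor` are all assumptions made for illustration.

```javascript
// Hypothetical sketch: precompute a gradient once, then look colors up by index.
const LUT_SIZE = 256; // assumed resolution of the table

function buildColorLut(size = LUT_SIZE) {
  const lut = new Array(size);
  for (let i = 0; i < size; i++) {
    const t = i / (size - 1); // runs from 0 to 1 across the table
    const r = Math.round(255 * t);
    const g = Math.round(255 * (1 - t));
    lut[i] = `rgb(${r},${g},0)`;
  }
  return lut;
}

const COLOR_LUT = buildColorLut();

// Map a normalized value in [0, 1] to a precomputed color string.
function colorFor(value) {
  const clamped = Math.min(1, Math.max(0, value));
  return COLOR_LUT[Math.round(clamped * (COLOR_LUT.length - 1))];
}

// Last color pushed per node id, so unchanged colors can be skipped.
const lastColor = new Map();

// Returns an update object only when the color actually changed, else null.
function setNodeColor(id, value) {
  const color = colorFor(value);
  if (lastColor.get(id) === color) return null;
  lastColor.set(id, color);
  return { id, color };
}
```

Callers would collect the non-null results and hand them to the renderer, so a node whose color has not changed never triggers an update.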

wisplite commented Sep 10, 2025

Huge performance update! It took a lot of work, but now the sim handles connection updates much more cleanly. Here's what I did:

The problem

Previously, the sim ran on a setInterval and recomputed all node connections on every tick. It also checked every node against every other node, which is O(n^2) time, meaning the processing time grows with the square of the number of nodes (very bad). This caused large networks to lag on every tick, which made for an extremely poor experience when working with large networks.

2025-09-09.23-57-12.trimmed.mp4

Notice the lag spikes on every interval, especially when adding in more nodes? Yeah, there's no reason that should be happening. My hacky workaround was to make it update less often when the network became a certain size (freezing is worse than lag spikes), but that was never a real solution, only a quick bandaid fix to make the sim usable at all.

The solution

First of all, I got rid of the dumb update interval. That was never a great way of handling things: when the network is static, it shouldn't waste time computing anything. Now all updates are event-driven, meaning updates are only triggered when the network is actually changing (node added/removed, node dragged).
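As an illustration of the event-driven idea, here is a minimal sketch. The class name `ConnectionUpdater`, the `recompute` callback, and the dirty-set approach are assumptions for the example and not necessarily how the engine does it.

```javascript
// Hypothetical sketch: recompute connections only for nodes touched by an event.
class ConnectionUpdater {
  constructor(recompute) {
    this.recompute = recompute; // assumed callback: (nodeId) => void
    this.dirty = new Set();     // node ids waiting for a recompute
  }

  // Call from event handlers (node added, removed, or dragged).
  markDirty(nodeId) {
    this.dirty.add(nodeId);
  }

  // Process everything that changed since the last flush; does nothing when idle.
  flush() {
    const ids = [...this.dirty];
    this.dirty.clear();
    for (const id of ids) this.recompute(id);
    return ids.length;
  }
}
```

In the browser, handlers would call `markDirty` and then schedule a single `flush`, so a static network costs nothing and a node dragged many times between flushes is still recomputed only once.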

Second of all, it is no longer O(n^2) time; it is now ~O(nk) time. I gained this performance boost by creating a discrete lookup table for node neighbors. Essentially, every node is tracked within a 2D grid. Upon an update, rather than checking its current position against every other node's position to see whether it's within range, a node instead asks the grid for every node within the cells directly around it (currently the 3x3 block of cells centered on its own cell) and then loops over those, keeping only the ones that are actually within range. This is where the O(nk) estimate comes from: k, in this case, refers to the average number of nodes within the 3x3 area. As long as you aren't packing hundreds of nodes into a single connection radius, this should now be significantly faster.

In other words, it's essentially pre-computing what nodes are around it, so when it comes time to check more precisely for connections, it has to check against significantly fewer nodes.
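A grid like the one described can be sketched as follows. This is a hedged illustration under stated assumptions: the class name `SpatialGrid`, its method names, and the string cell keys are hypothetical, and the cell size is assumed to be at least the connection range so that the 3x3 block always covers every possible neighbor.

```javascript
// Hypothetical sketch of a uniform grid for neighbor lookups.
class SpatialGrid {
  constructor(cellSize) {
    this.cellSize = cellSize;   // assumed to be >= the connection range
    this.cells = new Map();     // "cx,cy" -> Set of node ids
    this.positions = new Map(); // node id -> { x, y }
  }

  key(x, y) {
    return `${Math.floor(x / this.cellSize)},${Math.floor(y / this.cellSize)}`;
  }

  // Insert a node, or move it if it is already tracked.
  upsert(id, x, y) {
    this.remove(id);
    const k = this.key(x, y);
    if (!this.cells.has(k)) this.cells.set(k, new Set());
    this.cells.get(k).add(id);
    this.positions.set(id, { x, y });
  }

  remove(id) {
    const pos = this.positions.get(id);
    if (!pos) return;
    const k = this.key(pos.x, pos.y);
    const cell = this.cells.get(k);
    cell.delete(id);
    if (cell.size === 0) this.cells.delete(k);
    this.positions.delete(id);
  }

  // Candidates come from the 3x3 block of cells around the node;
  // the exact distance check runs only on those candidates.
  neighborsInRange(id, range) {
    const { x, y } = this.positions.get(id);
    const cx = Math.floor(x / this.cellSize);
    const cy = Math.floor(y / this.cellSize);
    const result = [];
    for (let dx = -1; dx <= 1; dx++) {
      for (let dy = -1; dy <= 1; dy++) {
        const cell = this.cells.get(`${cx + dx},${cy + dy}`);
        if (!cell) continue;
        for (const other of cell) {
          if (other === id) continue;
          const p = this.positions.get(other);
          if (Math.hypot(p.x - x, p.y - y) <= range) result.push(other);
        }
      }
    }
    return result;
  }
}
```

Each lookup touches at most nine cells, so the cost per node depends on local density (the k above) rather than on the total node count.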

Here's a demo video, with fancy visualizations! (ignore the lagging, as you can see when I turn them off, it's caused by the visualizations and not the connection computing itself).

2025-09-10.00-00-46.trimmed.mp4

I haven't tested this with huge networks yet, as the generate network system doesn't work anymore (event-driven updates are incompatible with the physics-based randomization thing, I'm working on a better generator as we speak), but there is no reason this won't scale to massive networks.

The third and final optimization was not using AI (agents). My experience with trying to use AI in this project has been absolutely terrible. A big part of why the v1 engine was so laggy is that AI would write something in a terrible way that caused massive performance issues, then write an entire section of the engine around it, forcing me to rewrite that section if I wanted to fix it (hence this whole draft PR). Sure, it worked, but it was nowhere near optimal. Cursor Tab also kept completely misunderstanding what was happening almost every single time it came up, and would hijack my IntelliSense and write completely broken, meaningless code where I just wanted to finish a conditional statement.

As a result of the chaos being caused by AI, I have completely disabled Cursor Tab (and don't plan on re-enabling it unless they make it work properly), and I only use ask mode, so I can have a chat with it before it decides to go off and rewrite half of my logic in a terrible way. The effects of this change are already showing: the code is already significantly more readable and simpler. For example, the old connection compute function calculated the score for every node twice, and did a bunch of weird mid-loop mutations of connections. Why? AI.

I didn't even mention this as an optimization above, but the connection decision function now does ~half the work and gets the same result. Scores are only calculated once, there are no in-loop mutations, arrays were replaced with Sets where applicable, etc.

So, I guess that long rant was just my way of saying that I am done using AI agents in my code (at least for more complicated tasks like this). I'll still use it as a resource, but it will no longer directly modify my code, since that causes over-reliance and tons of bugs/performance issues that I'll have to patch out later anyway.

@wisplite

Okay, a lot of stuff has been updated.

  1. Network generation is no longer physics-based. Instead it tries to distribute the nodes semi-randomly following a set density. This, combined with some optimized code for adding the nodes in batches, means that instead of the system grinding to a halt at ~200 nodes, you can now create 5000+ node networks in <1s!
  2. Quick packet flooding has been optimized to an insane degree. Previously it would start freezing at ~500 nodes, and marking the unreachable nodes (outside of TTL) was extremely slow: it could take minutes of watching it animate. Now Quick Flood can flood a 5000+ node network in ~1s, including marking every single failed node!
  3. The animated Send Flood function has also been heavily optimized. It was previously incredibly slow because the animate on edges library I was using was designed for small networks, and took the shortcut of creating more vis-network nodes for each animated node. This meant that vis-network had to recompute the whole network every frame. I've rewritten the engine to draw directly on the canvas, meaning it completely bypasses vis-network. I've also updated it to use the same mark unreachable algorithm as Quick Flood.
  4. Creating routing tables is significantly faster. I no longer visualize as it runs, and it relies on animation frames to not block the main thread. I'm sure I could optimize it further, but it's feasible for multi-thousand node networks now, so I'm not going to worry about it too much for now.
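For readers unfamiliar with TTL-limited flooding, the general shape of a fast flood with unreachable-node marking can be sketched as a breadth-first search. This is an assumption-laden illustration, not the Quick Flood implementation: the function name `quickFlood`, the adjacency `Map`, and the reading of TTL as a maximum hop count are all hypothetical.

```javascript
// Hypothetical sketch: breadth-first flood limited by a hop count (TTL).
// `adjacency` is assumed to be a Map of node id -> Set of neighbor ids.
function quickFlood(adjacency, source, ttl) {
  const hops = new Map([[source, 0]]); // node id -> hops from the source
  let frontier = [source];

  for (let depth = 0; depth < ttl && frontier.length > 0; depth++) {
    const next = [];
    for (const id of frontier) {
      for (const neighbor of adjacency.get(id) ?? []) {
        if (hops.has(neighbor)) continue; // already reached at this depth or earlier
        hops.set(neighbor, depth + 1);
        next.push(neighbor);
      }
    }
    frontier = next;
  }

  // Everything the flood never reached is collected in a single pass.
  const unreachable = [];
  for (const id of adjacency.keys()) {
    if (!hops.has(id)) unreachable.push(id);
  }
  return { hops, unreachable };
}
```

Because each node and edge is visited at most once, the work is linear in the size of the network, and the unreachable list can be handed to the renderer as one batch.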

Along with those big 4 updates, there are a lot of smaller improvements. Virtually nothing blocks the main thread, so dragging and zooming should be smooth even if you're computing a massive update. The one exception is marking unreachable nodes: I decided that it was fine for it to freeze a bit while computing (usually ~1s max) because that drastically cuts the time it takes to complete (it could take multiple seconds if I didn't let it block the render thread).
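One common way to keep long jobs from blocking the main thread is to process them in slices and yield between slices. The sketch below assumes that approach; `processInChunks`, its options, and the default chunk size of 500 are hypothetical. The scheduler is injectable so the same function also runs outside a browser.

```javascript
// Hypothetical sketch: process a large array in slices, yielding between slices.
// Defaults to requestAnimationFrame in the browser and setTimeout elsewhere.
const defaultSchedule =
  typeof requestAnimationFrame === "function"
    ? (cb) => requestAnimationFrame(cb)
    : (cb) => setTimeout(cb, 0);

function processInChunks(
  items,
  handleItem,
  { chunkSize = 500, schedule = defaultSchedule, onDone = () => {} } = {}
) {
  let index = 0;
  function step() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) handleItem(items[index], index);
    if (index < items.length) schedule(step); // yield so the page can render
    else onDone();
  }
  step();
}
```

The trade-off is the one described above: yielding keeps dragging and zooming smooth but stretches total completion time, which is why a short blocking pass can be the better choice for some jobs.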

I've also batched as many vis-network updates as possible, so rather than looping through and updating nodes individually, things like the reset color button will update everything at once. In practice, this means that rather than slowing down for tens or hundreds of seconds, it is nearly instant.
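The batching pattern can be sketched like this. The function name `resetColors` and its parameters are hypothetical; the only thing assumed about `dataSet` is that it exposes an `update` method accepting an array of items, which vis-data's `DataSet` does.

```javascript
// Hypothetical sketch: collect all changes first, then push them in a single call.
// `dataSet` is assumed to expose update(arrayOfItems), as vis-data's DataSet does.
function resetColors(dataSet, nodeIds, defaultColor) {
  const updates = nodeIds.map((id) => ({ id, color: defaultColor }));
  if (updates.length > 0) dataSet.update(updates); // one call instead of one per node
  return updates.length;
}
```

The slow alternative calls `update` inside the loop, once per node, so the per-call overhead is paid thousands of times on a large network.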

With this, the v2 engine is nearly ready to merge! I'm just going to fix a couple tiny issues I noticed last second and I'll be done!
