Skip to content

Latest commit

 

History

History
86 lines (55 loc) · 6.14 KB

File metadata and controls

86 lines (55 loc) · 6.14 KB

Spectral Modular Deep Learning

Check this out:

In the LLM world there is a concept of model merging/weights merging where if youve trained two adapter layer LoRas, you can combine the two and produce a model that performs better than either LoRa does independently

This core concept of composable models which have been trained individually to reduce loss for specific tasks: what if those core representations were learned features captured by eigenvector bases, and then post composition, a routine of eigenvalue decomposition occured

For instance, in CV, or in conjunction with a KG, unique concepts like cat and dog could be learned to be represented by a series fo features represented as these spectral bases, and then composed modularly depending on the ultimate desired task

Key Points

  • Composable modules trained to specialize in distinct concepts (e.g. "cat" basis, "dog" basis)
  • Each module learns features mapped to eigenvector bases (learned subspaces)
  • Post-merge: apply eigenvalue/eigenvector optimization (spectral alignment or rebalancing) to fine-tune or recombine the spectral characteristics of the feature spaces

Normal model merging happens in the raw weight space - what if we merged at the subspace level, where eigenvectors (bases) of featur extractors/modules are combined and reweighted via eigenvalues after composition. These modules would be treated as orthogonal or partially overlapping subspaces of the full feature space

The act of merging would be projecting into a joint spectral basis, followed by eigenvalue optimization to adjust how much each basis "contributes"

Key Outcomes

  • Dynamic model composition at runtime, or during continual learning
  • Semantic aware feature reuse (cross modality)
  • Interpretibility - clear understandin of which subspace contributes what

Random cool words

  • spectral transformers
  • spectral CNNs, where entire layers are modularized into composable eigen-bases

Research Statement

We propose a modular deep learning framework where individual models or sub-networks are trained to capture distinct concepts as learned spectral bases. By decomposing these modules into their eigenvector-eigenvalue representations, we enable composable systems that merge feature subspaces and optimize spectral characteristics post-integration. Proposed Workflow:

  • Train independent modules (e.g., LoRA adapters or task-specific networks) to specialize on distinct concepts or tasks.
  • Represent each module’s learned features via spectral decomposition (e.g., eigenvectors as bases, eigenvalues as scaling factors).
  • Merge models by combining their spectral subspaces and constructing a joint representation.
  • Apply eigenvalue optimization or reweighting post-composition to fine-tune the balance between merged subspaces.
  • Deploy the composed model to dynamically adapt to downstream tasks while retaining interpretable modular structure.

This approach could enable more transparent and efficient model merging strategies for vision, NLP, and graph-based learning tasks. By leveraging spectral properties, we aim to enhance composability, interpretability, and task-specific adaptability in modular deep learning.

Keywords

  • Spectral modularity
  • Eigenbasis composition
  • Subspace fusion
  • Composable subspaces
  • Spectral optimization
  • Eigenvalue reweighting
  • Task-specialized spectral modules
  • Feature subspace blending
  • Spectral-aware model merging
  • Spectral interpretability in deep learning
  • Modular spectral learning
  • Eigenvector-informed model composition
  • Basis-level representation learning
  • Spectral post-processing after model merge

Extensions

So like considering collaborative autonomous decision making. What if we baked all these different composable units into self sustaining deals, and then overlayed a layer of FFDM to reach consensus. Thinking a lot about like why cloud compute is amazing and why toyotas are so OP and its because we have tons of really strong little units, but how do we orchustrate and coordinate these little units? layers of decision making with bits and pieces of the overall algorithm distributed across each unit - no single point of failure, no monolithic highly complex over-engineered (over-learned) machine

Maybe too far out, but what if the idea of neural networks mimicing neurons is wrong, well currently wrong. what if each neuron did a lot more than we think it does? what if a single NN was a single neuron, and then we had clusters of these similar things composing regions of the brain or whatever. instead of a single network for a region of the brain, we had networks of "neuron" NNs to create that region

distributed intelligence - like bugs, but we dont have a single particle representing a unit in the swarm - we have a fully compartmentalized intelligent system, forreal forreal

More Succinct description

yeah so geometric deep learning + manifold optimization leading into hyperbolic representation learning is a thread I really want to pull (in addition to the spectral modular deep learning concept we discussed in another thread (basically - extract feature level subspace basis from something like imagenet or any other model learning in a different modality like text, and then optimize eigenvalues as learned parameters to combine the two representations resulting in a unified representation space spanning both modalities, allowing you to pick and choose relevant features for lightweight, task specific models gleaned from "mothership" models trained at huge scale))

Sort of related idea

So like when learn representations based on free text, theres not any signal indicating how "close" two tokens are

So like if we just took wiki or whatever, what if we included a signal that was like "number of hops" between pages to get from one topic to the next?

So like sports -> soccer would be 1 hops

But like germany -> beer -> hops -> yeast would be 4 hops

I also wonder if there is a way to encode some causal structure... maybe like graph adjacency with a weight on each edge intersection

I will die on this hill: the ability for the system to learn its own representations, conditioned by human input (prior to training, or as part of ongoing feedback cycle) is critical for evolution