Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #28 from ktonal/develop_v2
v0.4.1
Showing 106 changed files with 9,253 additions and 4,339 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
.DS_* | ||
.DS_Store | ||
*__pycache__* | ||
*pyc | ||
*.ipynb_checkpoints* | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,40 @@ | ||
# mimikit | ||
|
||
The MusIc ModelIng toolKIT (MIMIKIT) is a python package that does Machine Learning with music data. | ||
The MusIc ModelIng toolKIT (`mimikit`) is a python package that does Machine Learning with audio data. | ||
|
||
The goal of `mimikit` is to enable you to use referenced and experimental algorithms on data you provide. | ||
Currently, it focuses on training auto-regressive neural networks to generate audio. | ||
|
||
`mimikit` is still in early development, details and documentation are on their way. | ||
but it does also contain an app to perform basic & experimental clustering of audio data in a notebook. | ||
|
||
## License | ||
## Installation | ||
|
||
You can install with pip: | ||
```shell script | ||
pip install mimikit[torch] | ||
``` | ||
or with | ||
```shell script | ||
pip install --upgrade mimikit[torch] | ||
``` | ||
if you are looking for the latest version | ||
|
||
For an editable install, you'll need | ||
```shell script | ||
pip install -e . --config-settings editable_mode=compat | ||
``` | ||
|
||
## Usage | ||
|
||
Head straight to the [notebooks](https://github.com/ktonal/mimikit-notebooks) for example usage of `mimikit`, or open them directly in Colab: | ||
|
||
mimikit is distributed under the terms of the [GNU General Public License v3.0](https://choosealicense.com/licenses/gpl-3.0/) | ||
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ktonal/mimikit-notebooks/blob/main) | ||
|
||
## Output Samples | ||
|
||
You can explore the outputs of different trainings done with `mimikit` at this demo website: | ||
|
||
https://ktonal.github.io/mimikit-demo-outputs | ||
|
||
## License | ||
|
||
`mimikit` is distributed under the terms of the [GNU General Public License v3.0](https://choosealicense.com/licenses/gpl-3.0/) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
|
||
CANONICAL SHAPE: | ||
|
||
(BATCH, [CHANNEL], TIME, [DIM], [COMPONENT]) | ||
|
||
|
||
abs(Spectro), MelSpec, MFCC, Qt, Repr: (Batch, Time, Dim) | ||
|
||
Complex_S: (Batch, [Channel], Time, Dim, 2) | ||
|
||
y: (Batch, [Channel], Time) | ||
|
||
enveloppe: (Batch, [Channel], Time) | ||
|
||
text, lyrics, file_label, segment_label: (Batch, [Channel], Time, [Embedding, Class_Size]) | ||
|
||
pitch, speaker_id: (Batch, [Channel], Time, [Embedding, Class_Size]) | ||
|
||
qx, k_mer_hash, frame_index, cluster_label: (Batch, [Channel], Time, [Embedding, Class_Size]) | ||
|
||
y_bits: (Batch, [Channel], Time, Bit_Depth) | ||
|
||
|
||
|
||
Modules can change SHAPES through: | ||
|
||
- Project: (Batch, [Channel], Time) ----> (Batch, [Channel], Time, Dim) | ||
- Map/Transform: ANY Structure ----> SAME Structure | ||
- Predict: Any ----> TRAINING: (..., Dim) BUT INFER: (..., 1) | ||
- Fork: (...., Component) ----> (...) x Component | ||
- Join: (...) x Component ----> (...., Component) | ||
|
||
Modules can change SIGNATURES through: | ||
|
||
- Iso: N Inputs ----> N Outputs | ||
- Split: 1 Input ----> N Outputs | ||
- Reduce: N Inputs ----> 1 Output | ||
- Transform: N Inputs ----> M Outputs | ||
|
||
|
||
So, now, we can try to solve: | ||
|
||
REPR Shape ---> F(...) = ??? ---> Model Shape | ||
|
||
|
||
------ | ||
|
||
i.e. | ||
repr_shape=(B=-1, T=-1) | ||
model_shape=(B=-1, T=-1, D=128) | ||
---> ??? = Project | ||
|
||
repr_shape=(B=-1, T=-1, D=1025) X N | ||
model_shape=(B=-1, T=-1, D=128) | ||
---> ??? = [Reduce, Map] OR [Join, Reduce], .... | ||
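The two examples above can be read as a small lookup over shapes. The following is a minimal sketch of that idea, not mimikit's actual resolution logic: the function names (`project`, `fork`, `join`, `resolve`) and the rule "compare ranks, then compare sizes" are assumptions made for illustration. Shapes are plain tuples, with `-1` marking a free size as in the notes.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]  # -1 marks a free (unknown) size, as in the notes above


def project(shape: Shape, dim: int) -> Shape:
    """Project: (Batch, Time) -> (Batch, Time, Dim)."""
    return shape + (dim,)


def fork(shape: Shape) -> List[Shape]:
    """Fork: (..., Component) -> one (...) shape per component."""
    *rest, n_components = shape
    return [tuple(rest)] * n_components


def join(shapes: List[Shape]) -> Shape:
    """Join: N identical (...) shapes -> (..., N)."""
    if len(set(shapes)) != 1:
        raise ValueError("join expects identical shapes")
    return shapes[0] + (len(shapes),)


def resolve(repr_shape: Shape, model_shape: Shape, n_inputs: int = 1) -> List[str]:
    """Suggest ONE possible chain of operations (hypothetical rule set)."""
    steps: List[str] = []
    if n_inputs > 1:
        steps.append("Reduce")  # N inputs -> 1 output
    rank_gap = len(model_shape) - len(repr_shape)
    if rank_gap == 1:
        steps.append("Project")  # add the trailing Dim axis
    elif rank_gap == 0:
        if repr_shape != model_shape:
            steps.append("Map")  # same structure, different sizes
    else:
        raise ValueError(f"no rule for {repr_shape} -> {model_shape}")
    return steps


first = resolve((-1, -1), (-1, -1, 128))                       # ['Project']
second = resolve((-1, -1, 1025), (-1, -1, 128), n_inputs=3)    # ['Reduce', 'Map']
```

The sketch returns a single chain, whereas the notes allow several valid answers (`[Reduce, Map]` OR `[Join, Reduce]`); a fuller version would enumerate the alternatives and let the user choose.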
**A. User says:** | ||
|
||
"I want to connect this_feat with this_network." | ||
**B. Mimikit answers:** | ||
|
||
"Then you can use this_options" | ||
---> IOConfigService.resolve(this_feat, this_network_class): -> {io_module_config, ...} | ||
|
||
**C. User chooses, configures and then clicks `Run`. mimikit goes:** | ||
|
||
"Let's connect all those things" | ||
--> ModelInstantiator(this_feat, this_network, this_io_config) -> Model | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,130 @@ | ||
## Motivation | ||
|
||
We want to | ||
|
||
- easily compose ML models from *components* (inputs/outputs number and type, modules, network architecture, ...) | ||
- easily build UIs to configure/instantiate them (and ideally, be able to save their state!) | ||
|
||
### Top level structure | ||
|
||
``` | ||
mimikit | ||
- config | ||
- view | ||
- checkpoint | ||
- io | ||
- audio | ||
- mu_law | ||
- spectrogram | ||
- enveloppe | ||
... | ||
- labels | ||
- cluster_label | ||
- speaker_label | ||
... | ||
- information_retrieval | ||
- segment_audio | ||
- cluster | ||
... | ||
- modules | ||
- loss_fn | ||
... | ||
- networks | ||
- io_wrappers | ||
- sample_rnn | ||
- wavenet | ||
... | ||
- models | ||
- srnn | ||
- freqnet | ||
... | ||
- loops | ||
- train_loop | ||
- generate_loop | ||
- trainings | ||
- train_arm | ||
- train_gan | ||
- train_diffusion | ||
- scripts | ||
- generate_arm | ||
- train_freqnet | ||
- ensemble | ||
- eval_arm | ||
.... | ||
- notebooks | ||
- generate_arm | ||
- train_freqnet | ||
- ensemble | ||
- explore_cluster | ||
..... | ||
``` | ||
|
||
### ML Component Design Pattern | ||
|
||
Use `dataclasses` and inheritance to define and connect the layers of an 'ML component', e.g. | ||
|
||
``` | ||
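import dataclasses as dtc | ||
from torch import nn | ||
# `Config` and `ConfigView` are assumed to be defined elsewhere in mimikit | ||
@dtc.dataclass | ||
class NetConfig(Config): | ||
    ...  # constructor parameters, declared once as dataclass fields | ||
@dtc.dataclass | ||
class NetImpl(NetConfig, nn.Module): | ||
    def __post_init__(self): | ||
        nn.Module.__init__(self)  # the dataclass-generated __init__ does not call it | ||
        # init modules... | ||
    def forward(self, inputs, **kwargs): | ||
        ... | ||
@dtc.dataclass | ||
class NetView(NetConfig, ConfigView): | ||
    def __post_init__(self): | ||
        params = {}  # ... map params to widget ... | ||
        ConfigView.__init__(self, **params) | ||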
``` | ||
|
||
Constructors are | ||
- type-safe | ||
- consistent across layers | ||
- (de)serializable | ||
- **defined once** | ||
|
||
Then we can nest Configs, Impls & Views by doing: | ||
|
||
``` | ||
@dtc.dataclass | ||
class NestedConfig(Config): | ||
    io: IOConfig | ||
    net: NetConfig | ||
@dtc.dataclass | ||
class Model(NestedConfig, nn.Module): | ||
    def __post_init__(self): | ||
        ... | ||
@dtc.dataclass | ||
class ModelView(NestedConfig, ConfigView): | ||
    ... | ||
``` | ||
|
||
--> We win | ||
- generic saving/loading Checkpoints | ||
- highly expressive composition for io, models, features, views, etc... | ||
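The "generic saving/loading" claim can be sketched with the standard library alone. The example below is an assumption-laden illustration: it uses `json` as a stand-in for whatever serializer the project uses, the field names are invented, and `from_dict` is a hypothetical helper that relies on field annotations being real classes (it would not work with string annotations).

```python
import dataclasses as dtc
import json


@dtc.dataclass
class IOConfig:
    sample_rate: int = 16000
    q_levels: int = 256


@dtc.dataclass
class NetConfig:
    n_layers: int = 2
    hidden_dim: int = 128


@dtc.dataclass
class NestedConfig:
    io: IOConfig
    net: NetConfig


def from_dict(cls, raw: dict):
    """Rebuild a (possibly nested) dataclass from plain dicts."""
    kwargs = {}
    for field in dtc.fields(cls):
        value = raw[field.name]
        if dtc.is_dataclass(field.type):
            # Nested config: recurse using the annotated class.
            value = from_dict(field.type, value)
        kwargs[field.name] = value
    return cls(**kwargs)


config = NestedConfig(io=IOConfig(sample_rate=22050), net=NetConfig(n_layers=4))
text = json.dumps(dtc.asdict(config))              # save
restored = from_dict(NestedConfig, json.loads(text))  # load
```

Because the same two functions work for any nesting of configs, a checkpoint only needs to store the class and this serialized text next to the weights.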
|
||
|
||
### Implementation | ||
|
||
Different libraries and base classes offer different trade-offs between ease of use and ease of integration with other libraries. | ||
|
||
Ideally, we could, | ||
- define a constructor -> `dataclass, attrs, namedtuple` | ||
- have static type checker recognize it | ||
- attach it to a nn.Module -> inheritance?, `classmethod`?, decorator? | ||
- use it in a View -> switch mutability | ||
- (de)serialize it as config -> `OmegaConf` | ||
- be able to export the nn.Module as `TorchScript` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,21 +1,28 @@ | ||
__version__ = '0.3.4' | ||
__version__ = '0.4.1' | ||
|
||
from . import extract | ||
from . import config | ||
from . import features | ||
from . import loops | ||
from . import checkpoint | ||
from . import modules | ||
from . import extract | ||
from . import io_spec | ||
from . import models | ||
from . import networks | ||
from . import demos | ||
from . import ui | ||
from . import views | ||
|
||
from .extract import * | ||
from .checkpoint import * | ||
from .config import * | ||
from .features import * | ||
from .loops import * | ||
from .modules import * | ||
from .extract import * | ||
from .models import * | ||
from .networks import * | ||
|
||
from .train import * | ||
from .ui import * | ||
from .utils import * | ||
|
||
from .views import * | ||
from .io_spec import * | ||
from .demos import * |