
Commit a9cabd9: init upload (0 parents)

28 files changed: 2345 additions, 0 deletions

README.md

Lines changed: 55 additions & 0 deletions
# Deep Networks from the Principle of Rate Reduction

This repository is the official implementation of the paper [Deep Networks from the Principle of Rate Reduction](https://arxiv.org/abs/2010.14765) (2021) by [Kwan Ho Ryan Chan](https://ryanchankh.github.io)* (UC Berkeley), [Yaodong Yu](https://yaodongyu.github.io/)* (UC Berkeley), [Chong You](https://sites.google.com/view/cyou)* (UC Berkeley), [Haozhi Qi](https://haozhi.io/) (UC Berkeley), John Wright (Columbia), and Yi Ma (UC Berkeley).

## What is ReduNet?

ReduNet is a deep neural network constructed naturally by deriving the gradients of the Maximal Coding Rate Reduction (MCR<sup>2</sup>) [1] objective. Every layer of this network can be interpreted in terms of its mathematical operations, and the network as a whole is trained in a purely feed-forward manner. In addition, by imposing shift-invariance properties on our network, the convolutional operators can be derived using only the data and the MCR<sup>2</sup> objective, making our network design principled and interpretable.
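
For reference, the MCR<sup>2</sup> objective from [1] is, for a feature matrix Z ∈ R^{d×m} (m samples as columns), diagonal class-membership matrices Π<sub>j</sub>, and distortion ε:

```
\Delta R(Z, \Pi, \epsilon)
  = \underbrace{\frac{1}{2}\log\det\Big(I + \frac{d}{m\epsilon^2} Z Z^\top\Big)}_{R(Z,\epsilon)}
  - \underbrace{\sum_{j=1}^{k} \frac{\operatorname{tr}(\Pi_j)}{2m}\log\det\Big(I + \frac{d}{\operatorname{tr}(\Pi_j)\epsilon^2} Z \Pi_j Z^\top\Big)}_{R_c(Z,\epsilon \mid \Pi)}
```

Each ReduNet layer performs one gradient-ascent step on this objective, and the E and C<sup>j</sup> operators in the figure below come from the gradients of the expansion term R and the compression term R<sub>c</sub>, respectively.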
<p align="center">
  <img src="images/arch-redunet.jpg" width="350"><br>
  Figure: Weights and operations for one layer of ReduNet
</p>

[1] Yu, Yaodong, Kwan Ho Ryan Chan, Chong You, Chaobing Song, and Yi Ma. "[Learning diverse and discriminative representations via the principle of maximal coding rate reduction](https://proceedings.neurips.cc/paper/2020/file/6ad4174eba19ecb5fed17411a34ff5e6-Paper.pdf)." Advances in Neural Information Processing Systems 33 (2020).
## Requirements
This codebase is written for `python3`. To install the necessary Python packages, run `conda create --name redunet_official --file requirements.txt`.
## Core Usage and Design
The design of this repository aims to be easy to use and easy to integrate into your current experiment framework, as long as it uses PyTorch. The `ReduNet` object inherits from `nn.Sequential`, and layers (`ReduLayers`) such as `Vector`, `Fourier1D` and `Fourier2D` inherit from `nn.Module`. Loss functions are implemented in `loss.py`. Architecture and dataset options are located in `load.py`. Data objects and pre-set architectures live in the folders `datasets` and `architectures`; feel free to add more based on the experiments you want to run. We provide basic experiment setups in `train_<mode>.py` and `evaluate_<mode>.py`, where `<mode>` is the type of experiment. For utility functions, please check out `functional.py` and `utils.py`. Feel free to email us with any issues or suggestions.
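To make the intended composition concrete, here is a minimal sketch assuming only the interfaces visible in this repository (`ReduNet`, `Vector`, and the hyperparameters used in `architectures/mnist/flatten.py`); the real entry point for training is `train_forward.py`:

```
import torch
from redunet import ReduNet, Vector  # ReduNet subclasses nn.Sequential

# Five fully-connected ReduLayers, mirroring architectures/mnist/flatten.py.
net = ReduNet(
    *[Vector(eta=0.5, eps=0.1, lmbda=500, num_classes=10, dimensions=784)
      for _ in range(5)]
)

# Since ReduNet is an nn.Sequential, inference is a plain forward pass.
# (Layer parameters are set during forward construction in train_forward.py,
# so running a freshly built net is only a shape-level check.)
X = torch.randn(32, 784)  # a batch of flattened 28x28 images
Z = net(X)
```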
## Example: Forward Construction
To train a ReduNet using forward construction, please check out `train_forward.py`; for evaluation, please check out `evaluate_forward.py`. For example, to train a 50-layer ReduNet on MNIST using 1000 samples per class, run:
```
$ python3 train_forward.py --data mnistvector --arch layers50 --samples 1000
```
After training, you can evaluate the trained model with `evaluate_forward.py` by running:

```
$ python3 evaluate_forward.py --model_dir ./saved_models/forward/mnistvector+layers50/samples1000
```

This will evaluate using all available training and testing samples. For more training and testing options, please check out `train_forward.py` and `evaluate_forward.py`.
### Experiments in Paper
For the code used to generate the experimental results reported in our paper, please visit our other repository: [https://github.com/ryanchankh/redunet_paper](https://github.com/ryanchankh/redunet_paper)
## Reference
For technical details and full experimental results, please see the [paper](https://arxiv.org/abs/2010.14765). Please consider citing our work if you find it helpful:
```
@article{chan2020deep,
  title={Deep networks from the principle of rate reduction},
  author={Chan, Kwan Ho Ryan and Yu, Yaodong and You, Chong and Qi, Haozhi and Wright, John and Ma, Yi},
  journal={arXiv preprint arXiv:2010.14765},
  year={2020}
}
```

## License and Contributing

- This README is formatted based on [paperswithcode](https://github.com/paperswithcode/releasing-research-code).
- Feel free to post issues via GitHub.

## Contact
Please contact [email protected] and [email protected] if you have any questions about the code.

architectures/mnist/flatten.py

Lines changed: 14 additions & 0 deletions
```
from redunet import *


def flatten(layers, num_classes):
    # Stack `layers` fully-connected (Vector) ReduLayers into one ReduNet,
    # operating on flattened 28x28 MNIST images (784 dimensions).
    net = ReduNet(
        *[Vector(eta=0.5,
                 eps=0.1,
                 lmbda=500,
                 num_classes=num_classes,
                 dimensions=784,
                 ) for _ in range(layers)],
    )
    return net
```
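
Presumably the `layers50` architecture in the README's example maps onto this constructor through `load.py` (not shown in this commit); a hypothetical call would be:

```
net = flatten(layers=50, num_classes=10)
```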

architectures/mnist/lift2d.py

Lines changed: 15 additions & 0 deletions
```
from redunet import *


def lift2d(channels, layers, num_classes, seed=0):
    # Lift the single-channel input to `channels` feature maps, then stack
    # `layers` shift-invariant (Fourier2D) ReduLayers.
    net = ReduNet(
        Lift2D(1, channels, 9, seed=seed),  # 1 input channel -> `channels` maps; the 9 is presumably the filter size
        *[Fourier2D(eta=0.5,
                    eps=0.1,
                    lmbda=500,
                    num_classes=num_classes,
                    dimensions=(channels, 28, 28),
                    ) for _ in range(layers)],
    )
    return net
```
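
A hypothetical instantiation (these particular sizes are illustrative, not taken from this commit):

```
net = lift2d(channels=16, layers=5, num_classes=10)
```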

datasets/__init__.py

Whitespace-only changes.

datasets/mnist.py

Lines changed: 83 additions & 0 deletions
```
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

from .utils_data import filter_class


# Each loader returns (trainset, testset, num_classes). The 5-class and
# 2-class variants keep only digits {0,...,4} and {0, 1} via filter_class;
# the mnistvector_* variants additionally flatten each 28x28 image into a
# 784-dimensional vector.

def mnist2d_10class(data_dir):
    transform = transforms.Compose([
        transforms.ToTensor(),
    ])
    trainset = datasets.MNIST(data_dir, train=True, transform=transform, download=True)
    testset = datasets.MNIST(data_dir, train=False, transform=transform, download=True)
    num_classes = 10
    return trainset, testset, num_classes

def mnist2d_5class(data_dir):
    transform = transforms.Compose([
        transforms.ToTensor(),
    ])
    trainset = datasets.MNIST(data_dir, train=True, transform=transform, download=True)
    testset = datasets.MNIST(data_dir, train=False, transform=transform, download=True)
    trainset, num_classes = filter_class(trainset, [0, 1, 2, 3, 4])
    testset, _ = filter_class(testset, [0, 1, 2, 3, 4])
    return trainset, testset, num_classes

def mnist2d_2class(data_dir):
    transform = transforms.Compose([
        transforms.ToTensor(),
    ])
    trainset = datasets.MNIST(data_dir, train=True, transform=transform, download=True)
    testset = datasets.MNIST(data_dir, train=False, transform=transform, download=True)
    trainset, num_classes = filter_class(trainset, [0, 1])
    testset, _ = filter_class(testset, [0, 1])
    return trainset, testset, num_classes

def mnistvector_10class(data_dir):
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.flatten())
    ])
    trainset = datasets.MNIST(data_dir, train=True, transform=transform, download=True)
    testset = datasets.MNIST(data_dir, train=False, transform=transform, download=True)
    num_classes = 10
    return trainset, testset, num_classes

def mnistvector_5class(data_dir):
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.flatten())
    ])
    trainset = datasets.MNIST(data_dir, train=True, transform=transform, download=True)
    testset = datasets.MNIST(data_dir, train=False, transform=transform, download=True)
    trainset, num_classes = filter_class(trainset, [0, 1, 2, 3, 4])
    testset, _ = filter_class(testset, [0, 1, 2, 3, 4])
    return trainset, testset, num_classes

def mnistvector_2class(data_dir):
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.flatten())
    ])
    trainset = datasets.MNIST(data_dir, train=True, transform=transform, download=True)
    testset = datasets.MNIST(data_dir, train=False, transform=transform, download=True)
    trainset, num_classes = filter_class(trainset, [0, 1])
    testset, _ = filter_class(testset, [0, 1])
    return trainset, testset, num_classes


if __name__ == '__main__':
    # Smoke test: load the filtered 2-class split in one full-size batch.
    trainset, testset, num_classes = mnist2d_2class('./data/')
    trainloader = DataLoader(trainset, batch_size=trainset.data.shape[0])
    print(trainset)
    print(testset)
    print(num_classes)

    batch_imgs, batch_lbls = next(iter(trainloader))
    print(batch_imgs.shape, batch_lbls.shape)
    print(batch_lbls.unique(return_counts=True))
```
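
The loaders can also be used directly from your own scripts; a quick sketch (the `./data/` path is illustrative, and `download=True` fetches MNIST on first use):

```
trainset, testset, num_classes = mnistvector_2class('./data/')
print(len(trainset), len(testset), num_classes)
# Expected: 12665 2115 2, since MNIST contains 5923 + 6742 training and
# 980 + 1135 test images for digits 0 and 1.
```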

datasets/utils_data.py

Lines changed: 24 additions & 0 deletions
```
import numpy as np
import torch


def filter_class(dataset, classes):
    # Keep only the samples whose label is in `classes`, mutating the
    # dataset's `data` and `targets` in place (samples are regrouped
    # class-by-class, so the original ordering is not preserved).
    data, labels = dataset.data, dataset.targets
    if type(labels) == list:
        labels = torch.tensor(labels)
    data_filter = []
    labels_filter = []
    for _class in classes:
        idx = labels == _class
        data_filter.append(data[idx])
        labels_filter.append(labels[idx])
    if type(dataset.data) == np.ndarray:
        dataset.data = np.vstack(data_filter)
        dataset.targets = np.hstack(labels_filter)
    elif type(dataset.data) == torch.Tensor:
        dataset.data = torch.cat(data_filter)
        dataset.targets = torch.cat(labels_filter)
    else:
        raise TypeError('dataset.data type neither np.ndarray nor torch.Tensor')
    return dataset, len(classes)
```
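
A minimal, self-contained check of `filter_class` using a stand-in object with tensor `data`/`targets` (hypothetical, just to illustrate the contract; any object with those two attributes works):

```
import torch
from types import SimpleNamespace

# Stand-in for a torchvision dataset: 6 samples with labels 0..2.
ds = SimpleNamespace(data=torch.arange(6).view(6, 1),
                     targets=torch.tensor([0, 1, 2, 0, 1, 2]))
ds, n = filter_class(ds, [0, 2])
print(n, ds.targets.tolist())  # 2 [0, 0, 2, 2]
```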

evaluate.py

Lines changed: 174 additions & 0 deletions
```
import numpy as np
import scipy.stats as sps
import torch

from sklearn.svm import LinearSVC, SVC
from sklearn.decomposition import PCA
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import SGDClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

import functional as F
import utils


def evaluate(eval_dir, method, train_features, train_labels, test_features, test_labels, **kwargs):
    # Dispatch to the requested classifier and save the accuracies to eval_dir.
    if method == 'svm':
        acc_train, acc_test = svm(train_features, train_labels, test_features, test_labels)
    elif method == 'knn':
        acc_train, acc_test = knn(train_features, train_labels, test_features, test_labels, **kwargs)
    elif method == 'nearsub':
        acc_train, acc_test = nearsub(train_features, train_labels, test_features, test_labels, **kwargs)
    elif method == 'nearsub_pca':
        # nearsub_pca scores the test set only, so no train accuracy is reported.
        acc_train = None
        acc_test = nearsub_pca(train_features, train_labels, test_features, test_labels, **kwargs)
    acc_dict = {'train': acc_train, 'test': acc_test}
    utils.save_params(eval_dir, acc_dict, name=f'acc_{method}')

def svm(train_features, train_labels, test_features, test_labels):
    svm = LinearSVC(verbose=0, random_state=10)
    svm.fit(train_features, train_labels)
    acc_train = svm.score(train_features, train_labels)
    acc_test = svm.score(test_features, test_labels)
    print("SVM: {}, {}".format(acc_train, acc_test))
    return acc_train, acc_test

# An earlier numpy-based knn, kept commented out in the original:
# def knn(train_features, train_labels, test_features, test_labels, k=5):
#     sim_mat = train_features @ train_features.T
#     topk = torch.from_numpy(sim_mat).topk(k=k, dim=0)
#     topk_pred = train_labels[topk.indices]
#     test_pred = torch.tensor(topk_pred).mode(0).values.detach()
#     acc_train = compute_accuracy(test_pred.numpy(), train_labels)

#     sim_mat = train_features @ test_features.T
#     topk = torch.from_numpy(sim_mat).topk(k=k, dim=0)
#     topk_pred = train_labels[topk.indices]
#     test_pred = torch.tensor(topk_pred).mode(0).values.detach()
#     acc_test = compute_accuracy(test_pred.numpy(), test_labels)
#     print("kNN: {}, {}".format(acc_train, acc_test))
#     return acc_train, acc_test

def knn(train_features, train_labels, test_features, test_labels, k=5):
    # k-nearest-neighbor classification with inner-product similarity;
    # expects torch tensors. Each sample's prediction is the mode of the
    # labels of its k most similar training samples.
    sim_mat = train_features @ train_features.T
    topk = sim_mat.topk(k=k, dim=0)
    topk_pred = train_labels[topk.indices]
    test_pred = topk_pred.mode(0).values.detach()
    acc_train = compute_accuracy(test_pred, train_labels)

    sim_mat = train_features @ test_features.T
    topk = sim_mat.topk(k=k, dim=0)
    topk_pred = train_labels[topk.indices]
    test_pred = topk_pred.mode(0).values.detach()
    acc_test = compute_accuracy(test_pred, test_labels)
    print("kNN: {}, {}".format(acc_train, acc_test))
    return acc_train, acc_test

# TODO: 1. implement pytorch version 2. support batches
# (an earlier numpy/TruncatedSVD nearsub, kept commented out in the original):
# def nearsub(train_features, train_labels, test_features, test_labels, num_classes, n_comp=10, return_pred=False):
#     train_scores, test_scores = [], []
#     classes = np.arange(num_classes)
#     features_sort, _ = utils.sort_dataset(train_features, train_labels,
#                                           classes=classes, stack=False)
#     fd = features_sort[0].shape[1]
#     if n_comp >= fd:
#         n_comp = fd - 1
#     for j in classes:
#         svd = TruncatedSVD(n_components=n_comp).fit(features_sort[j])
#         subspace_j = np.eye(fd) - svd.components_.T @ svd.components_
#         train_j = subspace_j @ train_features.T
#         test_j = subspace_j @ test_features.T
#         train_scores_j = np.linalg.norm(train_j, ord=2, axis=0)
#         test_scores_j = np.linalg.norm(test_j, ord=2, axis=0)
#         train_scores.append(train_scores_j)
#         test_scores.append(test_scores_j)
#     train_pred = np.argmin(train_scores, axis=0)
#     test_pred = np.argmin(test_scores, axis=0)
#     if return_pred:
#         return train_pred.tolist(), test_pred.tolist()
#     train_acc = compute_accuracy(classes[train_pred], train_labels)
#     test_acc = compute_accuracy(classes[test_pred], test_labels)
#     print('SVD: {}, {}'.format(train_acc, test_acc))
#     return train_acc, test_acc

def nearsub(train_features, train_labels, test_features, test_labels,
            num_classes, n_comp=10, return_pred=False):
    # Nearest-subspace classification: fit an n_comp-dimensional subspace to
    # each class and assign each sample to the class with the smallest
    # residual after projecting out that subspace.
    train_scores, test_scores = [], []
    classes = np.arange(num_classes)
    features_sort, _ = utils.sort_dataset(train_features, train_labels,
                                          classes=classes, stack=False)
    fd = features_sort[0].shape[1]
    for j in classes:
        _, _, V = torch.svd(features_sort[j])
        components = V[:, :n_comp].T
        subspace_j = torch.eye(fd) - components.T @ components
        train_j = subspace_j @ train_features.T
        test_j = subspace_j @ test_features.T
        train_scores_j = torch.linalg.norm(train_j, ord=2, dim=0)  # vector 2-norm of each column (residual per sample)
        test_scores_j = torch.linalg.norm(test_j, ord=2, dim=0)
        train_scores.append(train_scores_j)
        test_scores.append(test_scores_j)
    train_pred = torch.stack(train_scores).argmin(0)
    test_pred = torch.stack(test_scores).argmin(0)
    if return_pred:
        return train_pred.numpy(), test_pred.numpy()
    train_acc = compute_accuracy(classes[train_pred.numpy()], train_labels.numpy())
    test_acc = compute_accuracy(classes[test_pred.numpy()], test_labels.numpy())
    print('SVD: {}, {}'.format(train_acc, test_acc))
    return train_acc, test_acc

def nearsub_pca(train_features, train_labels, test_features, test_labels, num_classes, n_comp=10):
    # Same idea as nearsub, but numpy/PCA-based, mean-centered, and scoring
    # the test set only.
    scores_pca = []
    classes = np.arange(num_classes)
    features_sort, _ = utils.sort_dataset(train_features, train_labels, classes=classes, stack=False)
    fd = features_sort[0].shape[1]
    if n_comp >= fd:
        n_comp = fd - 1
    for j in np.arange(len(classes)):
        pca = PCA(n_components=n_comp).fit(features_sort[j])
        pca_subspace = pca.components_.T
        mean = np.mean(features_sort[j], axis=0)
        pca_j = (np.eye(fd) - pca_subspace @ pca_subspace.T) \
                @ (test_features - mean).T
        score_pca_j = np.linalg.norm(pca_j, ord=2, axis=0)
        scores_pca.append(score_pca_j)
    test_predict_pca = np.argmin(scores_pca, axis=0)
    acc_pca = compute_accuracy(classes[test_predict_pca], test_labels)
    print('PCA: {}'.format(acc_pca))
    return acc_pca

def argmax(train_features, train_labels, test_features, test_labels):
    train_pred = train_features.argmax(1)
    train_acc = compute_accuracy(train_pred, train_labels)
    test_pred = test_features.argmax(1)
    test_acc = compute_accuracy(test_pred, test_labels)
    return train_acc, test_acc

def compute_accuracy(y_pred, y_true):
    """Compute accuracy by counting correct classifications."""
    assert y_pred.shape == y_true.shape
    if type(y_pred) == torch.Tensor:
        n_wrong = torch.count_nonzero(y_pred - y_true).item()
    elif type(y_pred) == np.ndarray:
        n_wrong = np.count_nonzero(y_pred - y_true)
    else:
        raise TypeError("Not Tensor nor Array type.")
    n_samples = len(y_pred)
    return 1 - n_wrong / n_samples

def baseline(train_features, train_labels, test_features, test_labels):
    # Scikit-learn baselines, fit on train features and scored on test features.
    # Note: loss='log' was renamed to 'log_loss' in newer scikit-learn releases.
    test_models = {'log_l2': SGDClassifier(loss='log', max_iter=10000, random_state=42),
                   'SVM_linear': LinearSVC(max_iter=10000, random_state=42),
                   'SVM_RBF': SVC(kernel='rbf', random_state=42),
                   'DecisionTree': DecisionTreeClassifier(),
                   'RandomForest': RandomForestClassifier()}
    for model_name in test_models:
        test_model = test_models[model_name]
        test_model.fit(train_features, train_labels)
        score = test_model.score(test_features, test_labels)
        print(f"{model_name}: {score}")

def majority_vote(pred, true):
    pred_majority = sps.mode(pred, axis=0)[0].squeeze()
    return compute_accuracy(pred_majority, true)
```
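
A quick sanity check of the `knn` routine above on random features (shapes and values are illustrative; note that this version assumes torch tensors, since it calls `.topk()` and `.mode()` directly):

```
import torch

torch.manual_seed(0)
train_Z = torch.randn(1000, 128)   # stand-in for ReduNet features
train_y = torch.randint(0, 10, (1000,))
test_Z = torch.randn(200, 128)
test_y = torch.randint(0, 10, (200,))

acc_train, acc_test = knn(train_Z, train_y, test_Z, test_y, k=5)
# On random features both accuracies should hover near chance (about 0.1);
# the train-side score is mildly inflated because each sample can match itself.
```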
