gone

A simple neural network library in Go from scratch. 0 dependencies*

*There are 0 neural-network-related dependencies; the only dependency, golang/protobuf, is used to persist the weights to a file.

Example

Getting started

func main() {
    // 2 input nodes, 4 hidden nodes with a sigmoid activation, 1 output node
    g := gone.New(
        0.1,
        gone.MSE(),
        gone.Layer{
            Nodes: 2,
        },
        gone.Layer{
            Nodes:     4,
            Activator: gone.Sigmoid(),
        },
        gone.Layer{
            Nodes: 1,
        },
    )

    // Train on the XOR truth table
    g.Train(gone.SGD(), gone.DataSet{
        {
            Inputs:  []float64{1, 0},
            Targets: []float64{1},
        },
        {
            Inputs:  []float64{0, 1},
            Targets: []float64{1},
        },
        {
            Inputs:  []float64{1, 1},
            Targets: []float64{0},
        },
        {
            Inputs:  []float64{0, 0},
            Targets: []float64{0},
        },
    }, 5000)

    g.Predict([]float64{1, 1})
}

Saving model to disk

 g.Save("test.gone")

Loading model back into memory

 g, err := gone.Load("test.gone")

TODO

gone/

  • Types of task:
    • Classification - softmax (soon to be implemented) as the last layer's activation function
    • Regression - sigmoid as the last layer's activation function
  • Bias
    • Matrix, rather than a single number
  • Feedforward (Predict)
  • Train
    • Support shuffling the data
    • Epochs
    • Backpropagation
    • Batching
    • Different loss functions
      • Mean Squared Error
      • Cross Entropy Error
  • Saving data - Done thanks to protobuf
  • Loading data
  • Adam optimizer
  • Nesterov + Momentum for GD
  • Fix MSE computation in debug mode (not used in actual backpropagation)
  • Somehow persist configurations for Activation, Loss and Optimizer functions in the protobuf messages (???; if we want to do it the way TensorFlow does, we'd have to use interface{} and type assertions)
  • Convolutional Layers
    • Flatten layer
  • Copy
  • Crossover
  • Mutate
    • Gaussian Mutator
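
The shuffling, epoch and batching items under Train can be pictured with a small standalone sketch. It does not use gone's own types or its `Train` method; `Sample` and `Batches` are hypothetical names made up for this illustration, and the seed-per-epoch scheme is just one simple way to get a different order on every pass.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Sample is a hypothetical stand-in for one training example.
type Sample struct {
	Inputs  []float64
	Targets []float64
}

// Batches shuffles a copy of data (seeded for reproducibility) and splits it
// into consecutive batches of at most size samples. A size <= 0 means one
// batch containing everything.
func Batches(data []Sample, size int, seed int64) [][]Sample {
	if size <= 0 {
		size = len(data)
	}
	// Work on a copy so the caller's slice keeps its order.
	shuffled := make([]Sample, len(data))
	copy(shuffled, data)
	rng := rand.New(rand.NewSource(seed))
	rng.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})

	var batches [][]Sample
	for start := 0; start < len(shuffled); start += size {
		end := start + size
		if end > len(shuffled) {
			end = len(shuffled)
		}
		batches = append(batches, shuffled[start:end])
	}
	return batches
}

func main() {
	data := []Sample{
		{Inputs: []float64{1, 0}, Targets: []float64{1}},
		{Inputs: []float64{0, 1}, Targets: []float64{1}},
		{Inputs: []float64{1, 1}, Targets: []float64{0}},
		{Inputs: []float64{0, 0}, Targets: []float64{0}},
	}
	for epoch := 0; epoch < 2; epoch++ {
		// A different seed per epoch gives a different order each pass.
		for i, batch := range Batches(data, 3, int64(epoch)) {
			fmt.Printf("epoch %d, batch %d: %d samples\n", epoch, i, len(batch))
		}
	}
}
```

With a batch size of 1 this degenerates to SGD-style updates, and with a batch size equal to the dataset length it becomes full-batch gradient descent.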

matrix/

NOTE: all of this was migrated to github.com/fr3fou/matrigo

  • Randomize
  • Transpose
  • Scale
  • AddMatrix
  • Add
  • SubtractMatrix
  • Subtract
  • Multiply
  • Flatten
  • Unflatten
  • NewFromArray - makes a single row
  • Map
  • Fold
  • Methods to support chaining
n.Weights[i].
    Multiply(output).                         // weighted sum of the previous layer
    Add(n.Layers[i+1].Bias).                  // bias
    Map(func(val float64, x, y int) float64 { // activation
      return n.Layers[i+1].Activator.F(val)
    })
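
For readers without the matrix package at hand, the same three steps (weighted sum, bias, activation) can be written with plain slices. This is only a sketch of the idea and not the matrigo API; `Forward` and `sigmoid` are hypothetical helpers, and the row-per-output-node weight layout is an assumption made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// sigmoid squashes any real number into the range (0, 1).
func sigmoid(x float64) float64 {
	return 1 / (1 + math.Exp(-x))
}

// Forward computes activation(weights*input + bias) for a single layer.
// weights has one row per output node and one column per input node.
func Forward(weights [][]float64, bias, input []float64, activation func(float64) float64) []float64 {
	out := make([]float64, len(weights))
	for r, row := range weights {
		sum := bias[r] // start from the bias of this output node
		for c, w := range row {
			sum += w * input[c] // weighted sum of the previous layer
		}
		out[r] = activation(sum)
	}
	return out
}

func main() {
	weights := [][]float64{
		{0.5, -0.5},
		{1, 1},
	}
	bias := []float64{0, -1}
	fmt.Println(Forward(weights, bias, []float64{1, 1}, sigmoid))
}
```

A full prediction would repeat this step once per layer, feeding each layer's output in as the next layer's input.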

Research

  • Derivatives ~
  • Partial Derivatives ~
  • Linear vs non-linear problems (activation function)
  • Gradient Descent
    • (Batch) Gradient Descent (GD)
    • Stochastic Gradient Descent (SGD)
    • Mini-Batch Gradient Descent (MBGD?)
  • Softmax (needed for multi class classification!)
  • Mean Squared Error
  • Cross Entropy Error (needed for multi class classification!)
  • How to determine how many layers and nodes to use
  • One Hot Encoding
  • Convolutional Layers
  • Reinforcement learning
  • Genetic Algorithms~
  • Neuroevolution~
  • Simulated Annealing
  • Q-Learning
  • Linear vs Logistic Regression
  • 3D inputs (regarding Video and CNNs)
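
Several of the topics above (One Hot Encoding, Softmax, Cross Entropy Error, Mean Squared Error) fit together in a few lines. The sketch below uses the textbook definitions and is not taken from gone's source; all function names are hypothetical, and the small epsilon inside the logarithm is an assumption added to avoid `log(0)`.

```go
package main

import (
	"fmt"
	"math"
)

// OneHot returns a vector of length classes with a 1 at index label.
func OneHot(label, classes int) []float64 {
	v := make([]float64, classes)
	v[label] = 1
	return v
}

// Softmax turns raw scores into probabilities that sum to 1.
// Subtracting the largest score first keeps math.Exp from overflowing.
func Softmax(scores []float64) []float64 {
	maxScore := scores[0]
	for _, s := range scores[1:] {
		if s > maxScore {
			maxScore = s
		}
	}
	out := make([]float64, len(scores))
	sum := 0.0
	for i, s := range scores {
		out[i] = math.Exp(s - maxScore)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

// CrossEntropy returns -sum(target[i] * log(predicted[i])).
func CrossEntropy(predicted, target []float64) float64 {
	const eps = 1e-12 // assumption: guards against log(0)
	loss := 0.0
	for i, t := range target {
		loss -= t * math.Log(predicted[i]+eps)
	}
	return loss
}

// MSE returns the mean of the squared differences.
func MSE(predicted, target []float64) float64 {
	sum := 0.0
	for i, t := range target {
		d := predicted[i] - t
		sum += d * d
	}
	return sum / float64(len(target))
}

func main() {
	probs := Softmax([]float64{2, 1, 0.1})
	target := OneHot(0, 3)
	fmt.Println("probabilities:", probs)
	fmt.Println("cross entropy:", CrossEntropy(probs, target))
	fmt.Println("mse:", MSE(probs, target))
}
```

Because the target is one-hot, the cross entropy reduces to the negative log of the probability assigned to the correct class.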

Questions

These are some (stupid) questions I have that confuse me:

  • Is Neuroevolution considered Reinforcement learning?
  • How is training done with HUGE datasets when they can't fit on your storage device?
    • Imagine your dataset is a couple of TB in size, what do you do?
  • Is Q-Learning only done with a single agent (unlike genetic algorithms / neuroevolution)?
    • Is Q-Learning the only method for Reinforcement Learning?
  • What's the difference between a Convolutional Neuron and a normal weight matrix?
  • Is Deep Learning really just a Neural Network with a lot of layers? (more than 2)
  • Why do you need multiple CNN layers? Is it to go to a smaller and smaller version of the image? (when working with images that is) (because of MaxPooling?) Why can't you go directly to the smallest size (512x512 -> 16x16 vs 512x512 -> 256x256 -> 128x128 -> ...)?
  • So if images are stored in a 2D array (but with the RGB channels, making it a 3D array with 3 layers), do we use Conv2D or Conv3D?
  • If 3D inputs are used for videos, how is that represented? Is a single input basically an array of 2D arrays (array of images - frames)? So basically a single observation is a single video and your entire dataset is a lot of videos, right?
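
The size progression mentioned in the MaxPooling question (512x512 -> 256x256 -> 128x128 -> ...) can be made concrete with a tiny sketch. It assumes non-overlapping 2x2 max pooling on a single-channel image with even dimensions, which is one common configuration and not necessarily the one any given network uses; `MaxPool2x2` and `PoolSteps` are hypothetical helpers. It only shows the mechanics and does not settle the "why" of the question.

```go
package main

import "fmt"

// MaxPool2x2 keeps the largest value of every non-overlapping 2x2 window,
// halving both dimensions. It assumes an even height and width.
func MaxPool2x2(img [][]float64) [][]float64 {
	out := make([][]float64, len(img)/2)
	for r := range out {
		out[r] = make([]float64, len(img[0])/2)
		for c := range out[r] {
			m := img[2*r][2*c]
			for _, v := range []float64{img[2*r][2*c+1], img[2*r+1][2*c], img[2*r+1][2*c+1]} {
				if v > m {
					m = v
				}
			}
			out[r][c] = m
		}
	}
	return out
}

// PoolSteps counts how many halvings take a side length from `from` down to `to`.
func PoolSteps(from, to int) int {
	steps := 0
	for from > to && from > 1 {
		from /= 2
		steps++
	}
	return steps
}

func main() {
	img := [][]float64{
		{1, 2, 5, 6},
		{3, 4, 7, 8},
		{9, 8, 1, 0},
		{7, 6, 2, 3},
	}
	fmt.Println(MaxPool2x2(img))
	fmt.Println("halvings from 512 to 16:", PoolSteps(512, 16))
}
```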

Examples

Shoutouts

  • David Josephs - was of HUGE help with algebra and other ML-related questions; also helped me spot some nasty bugs!

References

Note: some of the references weren't used during development, but are in this section as they were helpful guidance throughout my AI journey