In this repository I'm going to:

  • Implement the original architecture of the Basic Block and Bottleneck Block with both identity and projection shortcut connections.
  • Use my implementation to build a simple ResNet 12 and train the model on the CIFAR-10 dataset.

ResNet 12 architecture


Residual Blocks


We can think of neural networks as universal function approximators: given enough capacity they can approximate a desired mapping, and accuracy generally improves as we add more layers.
But increasing the number of layers brings problems such as vanishing and exploding gradients and the curse of dimensionality; moreover, the accuracy saturates at some point and eventually degrades.

A sufficiently deep plain network may fail to learn even a simple function such as the identity mapping.

The idea behind the residual block is that, instead of hoping each few stacked layers directly fit a desired underlying mapping H(x), we explicitly let these layers fit a residual mapping F(x) = H(x) - x. The original mapping H(x) thus becomes F(x) + x.

Shortcut connections
Shortcut connections are connections that skip one or more layers. The formulation F(x) + x can be realized by feedforward neural networks with “shortcut connections”.
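As a conceptual sketch in plain Python (with `stacked_layers` standing in for whatever layers the block contains; the function name is illustrative, not code from this repository), the forward pass of a residual block is simply:

```python
# F(x) is the residual mapping learned by the stacked layers;
# the identity shortcut adds the input back, giving H(x) = F(x) + x.
def residual_forward(x, stacked_layers):
    out = stacked_layers(x)  # F(x)
    return out + x           # H(x) = F(x) + x
```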

Why deep residual framework?
The idea is motivated by the degradation problem (training error increases as depth increases). If the added layers can be constructed as identity mappings, a deeper model should have a training error no greater than that of its shallower counterpart.

If identity mappings are optimal, it is easier to drive F(x) towards 0 than to fit H(x) to x (as the degradation problem suggests).

Paper
Paper Summary

Residual Block

I'm going to implement the Basic and Bottleneck residual blocks only:


  • Basic Block: Includes 2 operations:
    • 3x3 convolution with padding followed by BatchNorm and ReLU.
    • 3x3 convolution with padding followed by BatchNorm.
  • Bottleneck Block: Includes 3 operations:
    • 1x1 convolution followed by BatchNorm and ReLU.
    • 3x3 convolution with stride followed by BatchNorm and ReLU.
    • 1x1 convolution followed by BatchNorm.

The whole structure is called one block (layer), and it consists of multiple layers (Conv, BN, ReLU); a minimal sketch of the Basic Block is given below.

Residual blocks: the building blocks of ResNet
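Below is a minimal sketch of how the Basic Block could look, assuming PyTorch (the `BasicBlock` class name, its arguments, and the optional `downsample` module are illustrative, not necessarily the exact code in this repository); the Bottleneck Block is analogous, with a 1x1 → 3x3 → 1x1 convolution stack.

```python
import torch.nn as nn

class BasicBlock(nn.Module):
    """Basic residual block: 3x3 Conv-BN-ReLU, then 3x3 Conv-BN,
    followed by the shortcut addition and a final ReLU."""
    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # Optional projection shortcut, used when the input and output
        # dimensions differ (see the shortcut types below).
        self.downsample = downsample

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))   # first 3x3 conv + BN + ReLU
        out = self.bn2(self.conv2(out))            # second 3x3 conv + BN
        if self.downsample is not None:
            identity = self.downsample(x)          # project x to match out's shape
        return self.relu(out + identity)           # add shortcut, then ReLU
```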

Types of Residual Block

Types of shortcut connections in a residual neural network

The shortcut connections of a residual neural network can be:

• An identity shortcut, which is employed when the input and output have the same dimensions.
• A projection shortcut (a convolution block), used when the dimensions differ: a 1x1 convolution offers channel-wise pooling, often called feature map pooling or a projection layer (see the sketch after this list).
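A minimal sketch of the projection shortcut, again assuming PyTorch (the helper name `projection_shortcut` is illustrative): a 1x1 convolution with the block's stride, followed by BatchNorm, maps the shortcut to the required channel count and spatial size.

```python
import torch.nn as nn

def projection_shortcut(in_channels, out_channels, stride):
    # The 1x1 convolution acts as channel-wise pooling / a projection layer,
    # matching both the channel count and (via stride) the spatial size.
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=1,
                  stride=stride, bias=False),
        nn.BatchNorm2d(out_channels),
    )

# Example: a downsampling stage where channels double and the feature map halves.
# block = BasicBlock(64, 128, stride=2, downsample=projection_shortcut(64, 128, 2))
```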

Deep Residual Learning for Nonlinear Regression