Project 2: Saket Karve #8

Open
wants to merge 29 commits into base: master

Changes from all commits (29 commits)
5c47625
CPU done.
Sep 12, 2019
70af28d
All stream compaction done except work-efficient compact
Sep 13, 2019
51f7083
Completed Stream Compaction code
Sep 14, 2019
486cec1
Forward pass without softmax
Sep 14, 2019
c751afd
MLP code complete
Sep 15, 2019
b3649b5
MLP complete. Tested with XOR.
Sep 15, 2019
fa086be
Working code for MLP. Stream Compaction main to be changed
Sep 16, 2019
234f864
All working code
Sep 17, 2019
891e84a
Update README.md
karvesaket Sep 17, 2019
9de3dde
Merge pull request #1 from karvesaket/patch-1
karvesaket Sep 17, 2019
5c32ac9
Update README.md
karvesaket Sep 17, 2019
15c2cc2
Update README.md
karvesaket Sep 17, 2019
76facb3
Add files via upload
karvesaket Sep 17, 2019
2374a1b
Update README.md
karvesaket Sep 17, 2019
8851699
Add files via upload
karvesaket Sep 17, 2019
792f129
Add files via upload
karvesaket Sep 17, 2019
5d098a1
Update README.md
karvesaket Sep 17, 2019
3e54a13
Add files via upload
karvesaket Sep 17, 2019
50d5cca
Update README.md
karvesaket Sep 17, 2019
4ca910b
Updated main
Sep 17, 2019
fa0f593
Merge branch 'master' of https://github.com/karvesaket/Project2-Numbe…
Sep 17, 2019
1089890
Pushed build dir
Sep 17, 2019
572886e
Update README.md
karvesaket Sep 17, 2019
3b2233a
Update README.md
karvesaket Sep 18, 2019
c9efc8f
Update README.md
karvesaket Sep 18, 2019
6257c6b
Radix sort
Sep 18, 2019
6c54c4d
Merge branch 'master' of https://github.com/karvesaket/Project2-Numbe…
Sep 18, 2019
4c712dd
Update README.md
karvesaket Sep 18, 2019
b0e0bd7
Shared Memory
Sep 28, 2019
3 changes: 3 additions & 0 deletions .vs/ProjectSettings.json
@@ -0,0 +1,3 @@
{
"CurrentProjectSetting": "x64-Debug (default)"
}
10 changes: 10 additions & 0 deletions .vs/VSWorkspaceState.json
@@ -0,0 +1,10 @@
{
"ExpandedNodes": [
"",
"\\Project2-Stream-Compaction",
"\\Project2-Stream-Compaction\\src",
"\\Project2-Stream-Compaction\\stream_compaction"
],
"SelectedNode": "\\Project2-Stream-Compaction\\src\\main.cpp",
"PreviewInSolutionExplorer": false
}
11 changes: 11 additions & 0 deletions .vs/launch.vs.json
@@ -0,0 +1,11 @@
{
"version": "0.2.1",
"defaults": {},
"configurations": [
{
"type": "default",
"project": "Project2-Stream-Compaction\\CMakeLists.txt",
"name": "CMakeLists.txt"
}
]
}
Binary file added .vs/slnx.sqlite
Binary file not shown.
1 change: 0 additions & 1 deletion Project2-Character-Recognition/.gitignore
@@ -5,7 +5,6 @@ cis565_getting_started_generated_kernel*
*.sln
*.vcxproj
*.xcodeproj
build

# Created by https://www.gitignore.io/api/linux,osx,sublimetext,windows,jetbrains,vim,emacs,cmake,c++,cuda,visualstudio,webstorm,eclipse,xcode

3 changes: 3 additions & 0 deletions Project2-Character-Recognition/CMakeLists.txt
@@ -22,6 +22,7 @@ if(${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
endif()

include_directories(.)
link_directories(${CUDA_TOOLKIT_ROOT_DIR}/lib/x64)
add_subdirectory(character_recognition)

cuda_add_executable(${CMAKE_PROJECT_NAME}
@@ -31,5 +32,7 @@ cuda_add_executable(${CMAKE_PROJECT_NAME}

target_link_libraries(${CMAKE_PROJECT_NAME}
character_recognition
cublas
curand
${CORELIBS}
)
107 changes: 100 additions & 7 deletions Project2-Character-Recognition/README.md
@@ -1,14 +1,107 @@
CUDA Character Recognition
======================

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 2**
**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 2**

* (TODO) YOUR NAME HERE
* (TODO) [LinkedIn](), [personal website](), [twitter](), etc.
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Saket Karve
* [LinkedIn](https://www.linkedin.com/in/saket-karve-43930511b/), [twitter](), etc.
* Tested on: Windows 10 Education, Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz 16GB, NVIDIA Quadro P1000 @ 4GB (Moore 100B Lab)

### (TODO: Your README)
### Description

This repository contains a CUDA implementation of a multi-layer perceptron (MLP). The network has a single hidden layer and can be trained to learn mappings between inputs and labels. Inputs to the network must be one-dimensional. The network can be trained on a dataset for a classification task; once trained, a forward pass produces predictions for any input (even inputs not in the training set).

The input size, the hidden layer size (number of perceptron units in the layer), and the output size (the number of classes to classify into) can all be configured. The network uses non-linear activation functions in the hidden units, and the output layer converts the final scores into probabilities (confidence values) for each class.

### The Neural Network Architecture

The complete neural network architecture implemented is pictured below:

![](img/mlp.png)
[Image reference](https://www.google.com/url?sa=i&source=images&cd=&ved=2ahUKEwjFj--K3djkAhWsq1kKHWfaCkAQjRx6BAgBEAQ&url=https%3A%2F%2Fwww.cc.gatech.edu%2F~san37%2Fpost%2Fdlhc-fnn%2F&psig=AOvVaw3n6z_jJ1Gt-TEhuC_wXEFM&ust=1568839521037348)

The network implemented has a single hidden layer whose size can be configured when initializing the network. The input layer has a size equal to the dimension of each input, and the output layer has a size equal to the number of classes used for classification.

A non-linear function must be applied after each layer up to the last one; otherwise the network would only ever learn a linear function. I have therefore used the sigmoid activation function after the hidden layer. The sigmoid function has the shape shown in the figure below.

![](img/sigmoid.jpg)
[Image reference](https://www.google.com/url?sa=i&source=images&cd=&ved=2ahUKEwjgrNPqztjkAhVow1kKHeCzBIIQjRx6BAgBEAQ&url=https%3A%2F%2Ftwitter.com%2Fhashtag%2Fsigmoid&psig=AOvVaw3DJLoIr81ZD90Mq1ZBwYQj&ust=1568835764647240)
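
As a rough illustration (not necessarily the kernel used in this repository), the element-wise sigmoid over a layer's pre-activations can be written as a simple CUDA kernel; the kernel name and launch configuration here are illustrative only.

```
// Sketch only: element-wise sigmoid over n pre-activation values.
#include <cuda_runtime.h>
#include <cmath>

__global__ void kernSigmoid(int n, const float *in, float *out) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;
    out[idx] = 1.0f / (1.0f + expf(-in[idx]));
}

// Launched over all n hidden-layer values, e.g.:
// kernSigmoid<<<(n + blockSize - 1) / blockSize, blockSize>>>(n, dev_hidden_pre, dev_hidden);
```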

At the last layer, the softmax function transforms the outputs into probabilities, ensuring that the outputs across all classes sum to 1.
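
A row-wise softmax (one row of class scores per input instance) can be sketched as below; subtracting the row maximum before exponentiating is a standard trick to avoid overflow in ```expf```. The one-thread-per-row layout is an assumption, not necessarily how this repository implements it.

```
// Sketch only: one thread per row; each of the n rows holds c class scores.
#include <cuda_runtime.h>
#include <cmath>

__global__ void kernSoftmaxRows(int n, int c, const float *scores, float *probs) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n) return;

    // Find the row maximum for numerical stability.
    float maxVal = scores[row * c];
    for (int j = 1; j < c; j++) {
        maxVal = fmaxf(maxVal, scores[row * c + j]);
    }

    // Exponentiate the shifted scores and accumulate their sum.
    float sum = 0.0f;
    for (int j = 0; j < c; j++) {
        float e = expf(scores[row * c + j] - maxVal);
        probs[row * c + j] = e;
        sum += e;
    }

    // Normalize so the row sums to 1.
    for (int j = 0; j < c; j++) {
        probs[row * c + j] /= sum;
    }
}
```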

The symbols used subsequently will mean the following,
- N : Number of instances in the data set
- H : Size of the hidden layer
- C : Number of classes
- F : Dimension of each input instance
- W1 : Weight matrix between input and hidden layer
- W2 : Weight matrix between hidden and output layer

The dimensions of each layer and weight matrix are as follows (a shape sketch in code follows the list):
- Input Layer : N x F
- Hidden Layer : N x H
- Output Layer : N x C
- W1 : F x H
- W2 : H x C
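
To make these shapes concrete, both layer products (N x F times F x H, then N x H times H x C) can be covered by one naive row-major matrix-multiply kernel like the sketch below; the actual implementation may instead use cuBLAS or a tiled shared-memory kernel.

```
// Sketch only: out = A (rows x inner) * B (inner x cols), all row-major.
//   hidden_pre (N x H) = input (N x F)  * W1 (F x H)
//   output_pre (N x C) = hidden (N x H) * W2 (H x C)
#include <cuda_runtime.h>

__global__ void kernMatMul(int rows, int inner, int cols,
                           const float *A, const float *B, float *out) {
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (r >= rows || c >= cols) return;

    float acc = 0.0f;
    for (int k = 0; k < inner; k++) {
        acc += A[r * inner + k] * B[k * cols + c];
    }
    out[r * cols + c] = acc;
}
```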

### Training Process

Before training, the network parameters described in the section above need to be initialized. This is done with the ```initialize_network()``` method.
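
Since the build links cuRAND, one plausible way ```initialize_network()``` could fill W1 and W2 with small random values is sketched below; the generator choice, seed, and scaling are assumptions rather than what the repository necessarily does.

```
// Sketch only: fill a device weight buffer with uniform values in [-0.5, 0.5).
#include <curand.h>
#include <cuda_runtime.h>

__global__ void kernShiftScale(int n, float *w) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) w[i] = w[i] - 0.5f;   // curandGenerateUniform returns values in (0, 1]
}

void randomInitWeights(float *d_w, int n) {
    curandGenerator_t gen;
    curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT);
    curandSetPseudoRandomGeneratorSeed(gen, 1234ULL);
    curandGenerateUniform(gen, d_w, n);                 // uniform in (0, 1]
    kernShiftScale<<<(n + 127) / 128, 128>>>(n, d_w);   // recenter around zero
    curandDestroyGenerator(gen);
}
```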

Once the network is initialized, the ```train()``` function can be called to perform training for as many epochs (iterations) as the user specifies.

**Backpropagation**

Each training iteration performs the following operations in order (the weight-update step is sketched in code after the list):
- Forward pass through the network - this will generate the values of the output layer (probabilities for each class)
- Compute the loss for the pass
- Compute the gradients of the loss with respect to each weight
- Update all the weights
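
The weight-update step at the end of each iteration is the easiest piece to show concretely; a per-element gradient-descent kernel might look like the sketch below (the kernel name, launch configuration, and buffer names are assumptions, not the repository's exact code).

```
// Sketch only: SGD update w <- w - lr * dw over all n weights of one matrix.
#include <cuda_runtime.h>

__global__ void kernSgdStep(int n, float lr, float *w, const float *dw) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    w[i] -= lr * dw[i];
}

// Called once per weight matrix per iteration, e.g. for W1 (F x H):
// kernSgdStep<<<(F * H + 127) / 128, 128>>>(F * H, learningRate, d_W1, d_gradW1);
```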

### Making predictions

A forward pass through the network for any input image outputs the predicted class, along with the confidence with which the network assigns the input to that class.
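
Reading off the prediction is then just an argmax over the softmax outputs of that instance; a minimal host-side sketch (after copying the probabilities back from the device) could be the following, where the function name is hypothetical.

```
// Sketch only: pick the predicted class and its confidence for one instance.
#include <utility>
#include <vector>

std::pair<int, float> predict(const std::vector<float> &probs) {  // probs has one entry per class
    int best = 0;
    for (int j = 1; j < (int)probs.size(); j++) {
        if (probs[j] > probs[best]) best = j;
    }
    return { best, probs[best] };  // (predicted class, confidence)
}
```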

### Analysis of the training process

I tested the implementation first on the XOR function. The network converged in around 1500 epochs; after training for that many iterations, the network gave accurate predictions for the XOR function essentially 100% of the time.

The architecture of the network used for this test is shown in the following figure:

![](img/xor_network.svg)

- Number of epochs for convergence = 1500
- Final Loss =
- Test accuracy = 100%

The training plot of the network tested on the XOR function can be seen in the following figure.

![](img/training_curve_xor.PNG)

After testing on the XOR function, I loaded the character recognition dataset and trained the network on it.

With this architecture, the network converged in around 30 epochs; I trained it for 50 epochs with 200 hidden units.

- Number of epochs for convergence = 30
- Final Loss =
- Test accuracy = 100%

The training plot of the network on the character recognition dataset can be seen in the following figure.

![](img/training_curve_character.PNG)

I also tested the network's performance for different architectures (varying the number of perceptrons in the hidden layer). The following figure shows the comparison.

![](img/training_curve_layers.PNG)

The trained weights can be found [here](img/trained_weights_200_layers.xlsx)

### Testing the network

I tested the network on all the images in the dataset in reverse order, to make sure the network had not simply learned the order of the examples and had actually learned the mapping.

The predictions for all tests are shown in the following figure, which lists the true labels, the predicted classes, and the probabilities associated with the predictions.

![](img/MLP_test_predictions.PNG)


Project2-Character-Recognition/character_recognition/CMakeLists.txt
@@ -7,5 +7,5 @@ set(SOURCE_FILES

cuda_add_library(character_recognition
${SOURCE_FILES}
OPTIONS -arch=sm_20
OPTIONS -arch=sm_61
)
@@ -13,3 +13,4 @@ void checkCUDAErrorFn(const char *msg, const char *file, int line) {
fprintf(stderr, ": %s: %s\n", msg, cudaGetErrorString(err));
exit(EXIT_FAILURE);
}
