Micallef Homework 2 #14

Open · wants to merge 52 commits into base: master

Changes from all commits (52 commits)
52888bb
Update cpu.cu
micallef25 Sep 10, 2019
6ebaeb5
Update naive.cu
micallef25 Sep 10, 2019
9e3d3ff
Update cpu.cu
micallef25 Sep 10, 2019
88719b3
Update naive.cu
micallef25 Sep 10, 2019
d81cd1a
Update naive.cu
micallef25 Sep 10, 2019
d1096f0
Update naive.cu
micallef25 Sep 10, 2019
d2bb850
Update common.cu
micallef25 Sep 10, 2019
f3d87bc
naive gpu implementation working ... n*bytes ... not just n ....
micallef25 Sep 11, 2019
6d40bd9
thrust implementation working. work efficient working with rounded bu…
micallef25 Sep 11, 2019
62a2223
work efficient working
micallef25 Sep 12, 2019
a05856b
moved scan implementation to make scatter cleaner. also fixed roundin…
micallef25 Sep 12, 2019
d976def
work efficient compact working for pow2
micallef25 Sep 12, 2019
67026c2
all tests passing, time to tidy up
micallef25 Sep 12, 2019
addb0da
check in before flight, collected data for implementations and adjust…
micallef25 Sep 12, 2019
0f36f58
replicated functionality of the excel sheet for XOR data set on gpu a…
micallef25 Sep 16, 2019
57d29c7
Update README.md
micallef25 Sep 17, 2019
6c10be3
Update README.md
micallef25 Sep 17, 2019
7c2bf74
added graph pics
micallef25 Sep 17, 2019
53252dd
Merge branch 'master' of https://github.com/micallef25/Project2-Numbe…
micallef25 Sep 17, 2019
65c4bae
Update README.md
micallef25 Sep 17, 2019
5ae68c0
Update README.md
micallef25 Sep 17, 2019
b9cadc0
Update README.md
micallef25 Sep 17, 2019
849318e
Update README.md
micallef25 Sep 17, 2019
1f0e1ae
Update README.md
micallef25 Sep 17, 2019
61aaa8e
added test result pics
micallef25 Sep 17, 2019
c3d8b5e
Merge branch 'master' of https://github.com/micallef25/Project2-Numbe…
micallef25 Sep 17, 2019
fb5dd95
Update README.md
micallef25 Sep 17, 2019
39788db
Update README.md
micallef25 Sep 17, 2019
241fac7
Update README.md
micallef25 Sep 17, 2019
b370834
Update README.md
micallef25 Sep 17, 2019
f532e40
Update README.md
micallef25 Sep 17, 2019
e0f82bf
Update README.md
micallef25 Sep 17, 2019
1c3116f
from last nights struggles
micallef25 Sep 17, 2019
e30d2da
Merge branch 'master' of https://github.com/micallef25/Project2-Numbe…
micallef25 Sep 17, 2019
5418e3d
stream compaction code + readme set
micallef25 Sep 17, 2019
7757fe5
added framework code
micallef25 Sep 17, 2019
5890229
updating readme
micallef25 Sep 18, 2019
08f5f66
updating readme
micallef25 Sep 18, 2019
7b1d2a5
updating readme
micallef25 Sep 18, 2019
2b7d5ab
updating readme
micallef25 Sep 18, 2019
8bcf907
updating readme
micallef25 Sep 18, 2019
493e5a9
updating readme
micallef25 Sep 18, 2019
94458ac
updating readme
micallef25 Sep 18, 2019
6f6f0a6
updating readme
micallef25 Sep 18, 2019
0e80fa4
updating readme
micallef25 Sep 18, 2019
41b9808
updating readme
micallef25 Sep 18, 2019
533d4ab
updating readme
micallef25 Sep 18, 2019
a8acfe9
updating readme
micallef25 Sep 18, 2019
e529d2a
updating readme
micallef25 Sep 18, 2019
34e80dc
updating readme
micallef25 Sep 18, 2019
d75b218
updating readme
micallef25 Sep 18, 2019
d327f7e
updating readme
micallef25 Sep 18, 2019
37 changes: 31 additions & 6 deletions Project2-Character-Recognition/README.md
@@ -3,12 +3,37 @@ CUDA Character Recognition

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 2**

* (TODO) YOUR NAME HERE
* (TODO) [LinkedIn](), [personal website](), [twitter](), etc.
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Eric Micallef
* [LinkedIn](https://www.linkedin.com/in/eric-micallef-99291714b/)

* Tested on: Windows 10, i5, Nvidia GTX 1660 (Personal)

### (TODO: Your README)
### Analysis

Include analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)
Unfortunately, I will be taking a zero for this part of the project. I implemented some pieces on the GPU and CPU, but hardly enough to warrant any credit.

Thursday morning I flew to Michigan to be in a wedding and came back Monday morning. Needless to say, I underestimated the time this would take.

Instead of grading my project, please enjoy these photos of me from the wedding and a yak I found near my house in Michigan.



* Me trying to build my neural network in one day ... (yes, that is me ... don't lock your keys in your trunk when you are in the middle of nowhere, Michigan)

![](img/sawzall.gif)

* Them feels when you think you know what backpropagation is but then you don't ...

![](img/n1.jpg)

* Me pointing at my working neural network like ...

![](img/nn.jpg)

* A random yak in Michigan?

![](img/yak.jpg#center)

* Finally, it's over ...

![](img/ss.jpg)
@@ -7,5 +7,5 @@ set(SOURCE_FILES

cuda_add_library(character_recognition
${SOURCE_FILES}
OPTIONS -arch=sm_20
OPTIONS -arch=sm_70
)
13 changes: 13 additions & 0 deletions Project2-Character-Recognition/character_recognition/common.cu
@@ -13,3 +13,16 @@ void checkCUDAErrorFn(const char *msg, const char *file, int line) {
fprintf(stderr, ": %s: %s\n", msg, cudaGetErrorString(err));
exit(EXIT_FAILURE);
}

//void memory_debug_float(int elements, float* cuda_mem, float* cpu_mem)
//{
// printf("elements %d\n ", elements);
// cudaMemcpy(cpu_mem, cuda_mem, elements * sizeof(float), cudaMemcpyDeviceToHost);
// checkCUDAErrorFn("debug failed!");
// printf("=============================\n");
// for (int i = 0; i < elements; i++)
// {
// printf("out[%d] %d \n", i, cpu_mem[i]);
// }
// printf("=============================\n");
//}
@@ -17,6 +17,7 @@
* Check for CUDA errors; print and exit if there was a problem.
*/
void checkCUDAErrorFn(const char *msg, const char *file = NULL, int line = -1);
//void memory_debug_float(int elements, float* cuda_mem, float* cpu_mem);

inline int ilog2(int x) {
int lg = 0;
252 changes: 249 additions & 3 deletions Project2-Character-Recognition/character_recognition/mlp.cu
@@ -2,6 +2,13 @@
#include <cuda_runtime.h>
#include "common.h"
#include "mlp.h"
#include <math.h>

#define NUM_LAYERS 2
#define INPUT_NODES 2
#define NUM_WEIGHTS 2

#define blockSize 128

namespace CharacterRecognition {
using Common::PerformanceTimer;
@@ -10,18 +10,257 @@ namespace CharacterRecognition {
static PerformanceTimer timer;
return timer;
}

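// Debug helper: copies `elements` floats from the device buffer back to host memory and prints them.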
void memory_debug_float(int elements, float* cuda_mem, float* cpu_mem)
{
printf("elements %d\n ", elements);
cudaMemcpy(cpu_mem, cuda_mem, elements * sizeof(float), cudaMemcpyDeviceToHost);
checkCUDAErrorFn("debug failed!");
printf("=============================\n");
for (int i = 0; i < elements; i++)
{
printf("out[%d] %f \n", i, cpu_mem[i]);
}
printf("=============================\n");
}


// TODO: __global__

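// One thread per node: multiplies the node's input by its two weights (weights[tid] and weights[tid + n]) and overwrites dev_in[tid] with the sum, in place.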
__global__ void kernel_feed_forward(int n, float* dev_in, float* weights)
{
int tid = (blockIdx.x * blockDim.x) + threadIdx.x;

if (tid >= n)
{
return;
}

float data = dev_in[tid];

dev_in[tid] = (data * weights[tid]) + (data * weights[tid+n]);
}

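// Element-wise sigmoid activation, in place: dev_in[tid] = 1 / (1 + exp(-dev_in[tid])).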
__global__ void kernel_activate(int n, float* dev_in)
{
int tid = (blockIdx.x * blockDim.x) + threadIdx.x;

if (tid >= n)
{
return;
}

float var = (dev_in[tid] * -1);
float e = expf(var);
dev_in[tid] = 1 / (1 + e);
}

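// Host-side sigmoid, used by the CPU reference path below.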
float transfer_function(float in)
{
float var = (in * -1);
float e = exp(var);
return( 1 / (1 + e) );
}

//float transfer_derivative(float var)
//{
// float var = (var * -1);
// float e = exp(var);
// return(1 - (1 / (1 + e)));
//}
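// Note (sketch, not part of the original code): for the sigmoid s(x) = 1 / (1 + exp(-x)),
// the derivative is s(x) * (1 - s(x)). Assuming the already-activated output is passed in,
// a minimal version could be:
//float transfer_derivative(float activated)
//{
//    return activated * (1.0f - activated);
//}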

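// CPU feed-forward: for each node i, accumulate in[i] * weights[i * INPUT_NODES + j] over j, then apply the sigmoid.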
void feed_forward(float* in, float* out, float* weights, int length)
{
for (int i = 0; i < length; i++)
{
float temp = in[i];
out[i] = 0;

for (int j = 0; j < INPUT_NODES; j++) // or layers?
{
float weight = *((weights+i* INPUT_NODES) + j);
out[i] += (temp * weight);
printf("in[%d] = %f * %f\n", i, temp, weight);
}

printf("feed[%d] = %f\n", i, out[i]);
out[i] = transfer_function(out[i]);
printf("activate: %f\n", out[i]);
}
}

//void activate_function(float* in,float* out, int length)
//{
// for (int i = 0; i < length; i++)
// {
// float var = (in[i] * -1);
// float e = exp(var);
// out[i] = 1 / (1 + e);
// printf("activate: %f\n", in[i]);
// }
//}

//void update_weights(float* delta_weights, float* weights, float* gradient, float* data, int length)
//{
// for (int i = 0; i < 6; i++)
// {

// }
//}

//float calculate_hidden_gradient( float* weight, float* gradient )
//{
// weight[]
// return (delta * transfer_derivative(out));
//}

//float calculate_gradient( float out, float target)
//{
//
// return (out*target); // I think 1 / e + x
//}

//void back_propagate(float* data, int length,float* weights, float expected_value, float* gradient, float* delta_weight)
//{
// float error = 0;
// // loop through and compute the rms not including the bias node
// for (int i = 0; i < length; i++)
// {
// // get the delta between what we predicted vs expected
// float delta = data[i] - expected_value;

// error = delta * delta;
// }
// error /= length;
// error = std::sqrt(error);

// // calculate gradient on input layer?
// // only have one output layer node thing
// for (int n = 4; n < 6; n++)
// {
// // calculate gradient of the layer
// gradient[n] = calculate_gradient( weights[n], error );
// }

// //calculate gradient on hidden layer?

// for (int n = 0; n < 4; n+=2)
// {
// // calculate hidden layer
// calculate_hidden_gradient(weights[n], gradient[6-1-n], data[n]);
// }

// // update the weights
// update_weights();

//}

/**
* Example of use case (follow how you did it in stream compaction)
*/
/*void scan(int n, int *odata, const int *idata) {
void train(int n, int *data, int expected) {
timer().startGpuTimer();
// TODO

float weights[6];
float in_data[4] = { 0,0 };
float out_data[4];
float temp[4];
weights[0] = 10.1;
weights[1] = .9;
weights[2] = 20;
weights[3] = .87;
weights[4] = 41;
weights[5] = -54;
int wt_idx = 0;

float* dev_buff;
float* dev_wts;

float* host_buff = new float[4];

int rounded_depth = ilog2ceil(NUM_LAYERS);
int rounded_elements = 1 << rounded_depth;
printf("rounded elements %d\n ", rounded_elements);
dim3 fullBlocksPerGrid((rounded_elements + blockSize - 1) / blockSize);



cudaMalloc((void**)&dev_buff, 2 * sizeof(float));
checkCUDAErrorFn("malloc dev_boolbuff in failed!");
cudaMemcpy(dev_buff, in_data, 2 * sizeof(float), cudaMemcpyHostToDevice);
checkCUDAErrorFn("dev_in copy failed!");

/*cudaMalloc((void**)&dev_out, n * sizeof(int));
checkCUDAErrorFn("malloc dev_out in failed!");
cudaMalloc((void**)&dev_in, rounded_elements * sizeof(int));
checkCUDAErrorFn("malloc dev_in in failed!");*/


// is there a way to place this in memory at compile time?
cudaMalloc((void**)&dev_wts, 6 * sizeof(float));
checkCUDAErrorFn("malloc dev_in in failed!");
cudaMemcpy(dev_wts, weights, 6 * sizeof(float), cudaMemcpyHostToDevice);
checkCUDAErrorFn("dev_weights copy failed!");

// input nodes
// each weight has an impact on the other nodes
for (int i = NUM_LAYERS; i > 0; i--)
{
//feed_forward(&in_data[0], &temp[0], (float*)&weights[wt_idx], i, INPUT_NODES);
//activate_function(&temp[0], &out_data[0], i );
kernel_feed_forward << < fullBlocksPerGrid, blockSize >> > (i, dev_buff, &dev_wts[wt_idx]);
checkCUDAErrorFn("feed forward failed!");
memory_debug_float(i, &dev_buff[0], &host_buff[0]);
kernel_activate << < fullBlocksPerGrid, blockSize >> > (i, dev_buff);
checkCUDAErrorFn("activate failed!");
memory_debug_float(i, dev_buff, host_buff);
//feed_forward(&out_data[0], &temp[0], (float*)&weights[wt_idx][0], 1,2);
//activate_function(&temp[0], &out_data[0], 1);
//std::swap(in_data, out_data);
wt_idx += 4; // length of array? NUM_NODES* INPUT NODES?
}

//error = out_data[0]
timer().endGpuTimer();
}
*/

void train_cpu(int n, float *data, float expected)
{

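// Hard-coded example weights: four feeding the hidden layer, two feeding the single output node.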
float weights_layer1[4];
float weights_layer2[2];

weights_layer1[0] = 10.1;
weights_layer1[1] = .9;
weights_layer1[2] = 20;
weights_layer1[3] = .87;
weights_layer2[0] = 41;
weights_layer2[1] = -54;
int wt_idx = 0;

float* hidden_layer = new float[n];
float* out_put_layer = new float[n];
float* temp[NUM_LAYERS + 1]; // pointer to arrays
temp[0] = data;
temp[1] = hidden_layer;
temp[2] = out_put_layer;

float* temp_weights[NUM_LAYERS]; // pointer to arrays
temp_weights[0] = weights_layer1;
temp_weights[1] = weights_layer2;


for (int i = 0; i < NUM_LAYERS; i++)
{
feed_forward(temp[i],temp[i+1], temp_weights[i], NUM_LAYERS-i);
}


delete[] hidden_layer;
delete[] out_put_layer;
//back_propagate(in_data,NUM_LAYERS+1,&weights[0],exp,gradient


}
// TODO: implement required elements for MLP sections 1 and 2 here
}
2 changes: 2 additions & 0 deletions Project2-Character-Recognition/character_recognition/mlp.h
@@ -6,4 +6,6 @@ namespace CharacterRecognition {
Common::PerformanceTimer& timer();

// TODO: implement required elements for MLP sections 1 and 2 here
void train(int n, int *odata, const int *idata);
void train_cpu(int n, float *data, const float expected);
}
Binary file added Project2-Character-Recognition/img/n1.jpg
Binary file added Project2-Character-Recognition/img/nn.jpg
Binary file added Project2-Character-Recognition/img/sawzall.gif
Binary file added Project2-Character-Recognition/img/ss.jpg
Binary file added Project2-Character-Recognition/img/yak.jpg