A central page to summarise some of my projects, with links to repositories, code and publications.

1. Exploratory data analysis of crowdfunding data from Kickstarter

GitHub repository here.
RPubs publication here.
Dataset here.

This project cleans, transforms and visualises a crowdfunding dataset, then reports on the observations.

The chosen dataset is titled "Funding Successful Projects on Kickstarter" and can be found on Kaggle here, uploaded by user Lathwal.

The dataset was released by Kickstarter, a crowdfunding company that connects community investors with start-up projects in an 'all-or-nothing' fashion: the creator sets a funding goal for their project, and if pledges fall short by even $1, the project receives no funding.

The data was initially released from the company's perspective, in that Kickstarter had an interest in predicting the success of a project. It also contains information that potential creators may find useful.

2. Predicting Kickstarter campaign success

GitHub repository here.
RPubs publication here.

This project aims to address the initial business objective set by Kickstarter: to help predict whether a project will be successfully funded. Various classification and clustering methods will be used to achieve this.

3. Numeric analysis of CSV file

GitHub repository (with source code and sample CSV file) can be found here.

A Python tool, written without importing any libraries, that processes a given CSV file, computes statistical information and outputs it. The user calls main(csvfile, year, type), passing the file, the year of interest, and whether they want general statistics (type = 'stats') or correlations (type = 'corr').
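To illustrate the calling convention, here is a minimal sketch of what such a tool might look like. It is not the repository's code: the CSV layout (a header row, a `year` column, every other column numeric, no quoted commas), the helper names `read_rows`, `mean` and `pearson`, and the shape of the returned values are all assumptions made for the example.

```python
# Sketch only: the 'year' column, the all-numeric remaining columns and the
# returned dictionaries are assumptions, not the actual tool's behaviour.

def read_rows(csvfile, year):
    """Return (numeric column names, rows as dicts of floats) for one year."""
    with open(csvfile) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    header = lines[0].split(",")
    columns = [name for name in header if name != "year"]
    rows = []
    for line in lines[1:]:
        record = dict(zip(header, line.split(",")))
        if record["year"] == str(year):
            rows.append({c: float(record[c]) for c in columns})
    return columns, rows


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant column; 0.0 is a placeholder
    return cov / (var_x * var_y) ** 0.5


def main(csvfile, year, type):
    columns, rows = read_rows(csvfile, year)
    if not rows:
        return {}
    data = {c: [row[c] for row in rows] for c in columns}
    if type == "stats":
        return {c: {"min": min(v), "max": max(v), "mean": mean(v)}
                for c, v in data.items()}
    if type == "corr":
        # One entry per unordered pair of numeric columns.
        return {(a, b): pearson(data[a], data[b])
                for i, a in enumerate(columns) for b in columns[i + 1:]}
    raise ValueError("type must be 'stats' or 'corr'")
```

With a file laid out this way, `main("sample.csv", 2020, "stats")` would return per-column summaries and `main("sample.csv", 2020, "corr")` would return pairwise correlations; `sample.csv` is a placeholder name.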

4. Sentiment analysis of web pages

GitHub repository (with source code) can be found here.

A program that processes WARC (web archive) files and extracts information from the HTML data within. It analyses the text to gauge public sentiment towards government in several countries. The user supplies three arguments: the WARC file to be processed, a text file of words defined as 'positive', and a text file of words defined as 'negative'. The output is four lists: the first three contain floats of statistics on the counts of positive and negative words, and the last contains the five most frequently occurring domain names in the file, each with its count.
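The word-counting and domain-ranking steps could be sketched roughly as follows. This is an illustration rather than the repository's implementation: it skips WARC parsing and assumes the pages have already been reduced to (URL, plain text) pairs. The function name `analyse` and the three statistics it returns (share of positive words, share of negative words, positive-to-negative ratio) are hypothetical choices, since the exact statistics are not listed here.

```python
import re
from collections import Counter
from urllib.parse import urlparse


def analyse(pages, positive_words, negative_words):
    """Count sentiment words and rank domains.

    pages: iterable of (url, plain_text) pairs, assumed to be already
    extracted from the WARC file's HTML records.
    """
    positive, negative = set(positive_words), set(negative_words)
    pos_count = neg_count = total = 0
    domains = Counter()
    for url, text in pages:
        domains[urlparse(url).netloc] += 1
        # Lower-case the text and split it into simple alphabetic tokens.
        for word in re.findall(r"[a-z']+", text.lower()):
            total += 1
            if word in positive:
                pos_count += 1
            elif word in negative:
                neg_count += 1
    # Hypothetical statistics; guard against division by zero.
    pos_share = pos_count / total if total else 0.0
    neg_share = neg_count / total if total else 0.0
    ratio = pos_count / neg_count if neg_count else 0.0
    return pos_share, neg_share, ratio, domains.most_common(5)
```

Treating the word lists as sets keeps each lookup constant-time, and `Counter.most_common(5)` returns the top five domains together with their counts.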

davidika/David_Ika

Folders and files

Latest commit

History

Repository files navigation

A central page to summarise some of my projects, with links to repositories, code and publications.

1. Exploratory data analysis of crowdfunding data from Kickstarter

2. Predicting Kickstarter campaign success

3. Numeric analysis of CSV file

4. Sentiment analysis of web pages

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages