DATASET DESCRIPTION:

all_projects.csv contains the list of projects for each programming language. Only the Python-specific projects were selected.
manual_annotated_pr.json contains all manually labeled pull requests along with the corresponding interaction types between the generator and the commenters. This file has an array of JSON objects with the following format: {project name: {PR ID: {commenter1: score}}}
result_trust_values.csv consists of resulting trust values (belief, disbelief, uncertainty) with corresponding PR status (accepted, rejected).
PR Requests.zip contains the GitHub API responses for all pull requests of 179 projects. Each response is stored as PR number: {response}. For the response format, refer to https://developer.github.com/v3/pulls/
preprocomm.json.zip contains all the preprocessed comments. The file has an array of JSON objects with the following format: {projectname: {pr_id: [generator, creation date, {commenter: comment}]}}
manual_annotation.csv contains the manually annotated score for the generator from each commenter's perspective. The corresponding preprocessed comment can be fetched as preprocomm[projectname][pr_id][2][commenter] from the preprocomm.json file.
accuracy_trust_metrics.csv contains the accuracy metrics (precision, recall, f1-score, tp, tn, fp, fn) obtained using the Trust model for 30 repetitions.
accuracy_pr_hist.csv contains the accuracy metrics (precision, recall, f1-score, tp, tn, fp, fn) obtained using the History model for 30 repetitions.
accuracy_trust_pr_hist.csv contains the accuracy metrics (precision, recall, f1-score, tp, tn, fp, fn) obtained using the Hybrid model for 30 repetitions.
MAE_classifier.json contains the mean absolute error (MAE) scores for the regression techniques described in Table 4. It has the following format: {Classifier_name: [MAE values for 30 repetitions]}
time_performance_classifier.json contains the MAE score for the time-based classifier.
time_performance_regression.json contains the MAE score for the time-based regression models.
repo_score.json contains the MAE score for each repository.
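
The lookup described for manual_annotation.csv can be sketched as follows. This is a minimal illustration, assuming preprocomm.json has been unzipped and that its top level behaves as a mapping keyed by project name (as the access pattern preprocomm[projectname][pr_id][2][commenter] suggests). The helper names and the sample record are hypothetical and not part of the repository.

```python
import json


def load_preprocomm(path):
    """Load the preprocessed comments from an unzipped preprocomm.json."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def lookup_comment(preprocomm, project, pr_id, commenter):
    """Return (generator, creation date, comment) for one commenter on one PR.

    Assumes the documented shape:
    {projectname: {pr_id: [generator, creation date, {commenter: comment}]}}
    """
    generator, created, comments = preprocomm[project][pr_id]
    return generator, created, comments[commenter]


# Made-up record in the documented shape, for illustration only.
sample = {
    "example/project": {
        "17": ["alice", "2018-03-01", {"bob": "looks good to me"}],
    }
}

result = lookup_comment(sample, "example/project", "17", "bob")
print(result)
```

With the real data, `load_preprocomm("Dataset/Generated/preprocomm.json")` would replace the in-memory sample, provided the file sits at that location.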

CODE DESCRIPTION:

The repository contains the following Python scripts:

train.py: Trains the model using a manually labeled dataset and stores the trained model at Dataset/Generated/finalized_model.sav.
predict.py: Predicts the interaction type for the remaining unlabeled comments. Unlabeled data should be placed in the Dataset/Generated directory under the name preprocomm.json. Each JSON object has the following format: {projectname: {pr_id: [generator, creation date, {commenter: comment}]}}
preprocess_record.py: Maps interaction types from the range 1 to 5 onto the range -1 to 1.
construct_graph.py: Constructs the CDN and assigns each edge its corresponding trust value.
trustpropagation.py: Propagates trust between any pair of developers.
PR_evaluation.py: Computes the trust value for the pull requests in the test set.
classifier.py: Computes and stores the accuracy metrics (precision, recall, f1-score, tp, tn, fp, fn) of the Decision Tree classifier for: (1) History Model, (2) Trust Model, (3) Hybrid Model.
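
Two of the steps above are simple enough to sketch. The first function rescales an interaction type from 1..5 to -1..1 with a linear map, which is an assumption (preprocess_record.py may use a different mapping). The second derives precision, recall, and f1-score from the tp, fp, and fn counts using the standard definitions. Function names are hypothetical.

```python
def rescale_score(score):
    """Linearly map an interaction type in 1..5 onto -1..1 (assumed mapping)."""
    if not 1 <= score <= 5:
        raise ValueError("interaction type must be between 1 and 5")
    return (score - 3) / 2


def accuracy_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from confusion counts.

    A ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


print([rescale_score(s) for s in range(1, 6)])
print(accuracy_metrics(tp=8, fp=2, fn=2))
```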

RESULT REPLICATION DESCRIPTION:

To generate Table 4, run train.py with the different regression techniques. For this, use preprocomm.json located in the Dataset/Generated directory.
To generate Table 5, use the network constructed using construct_graph.py and use trustpropagation.py.
To generate Figures 4, 5, and 6, use result_trust_values.txt located in the Dataset directory.
To generate the data for Table 6, use Pull-Requests.zip (divided into Pull-Requests-part1.zip, Pull-requests-part2.zip, and Pull-Requests-part3.zip) and result_trust_values.txt.
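
When comparing regression techniques, the per-repetition MAE values stored in MAE_classifier.json can be condensed into one row per technique. The sketch below assumes the documented format {Classifier_name: [MAE values for 30 repetitions]}. Reporting the mean and sample standard deviation is an assumption about what is useful, not a statement of how Table 4 was produced. The function name and sample values are made up.

```python
import json
from statistics import mean, stdev


def summarize_mae(scores):
    """Map each technique name to (mean MAE, sample standard deviation)."""
    return {name: (mean(values), stdev(values)) for name, values in scores.items()}


def load_scores(path):
    """Load a file in the MAE_classifier.json format."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Made-up values in the documented shape (three repetitions instead of 30).
sample_scores = {"TechniqueA": [1.0, 2.0, 3.0], "TechniqueB": [0.5, 0.5, 0.5]}

summary = summarize_mae(sample_scores)
for name, (avg, spread) in sorted(summary.items()):
    print(f"{name}: mean MAE = {avg:.3f}, std = {spread:.3f}")
```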

For any questions, please email [email protected]

hiteshsapkota/OSS-Trust
