U-Alberta/wikipedia_controversy_dataset

wikipedia_controversy_dataset

This repository hosts the dataset used in the studies of controversy in Wikipedia reported in the two papers cited below.

In total the dataset comprises 3 folders and 61 files, totalling 152 GiB uncompressed. The archive was created with the 7zip tool on Linux and is split into multiple parts because of the file size restrictions in git.

To access the files, see https://github.com/U-Alberta/wikipedia_controversy_dataset/releases/tag/v1.
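
Because the archive is multipart, every part must be downloaded before extraction. The sketch below is one possible way to check a download directory and build the extraction command. It assumes the parts follow 7-Zip's usual volume naming (`<name>.7z.001`, `<name>.7z.002`, ...) and that the `7z` command is on the PATH; the actual file names in the release may differ, and the helper names `missing_parts`, `has_room`, and `extract_command` are hypothetical.

```python
import re
import shutil
import subprocess
import sys
from pathlib import Path

# Assumed naming scheme for the parts: <name>.7z.001, <name>.7z.002, ...
PART_RE = re.compile(r"^(?P<stem>.+\.7z)\.(?P<num>\d{3})$")

# Uncompressed size stated in this README (152 GiB), in bytes.
UNCOMPRESSED_BYTES = 152 * 2**30


def missing_parts(filenames):
    """Return the part numbers absent between 1 and the highest part seen."""
    numbers = set()
    for name in filenames:
        match = PART_RE.match(name)
        if match:
            numbers.add(int(match.group("num")))
    if not numbers:
        return []
    return [n for n in range(1, max(numbers) + 1) if n not in numbers]


def has_room(target_dir, needed=UNCOMPRESSED_BYTES):
    """Check whether target_dir's filesystem has enough free space."""
    return shutil.disk_usage(target_dir).free >= needed


def extract_command(first_part, out_dir):
    """Build the 7z call; 7z reads the remaining volumes from the first one."""
    return ["7z", "x", str(first_part), f"-o{out_dir}"]


if __name__ == "__main__":
    # Usage: python extract.py <download_dir> <output_dir>
    download_dir, out_dir = Path(sys.argv[1]), Path(sys.argv[2])
    names = sorted(p.name for p in download_dir.iterdir())
    gaps = missing_parts(names)
    if gaps:
        sys.exit(f"Missing archive parts: {gaps}")
    firsts = [n for n in names if n.endswith(".7z.001")]
    if not firsts:
        sys.exit("No file ending in .7z.001 found")
    out_dir.mkdir(parents=True, exist_ok=True)
    if not has_room(out_dir):
        sys.exit("Less than 152 GiB free in the output location")
    subprocess.run(extract_command(download_dir / firsts[0], out_dir), check=True)
```

The gap check can only detect parts missing below the highest-numbered part present, so it is worth comparing the file count against the release page as well.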

Please acknowledge the source of the dataset by citing either paper below.

@article{DBLP:journals/tist/RadB15,
  author    = {Hoda Sepehri Rad and
               Denilson Barbosa},
  title     = {Identifying Controversial Wikipedia Articles Using Editor Collaboration
               Networks},
  journal   = {{ACM} Trans. Intell. Syst. Technol.},
  volume    = {6},
  number    = {1},
  pages     = {5:1--5:24},
  year      = {2015}
}
@inproceedings{DBLP:conf/ht/RadMRB12,
  author    = {Hoda Sepehri Rad and
               Aibek Makazhanov and
               Davood Rafiei and
               Denilson Barbosa},
  title     = {Leveraging editor collaboration patterns in wikipedia},
  booktitle = {{HT}},
  pages     = {13--22},
  publisher = {{ACM}},
  year      = {2012}
}