Skip to content

Autorship attribution using comments from brazillian subreddits

License

Notifications You must be signed in to change notification settings

matiasvinicius/reddit-in-portuguese

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

85 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Binary autorship attribution on Reddit with comments in Portuguese

[Research Paper]

Internet interaction environments such as social networks transfer large-scale textual data that implicitly carry the writing styles of each network user. Given the constant and intense flow of information through information systems of this type, it is necessary to develop techniques that can distinguish a text between two candidate authors for reasons of, for example, avoiding the return of users banned from the platform. This repository addressed and evaluated different ways of performing authorship attribution through natural language processing and machine learning, based on comments in Portuguese extracted from the social network Reddit.

Authorship attribution is the task of recognizing the author who wrote a text through his writing style, and to perform attribution in an automated way, natural language processing and machine learning techniques are used. The application of such methods is varied, and so is the way of approaching problems.

AUC

This repository contains the codes for the problem of binary authorship attribution, being part of the final paper that I (Vinicius Alves Matias) carried out to obtain a bachelor's degree in Information Systems, supervised by Dr. Luciano Antonio Digiampietri.

About

Autorship attribution using comments from brazillian subreddits

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published