[Abstract] This repository contains instructional materials designed to aid educators in teaching and training psychology and social science research methods students about the phenomenon of online hate speech and the role of computer-aided text analysis in studying it.
--------------------------
Many choices exist for the appropriate unit of analysis in psychological studies of hate speech (e.g., individuals who post, individuals who are targeted, etc.) but in its most essential form, online hate speech exists in the form of databases of text. Psychology students usually take course work on topics like experimental design and inferential statistics, but training on computer-aided analysis of text is less common. Thus, for students to understand the variable nature of online hate speech, instructional support for basic forms of text analysis will enable psychology faculty members to include real data and analytical examples in their course offerings. Thus, this repository contains the following materials:
A brief literature review of psychological research on hate speech including text analysis methods, benfits and ethics. This reading is suitable as an orientation for instructors and/or as an introductory reading for students.
An annotated bibliography of open-source scholarship on hate speech and text analysis. Selected readings vary and include: theoretical readings on the psychological effects and motivations around hate speech, case studies of text analysis methods and software, and [TBD]. These may be useful to instructors for selecting assigned readings for before or after in-class instruction. Also included is a selection of URLS to hate speech lexicons with a brief description of their scope. These can be useful for research, examples, or in-class activities.
22 slides, including bibliography. This slide deck is intended to serve as a brief introduction to the phenomenon of online hate speech with special attention to the potential of machine-learning and text analysis methods to psychological research. Presenter notes are included. There are four sections to this presentation: Understanding hate speech, Psychological effects of hate speech, Datasets, and Machine-learning models. One activity is included which references Assignment_Coded Language. The online database Hatebase is used as part of this activity and it is recommended that students have access to a laptop with connection to the internet to complete the activity.
28 slides, including bibliography and alternative activity. This slide deck is intended to provide a more in-depth examination of text analysis of online hate speech. It describes keyword identification, cross-reference of datasets, several case studies of relevant programs and datasets, methods for gathering data, research ethics pertaining to online content, and compassion fatigue among researchers and participants. Two activities are included which reference: Assignment_Coded Language or Assignment_Online Observation. The Github repository Hurtlex is used as part of these activities and it recommended that students have access to a laptop with connection to the internet to complete the activities.
This assignment directs students to explore and utilize the online database Hatebase in order to analyze an example of online cotent containing covert hate speech (de-identified examples are provided). This exercise allows students to become familiar with the use of online hate speech lexicons and to consider multiple aspects of hate speech text analysis - including poster intent, in-group/out-group identities, and the possible challenges of moderation or analysis. This is recommended as the first assignment.
This assignment give students the opportunity to study the real world posts of individuals who circulate hate speech through passive, online observation. Several links and options are provided for suitable online spaces for this task. It is recommended that students complete Assignment_Coded Language or have other opportunities to learn how to use Hatebase before completing this assignment.
This assignment provides several examples of hate symbols (visualized hate speech) for analysis. Students are directed to utilize the ADL Glossary of Extremism and/or the ADL Hate Symbols Database to gain context about their selected visual. In completing this exercise students become familiar with two new lexicons, and continue to practice analyzing the psychological intent and motivations behind posting hateful content.
This folder contains several Jupyter notebooks which demonstrate the use of Python to explore a dataset containing offensive speech along with other attributes. Learning goals include:
• Organize the data in Pandas dataframe, a key data structure of Python
• Provide some descriptive statistics about the dataset
• Prepare the selection containing offensive speeches for further texual analysis by removing stopwords, lowercasing, stemming, etc.
• Use CountVectorizer module to convert the collection of speeches (or, comments/tweets/documents, etc) into a matrix containing term or word frequencies
• Calculate TF-IDF score of each term or word to determine the importance of that term in a specific selection