The enclosed documents include datasets comprising the results of the 2024 GitHub Open Source Survey (survey_data.csv, negative_incidents.csv), the full text of the English questionnaire (questionnaire.md), and notes for working with the data (notes.md). Free-text responses have been removed to protect respondent privacy.
Because negative incidents in the open source community are highly visible, the combination of detailed incident information, demographic data, and contribution practices may make respondents identifiable. To maintain respondent anonymity, responses to harassment and related questions have been unlinked from the rest of the data, and their order has been randomized.
Researchers who need to link responses on the harassment questions to other variables in the dataset may contact us with a detailed explanation of their research needs, their plans for securing the data, and, if applicable, IRB approval.
The data here covers two distinct samples.
2017 GitHub.com sample: Between March 21 and 31, 2017, a small percentage of eligible visitors to licensed open source repositories on GitHub.com were invited to take the survey through a dialog box that linked to an off-site survey site. Eligibility was determined based on activity indicating sincere interest in open source projects (visits to 3 distinct projects or 3 clicks in a single project in 30 minutes). Invitations persisted across 3 subsequent page views or until dismissal. The introductory text on the survey landing page informed respondents that anonymous results would be publicly released as an open data set and that all questions were optional. It also provided instructions for accessing translated versions of the survey (available in Traditional Chinese, Japanese, Spanish, and Russian).
2024 GitHub.com sample: The same eligibility criteria as in 2017 applied, but the survey was open from August 12 to 13, 2024.
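The eligibility rule described for the 2017 sample (visits to 3 distinct projects or 3 clicks in a single project in 30 minutes) can be illustrated with a short sketch. This is not the code used to run the survey. The function name is_eligible and the (timestamp, repo) event format are hypothetical, and the sketch assumes that the 30-minute window applies to both conditions and that page visits and clicks can be treated as the same kind of event.

```python
from collections import Counter
from datetime import datetime, timedelta

# Window length taken from the eligibility description.
WINDOW = timedelta(minutes=30)


def is_eligible(events):
    """Return True if the activity meets the (assumed) eligibility rule.

    events: iterable of (timestamp, repo) pairs, one per visit or click.
    This event format is an illustrative assumption, not part of the dataset.
    """
    events = sorted(events)
    start = 0            # index of the oldest event still inside the window
    counts = Counter()   # events per repository inside the window
    for ts, repo in events:
        counts[repo] += 1
        # Drop events more than 30 minutes older than the current one.
        while ts - events[start][0] > WINDOW:
            old_repo = events[start][1]
            counts[old_repo] -= 1
            if counts[old_repo] == 0:
                del counts[old_repo]
            start += 1
        # 3 distinct projects, or 3 events in a single project.
        if len(counts) >= 3 or max(counts.values()) >= 3:
            return True
    return False
```

For example, three events in three different repositories within ten minutes would qualify under these assumptions, while three repositories visited across 45 minutes, with no 30-minute span containing all three, would not.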
Use of this data is licensed under the CC0 1.0 Universal License and governed by GitHub's Terms of Service (https://help.github.com/articles/github-terms-of-service/). Please note that while this data is public, our respondents have not waived their privacy rights. In particular, do not attempt to reidentify survey participants. Please contact us at https://github.com/contact with questions or concerns.
If you use this dataset in a publication, a link or citation would be appreciated. If you extend this dataset, we hope you'll share your additions as open data.