dataobservatory-eu/new-contributors

New Contributors

Welcome to our dataobservatory.eu R, hugo, and open data ecosystem. We are very happy to guide you to the experience of open source development and open knowledge management regardless of your experience level with R or Github. We kindly ask you to take the Contributor Covenant Pledge before starting our collaboration.

“We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.”

Please read the entire covenant here.

  • Name, affiliation, education details, one-line and short biography. Please send back this bio_template.txt text file with your details or, if you know markdown, use this version. The files are identical, but your word processor may not know how to open an .md file.
  • Your ORCiD to resolve ambiguity with similarly named people. If you use other library or publication service IDs, such as Google Scholar or Publons, you may provide them too, but we do need an ORCiD ID, because most of the EU open science infrastructure and the R ecosystem use this one. If you do not have it, please create one; it only takes a few minutes. Please add it to the bio_template.txt.
  • Your LinkedIn ID. Please add it to the bio_template.txt.
  • Your GitHub account name. If you do not have one, please create one. As a data curator, you may not need it, but if you contribute to our R&D or publication efforts, you will need it.
  • Your Keybase account name. If you do not have one, please create one if you want to chat with us, or exchange calls and data with us, in a discreet, free, open-source and secure environment. Keybase is an open-source substitute for Slack. It is owned by Zoom and it can start Zoom or Google Meet calls.
  • Twitter account name, provided that you use Twitter for professional uses.
  • We are seeking an affiliation with mastodon.green, which will be a climate positive, decentralized and ethical alternative to other social media. More details soon.
  • Facebook account name, provided that you use Facebook for professional uses.
  • Any other social media that you use strictly professionally.
  • You should follow our file naming conventions, and avoid special characters in any file name at all times: $ (dollar sign), : (colon), ; (semicolon), , (comma), . (period), " (double quote), ' (tick) or ` (backtick).
  • Please send us one professional portrait of at least 400x400 pixels.

Data curators

Data curators do not need to be knowledgeable about data science or programming. They should have strong domain-specific knowledge and an interest in empirical data collection and data quality in their professional or research areas.

We will rely on your expertise as a data curator to publish or release new data. Therefore, we need a filled-in curator biography template with the following information about you.

You should get familiar with the following concepts. We will describe them in blogposts.

  • FAIR Principles: improve the Findability, Accessibility, Interoperability, and Reuse of digital assets.
  • DataCite: A persistent, standardized approach to access, identification, sharing, and re-use of datasets. This is our favored way of describing data for future use according to the FAIR principles. Many EU open science repositories will ask you to submit your publications with this documentation.
  • Biblatex is a standard text-file format used by citation engines, bibliography management tools, and scientific publication templates. (See, for example, the Overleaf Biblatex tutorial.)
  • Dublin Core is an older international standard than DataCite, but the two standards greatly overlap. Dublin Core was originally developed by libraries. You may often need to fill out Dublin Core properties for publication.
  • You should follow our file naming conventions, and avoid special characters in any file name at all times: $ (dollar sign), : (colon), ; (semicolon), , (comma), . (period), " (double quote), ' (tick) or ` (backtick).

If you want to contribute to, or co-author in, our publication activities

For co-authorship, you should be familiar with tools that help the asynchronous co-writing of papers. We use GitHub mainly for this purpose.

Additionally, we need this from you:

  • Your Github account name. Still not a must, but eventually it is in your interest to be able to work with Git.
  • Familiarity with the TeX format for scientific publishing, and with exporting your citations to Biblatex format.
  • Share citations with us with Zotero, an open-source bibliography management tool that integrates well with browsers. Share your Zotero account name with us.
  • You should follow our file naming conventions, and avoid special characters in any file name at all times: $ (dollar sign), : (colon), ; (semicolon), , (comma), . (period), " (double quote), ' (tick) or ` (backtick).

We will onboard you gently: if you are not familiar with GitHub, you can start collaborating with us in Google Docs.

Co-creation with Github

As an author of an article, paper, or software, you will sooner or later work with GitHub. All our main source files (both documents and software) are stored and shared in GitHub repositories. GitHub repositories (repos) are folders that can be synchronized with many collaborators, who can work on tasks in parallel without overwriting each other’s work.

  • Your Github account name
  • For Windows users, it is recommended to install Github Desktop.

R software testing, documentation or development

Get your computer ready for co-working:

  • You will need the Java programming environment on your computer. It is cross-platform and facilitates collaboration across Linux, BSD/Mac OS X and Windows. You most probably have it already, so this is a good opportunity to check that you have the latest version. If not, do upgrade, both for security and functionality reasons, and at the same time remove the old versions. Follow this link: https://www.java.com/en/download/.

  • If you recently installed R, you most likely have the latest version. If not, then run install.packages("installr") and run installr::check.for.updates.R(). If there is a newer R release, you should upgrade. installr::updateR() will take you through the process, including the moving of your already installed packages to the new R installation; however, it will not remove the old R environment. You should run the upgrader from the R GUI (you will find this somewhere on your computer, even though you may have forgotten about it because you always use R from RStudio).

  • The copying of the old R packages is not always successful. You can prepare for this by saving the list of installed packages before you upgrade. I do not mind reinstalling my packages, though. It reminds me to remove detritus, and review my own developments.

  • One package that is worth running at all new installs is tinytex: run tinytex::install_tinytex() or tinytex::reinstall_tinytex(). TinyTeX is a lightweight TeX distribution, and it lets you install many TeX packages from CTAN, such as fonts, formatting tools for TeX, and so on. This is required for an efficient creation of PDF files, in package documentation or elsewhere.

  • Now, when you have the latest version of R, install Rtools, too. https://cran.r-project.org/bin/windows/Rtools/

  • Now install RStudio, or, if you already have it, check if you have the latest version. (Help Menu, Check for Updates.)

  • Install the usethis and devtools packages with all their dependencies. You should run install.packages("devtools") and see if all dependencies install without error. If not, you must figure out why some components are not installing.
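
The advice above about saving the list of installed packages before an R upgrade can be sketched in base R. This is a minimal sketch, not the project's prescribed method: the helper names save_package_list() and missing_packages() are hypothetical, and the backup location is an assumption you should replace with a persistent path of your own.

```r
# Hypothetical helper: write the names of all installed packages to `path`.
save_package_list <- function(path) {
  pkgs <- rownames(installed.packages())
  saveRDS(pkgs, path)
  invisible(pkgs)
}

# Hypothetical helper: names saved in `path` that the current R installation lacks.
missing_packages <- function(path) {
  setdiff(readRDS(path), rownames(installed.packages()))
}

# Demo only: a tempfile() disappears with the session, so a real backup
# should point to a persistent location of your choice.
backup_file <- tempfile(fileext = ".rds")
save_package_list(backup_file)

# After the upgrade, list what is missing and reinstall it if you wish.
to_reinstall <- missing_packages(backup_file)
# install.packages(to_reinstall)
```

Reviewing to_reinstall before calling install.packages() gives you the chance to drop packages you no longer use.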

Connect to the Github collaboration platform via RStudio

RStudio is one of the best integrated development environments in the world. It facilitates cross-language development: you can simultaneously work on R, Python, C++ (Rcpp), SQL, D3, Stan code and text, and even make them work together.

  • You must connect your RStudio to your Github account. If you already have a Github account, but you have not used it recently, or did not connect RStudio to it lately, you are likely to have to do it again.

  • GitHub has not supported password authentication for Git operations since August 13, 2021. This means that you cannot synchronize your offline and online repository using your username and password combination.

  • You can no longer synchronize the repository on RStudio with a repository URL only. For example, to synchronize https://github.com/rOpenGov/retroharmonize, you must explicitly state on your computer to synchronize via this URI: git@github.com:rOpenGov/retroharmonize.git, which will require the use of a

  • Happy Git and GitHub for the useR guides you through the process on Linux, Mac OS or Windows platforms. Put this into practice at the end of this document.
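
The relationship between the browser URL of a repository and its SSH-style URI, as described above, can be illustrated with a small base R helper. The function name https_to_ssh() is hypothetical and the sketch assumes a repository hosted on github.com; it only rewrites text and does not contact GitHub.

```r
# Hypothetical helper: turn a GitHub browser URL into the SSH-style URI.
https_to_ssh <- function(url) {
  path <- sub("^https://github\\.com/", "", url)  # keep "owner/repo"
  path <- sub("/$", "", path)                      # tolerate a trailing slash
  path <- sub("\\.git$", "", path)                 # tolerate an existing ".git"
  paste0("git@github.com:", path, ".git")
}

https_to_ssh("https://github.com/rOpenGov/retroharmonize")
# "git@github.com:rOpenGov/retroharmonize.git"
```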

Microskills to pick up or improve:

  • You must be able to raise an issue via GitHub. An issue can be a bug report, a suggestion to change how the code works, or a suggestion to add to, improve, or change the documentation.

  • You must be able to read a response to an issue, and accept a solution offered by somebody.

  • You must be able to read our issue/task cards on our kanban-style GitHub Project management tool.

  • You should be able to write, move, solve cards in the Github Project.

  • Learn how to improve our software documentation in Rmd and R files.

  • You should learn to write a so-called reprex to correctly report a bug.

  • You must use file.path or here from the here package to build file paths that are independent of the computer and the operating system.

  • You should follow our file naming conventions, and avoid special characters in any file name at all times: $ (dollar sign), : (colon), ; (semicolon), , (comma), . (period), " (double quote), ' (tick) or ` (backtick).

  • Use goodpractice to improve the code quality and readability.

  • Most of our packages depend on various components of the tidyverse, such as dplyr, tidyr, and purrr.

  • These packages depend among others on rlang for the .data pronoun and magrittr for the pipe operator.

  • When using non-standard evaluation, follow the modern evaluation practice of the Tidyverse and rlang: avoid the old . pronoun and use the more precise .data pronoun with the .data$foo reference style. Instead of select(df, geo) use select(df, .data$geo). (Recent tidyselect releases deprecate .data inside select(); there, select(df, "geo") is the equivalent.)
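
Two of the microskills above, portable file paths and safe file names, can be sketched in base R. The helper is_safe_file_name() is hypothetical, and it assumes that the single dot before the file extension is acceptable while any other listed character is not; adjust it to the actual naming conventions.

```r
# file.path() joins its parts with "/", which R accepts on
# Windows, macOS and Linux alike.
img <- file.path("png", "fork_this_repo.png")

# Hypothetical helper: TRUE when the file name, without its extension,
# contains none of the characters the naming conventions ask you to avoid.
is_safe_file_name <- function(x) {
  stem <- tools::file_path_sans_ext(basename(x))
  !grepl("[$:;,.\"'`]", stem)
}

is_safe_file_name(img)              # TRUE
is_safe_file_name("my$data.csv")    # FALSE, contains a dollar sign
is_safe_file_name("report.v2.csv")  # FALSE, extra dot in the name
```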

Handshake

  • Run into a problem? We use the open-source and encrypted, privacy-sensitive competitor of Slack, Keybase. You can ask for help in our Keybase Community.

  • Fork any of our repositories that you would like to contribute to into your own GitHub profile, for example, https://github.com/rOpenGov/retroharmonize/ to yourusername/retroharmonize.

  • Send a pull request when you have something to commit to our work.

Put this into practice

  1. Star this repo: dataobservatory-eu/new-contributors

Thanks! It is similar to giving us a like or a 🧡 on social media.

  2. Fork this repo into your own space on GitHub, i.e. create a copy that you can modify or download to your computer.
knitr::include_graphics(file.path("png", "fork_this_repo.png"))

After pressing fork, you can make a copy to https://github.com/<your-github-id>/new-contributors. This is your copy, and if you have followed the instructions, you can download it to your computer and edit the document in RStudio or any text editor.

  3. Synchronize with RStudio. Navigate to File Menu -> New Project -> Version control -> Git, and you will end up with this dialog box.

If you have followed Happy Git and GitHub for the useR, you have built up a secure authentication workflow that will work a bit differently on Windows than on Linux, because of deeper differences among the operating systems.

  • On Windows, you paste the https:// protocol URL of your GitHub fork, i.e. https://github.com/<your-github-id>/new-contributors instead of https://github.com/dataobservatory-eu/new-contributors shown below.
knitr::include_graphics(file.path("png", "synchronize_with_rstudio.png"))

knitr::include_graphics(file.path("png", "synchronize_with_r.png"))

Whichever URL you copy into RStudio, you will be able to download the repository contents with the Pull button (blue arrow down).

knitr::include_graphics(file.path("png", "pull_push_with_rstudio.png"))

Once you have all the files present, add an emoji or a sentence to the end of the README.Rmd file, and tick the Commit checkbox near the name of the file. [When there are Rmd and md files present, always edit the Rmd, which will generate the md but not the other way around.]

If you press the Push button (green arrow up), things should upload to https://github.com/<your-github-id>/new-contributors without asking for your GitHub username and password. Why? Because you can download, at least from public repositories, without authentication anytime. You can even download a repo as a .zip file in your browser. However, since 2021 GitHub no longer allows writing into a repository with password authentication, only with a personal access token (PAT) or via the far more secure SSH. If you followed Happy Git and GitHub for the useR, your computer, including RStudio, should be able to download (pull) and upload (push) files with these credentials and not with a password. The exact implementation of the authentication is slightly different on Windows, Mac/BSD, and Linux/Unix systems.

If you are still being asked for a password, then you are out of luck. You can type in your GitHub password, but you will get a message that GitHub no longer accepts "pushing" files with password authentication. In this case you must troubleshoot why RStudio is not aware of your PAT or SSH key.
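
A first step in that troubleshooting can be sketched in base R. The helper credential_report() is hypothetical and deliberately incomplete: it assumes that a token, if any, is exposed through the GITHUB_PAT environment variable and that SSH public keys live in the .ssh folder of your home directory. A token kept in the Git credential store will not show up here; usethis::git_sitrep() prints a fuller report when usethis is installed.

```r
# Hypothetical helper: report which credentials this R session can see.
credential_report <- function(ssh_dir = path.expand(file.path("~", ".ssh"))) {
  list(
    # Is a git executable on the search path?
    git_found   = unname(nzchar(Sys.which("git"))),
    # Is a token exposed via the GITHUB_PAT environment variable? (assumption)
    pat_in_env  = nzchar(Sys.getenv("GITHUB_PAT", unset = "")),
    # Public key files found in the SSH folder (empty if none or no folder)
    public_keys = list.files(ssh_dir, pattern = "\\.pub$")
  )
}

report <- credential_report()
str(report)
```

If both pat_in_env is FALSE and public_keys is empty, revisit the credential chapters of Happy Git and GitHub for the useR before trying to push again.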