ExeterDigitalHumanities/openrefine

OpenRefine

Data, notes and slide deck for use with the Digital Humanities Lab's course on OpenRefine

OpenRefine can be downloaded from: https://openrefine.org/download.html

The Library Carpentry lesson, which we will refer to throughout the workshop and which provides a more in-depth tutorial on OpenRefine, can be found at: https://librarycarpentry.org/lc-open-refine/

Note that the Library Carpentry lesson uses different example data, which is available from the lesson page linked above.

More detailed notes and 'gotchas' for installing OpenRefine are in InstallingOpenRefine.pdf; see also the 'Setup' section in the Library Carpentry notes.

The actual notes from the workshop can be found at: Text II - OpenRefine.pdf. These will be uploaded soon after the workshop each time it runs.

Charles Woolf Collection

This is an export from the CALM catalogue management system. It holds the metadata for the collection, which is also displayed on JSTOR at: https://www.jstor.org/site/university-of-exeter/woolf/

The full dataset, exported directly from the catalogue, has a number of features that can be cleaned in bulk, and also some that are more problematic to deal with. It is typical of the kind of data that OpenRefine can help to standardise and extract meaning from.
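
As a rough illustration of what cleaning values in bulk can look like, the sketch below groups variant spellings under a shared key. It is loosely modelled on the 'fingerprint' key-collision idea behind OpenRefine's clustering feature, but it is a simplified stand-in and not OpenRefine's own implementation. The function names (`fingerprint`, `cluster`) and the sample values are invented for illustration and are not taken from the Woolf export.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a key that ignores case, punctuation, accents and word order."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", value)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, turn punctuation into spaces, then split on whitespace.
    tokens = re.sub(r"[^\w\s]", " ", unaccented.strip().lower()).split()
    # Sort and de-duplicate the tokens so that word order does not matter.
    return " ".join(sorted(set(tokens)))


def cluster(values):
    """Group values by fingerprint; keep only groups with more than one distinct spelling."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return {key: sorted(vs) for key, vs in groups.items() if len(vs) > 1}


# Hypothetical place-name values (assumed for this example, not real catalogue data).
sample = ["St. Ives", "st ives", "Ives, St", "Falmouth", "Falmouth ", "Penzance"]
clusters = cluster(sample)

for key, variants in clusters.items():
    print(key, "->", variants)
```

Each group that comes back is a candidate for merging into one agreed spelling, which is the kind of decision OpenRefine lets you review and apply interactively.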

Click on the file above labelled CharlesWoolfSlideCollection_CALMexport.xlsx to download the data.

The images referred to in this metadata were catalogued by the team at Falmouth Archives, and are under copyright. The slides themselves were transferred from the Estate of Charles Woolf to the Institute of Cornish Studies in 2016. Metadata is reproduced with permission.

Other Data

The Library Carpentry lesson also makes a more science-oriented dataset available on its setup page. Feel free to use that dataset (especially if you're working through the LC materials independently).
