Adding week 02 lecture and lab
eveskew committed Feb 3, 2020
1 parent b7033ea commit 9cb289e
Showing 4 changed files with 573 additions and 1 deletion.
2 changes: 1 addition & 1 deletion .gitignore
@@ -4,10 +4,10 @@
.Ruserdata
*.DS_Store

drafts/*
homework/*.pdf
homework_keys/*
lab/*.pdf
lab_keys/*
lecture_drafts/*
paperwork/*
scripts/*
94 changes: 94 additions & 0 deletions data/Woolhouse_and_Brierley_RNA_virus_database_reduced.csv
@@ -0,0 +1,94 @@
"Species","Genus","Family","Envelope","Genome","Discovery.year","Vector","Inhalation","Ingestion","Sexual","Iatrogenic..inc..blood.","Fomites","Broken.Skin","Maternal","Direct.Contact","Transmission.level","Person.to.person","Host.range","Human.only","Non.human.primate","Other.mammals","Birds","Reptiles","Fish"
"Chapare mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",2008,"0","1","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Guanarito mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1991,"0","1","0","0","0","0","0","0","0","3","1","broad","0","0","1","0","0","0"
"Junín mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1958,"0","1","0","0","0","0","0","0","0","3","1","broad","0","0","1","0","0","0"
"Lassa mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1970,"0","1","0","0","0","0","0","1","0","3","1","broad","0","0","1","0","0","0"
"Lujo mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",2009,"0","1","0","0","0","1","0","0","1","3","1","broad","0","0","1","0","0","0"
"Lymphocytic choriomeningitis mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1934,"0","1","0","0","0","0","0","1","0","3","1","broad","0","1","1","0","0","0"
"Machupo mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1964,"0","1","0","0","0","0","0","0","0","3","1","broad","0","0","1","0","0","0"
"Mobala mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1985,"0","1*","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Pichindé mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1974,"0","1*","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Sabiá mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",1994,"0","1","0","0","0","0","0","0","0","3","1","broad","0","0","1","0","0","0"
"Whitewater Arroyo mammarenavirus","Mammarenavirus","Arenaviridae",1,"(-)ssRNA",2000,"0","1","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Mamastrovirus 1","Mamastrovirus","Astroviridae",0,"(+)ssRNA",1975,"0","0","1","0","0","0","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Mamastrovirus 6","Mamastrovirus","Astroviridae",0,"(+)ssRNA",2008,"0","0","1","0","0","0","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Mamastrovirus 8","Mamastrovirus","Astroviridae",0,"(+)ssRNA",2009,"0","0","1","0","0","0","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Mamastrovirus 9","Mamastrovirus","Astroviridae",0,"(+)ssRNA",2009,"0","0","1","0","0","0","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Alphacoronavirus 1","Alphacoronavirus","Coronaviridae",1,"(+)ssRNA",2007,"0","0","1*","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Human coronavirus 229E ","Alphacoronavirus","Coronaviridae",1,"(+)ssRNA",1966,"0","1","0","0","0","1","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Human coronavirus NL63","Alphacoronavirus","Coronaviridae",1,"(+)ssRNA",2004,"0","1","0","0","0","1","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Betacoronavirus 1","Betacoronavirus","Coronaviridae",1,"(+)ssRNA",1967,"0","1","0","0","0","1","0","0","0","4a","1","broad","0","0","1","0","0","0"
"Human coronavirus HKU1","Betacoronavirus","Coronaviridae",1,"(+)ssRNA",2005,"0","1","0","0","0","1","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Middle East respiratory syndrome-related coronavirus","Betacoronavirus","Coronaviridae",1,"(+)ssRNA",2012,"0","1","0","0","0","0","0","0","1","3","1","broad","0","0","1","0","0","0"
"Severe acute respiratory syndrome-related coronavirus","Betacoronavirus","Coronaviridae",1,"(+)ssRNA",2003,"0","1","0","0","0","0","0","0","0","4a","1","broad","0","0","1","0","0","0"
"Human torovirus","Torovirus","Coronaviridae",1,"(+)ssRNA",1984,"0","0","1","0","0","0","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Bundibugyo ebolavirus","Ebolavirus","Filoviridae",1,"(-)ssRNA",2008,"0","0","0","0","0","0","1","0","1","3","1","broad","0","0","1","0","0","0"
"Reston ebolavirus ","Ebolavirus","Filoviridae",1,"(-)ssRNA",1991,"0","0","0","0","0","0","1","0","1","2","0","broad","0","1","1","0","0","0"
"Sudan ebolavirus ","Ebolavirus","Filoviridae",1,"(-)ssRNA",1977,"0","0","0","0","0","1","1","0","1","3","1","broad","0","1","1","0","0","0"
"Tai forest ebolavirus","Ebolavirus","Filoviridae",1,"(-)ssRNA",1995,"0","0","0","0","0","1","1","0","1","2","0","broad","0","1","1","0","0","0"
"Zaire ebolavirus","Ebolavirus","Filoviridae",1,"(-)ssRNA",1977,"0","0","0","1","0","1","1","1","1","4a","1","broad","0","1","1","0","0","0"
"Marburg marburgvirus","Marburgvirus","Filoviridae",1,"(-)ssRNA",1968,"0","0","0","1","0","1","1","1","1","3","1","broad","0","1","1","0","0","0"
"Aroa virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",1971,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Bagaza virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",2009,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","0","1","0","0"
"Banzi virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1959,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Cacipacore virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",2011,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","1","0","0"
"Dengue virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1907,"1","0","0","0","0","0","0","1","0","4a","1","narrow","0","1","0","0","0","0"
"Edge Hill virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",1985,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Gadgets Gully virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",1991,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","0","1","0","0"
"Ilheus virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1947,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","1","0","0"
"Japanese encephalitis virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1933,"1","0","0","0","0","0","0","1","0","3","1","broad","0","1","1","1","1","0"
"Kokobera virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1985,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Kyasanur forest disease virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1957,"1","0","0","0","0","0","0","0","0","2","0","broad","0","1","1","0","0","0"
"Langat virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",1956,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Louping ill virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1934,"1","0","0","0","0","0","1","0","1","2","0","broad","0","0","1","1","0","0"
"Murray Valley encephalitis virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1952,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","1","0","0"
"Ntaya virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",1952,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","1","0","0"
"Omsk hemorrhagic fever virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1947,"1","1","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Powassan virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1959,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","1","1","0"
"Rio Bravo virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1962,"0","1","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"St. Louis encephalitis virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1933,"1","0","0","0","0","0","0","0","0","2","0","broad","0","1","1","1","1","0"
"Tembusu virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",1975,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","0","1","0","0"
"Tick-borne encephalitis virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1938,"1","0","0","0","0","0","0","0","0","2","0","broad","0","1","1","1","0","0"
"Uganda S virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",1952,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Usutu virus","Flavivirus","Flaviviridae",1,"(+)ssRNA",2009,"1","0","0","0","1","0","0","0","0","3","1","broad","0","0","1","1","0","0"
"Wesselsbron virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1957,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"West Nile virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1940,"1","0","0","0","0","0","0","1","0","3","1","broad","0","1","1","1","1","0"
"Yellow fever virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1901,"1","0","0","0","0","0","0","0","0","4a","1","broad","0","1","1","0","0","0"
"Zika virus ","Flavivirus","Flaviviridae",1,"(+)ssRNA",1954,"1","0","0","1","0","0","0","1","0","4a","1","broad","0","1","1","0","0","0"
"Hepacivirus C","Hepacivirus","Flaviviridae",1,"(+)ssRNA",1989,"0","0","0","1","1","0","1","1","1","4b","1","narrow","1","0","0","0","0","0"
"Pegivirus A","Pegivirus","Flaviviridae",1,"(+)ssRNA",1995,"0","0","0","1","1","0","0","1","1","4b","1","narrow","1","0","0","0","0","0"
"Bovine viral diarrhea virus 1","Pestivirus","Flaviviridae",1,"(+)ssRNA",1988,"0","1*","0","0","0","1*","0","0","1*","2","0","broad","0","0","1","0","0","0"
"Influenza A virus","Influenzavirus A","Orthomyxoviridae",1,"(-)ssRNA",1933,"0","1","0","0","0","0","0","1","1","4a","1","broad","0","1","1","1","0","0"
"Influenza B virus ","Influenzavirus B","Orthomyxoviridae",1,"(-)ssRNA",1940,"0","1","0","0","0","0","0","0","1","4a","1","broad","0","0","1","0","0","0"
"Influenza C virus ","Influenzavirus C","Orthomyxoviridae",1,"(-)ssRNA",1950,"0","1","0","0","0","0","0","0","1","4a","1","broad","0","0","1","0","0","0"
"Dhori virus","Thogotovirus","Orthomyxoviridae",1,"(-)ssRNA",1985,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","1","0","0"
"Thogoto virus ","Thogotovirus","Orthomyxoviridae",1,"(-)ssRNA",1969,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Avian avulavirus 1 ","Avulavirus","Paramyxoviridae",1,"(-)ssRNA",1943,"0","0","1","0","0","1","0","0","1","2","0","broad","0","0","0","1","0","0"
"Hendra henipavirus ","Henipavirus","Paramyxoviridae",1,"(-)ssRNA",1995,"0","0","0","0","0","0","0","0","1","2","0","broad","0","0","1","0","0","0"
"Nipah henipavirus ","Henipavirus","Paramyxoviridae",1,"(-)ssRNA",1999,"0","1","1","0","0","0","0","0","1","3","1","broad","0","0","1","0","0","0"
"Canine morbillivirus","Morbillivirus","Paramyxoviridae",1,"(-)ssRNA",1955,"0","1*","0","0","0","1","0","0","1*","2","0","broad","0","1","1","0","0","0"
"Measles morbillivirus ","Morbillivirus","Paramyxoviridae",1,"(-)ssRNA",1911,"0","1","0","0","0","0","0","1","1","4a","1","narrow","0","1","0","0","0","0"
"Human respirovirus 1 ","Respirovirus","Paramyxoviridae",1,"(-)ssRNA",1958,"0","1","0","0","0","0","0","0","0","4a","1","narrow","0","1","0","0","0","0"
"Human respirovirus 3 ","Respirovirus","Paramyxoviridae",1,"(-)ssRNA",1958,"0","1","0","0","0","0","0","0","0","4a","1","narrow","0","1","0","0","0","0"
"Human rubulavirus 2 ","Rubulavirus","Paramyxoviridae",1,"(-)ssRNA",1956,"0","1","0","0","0","0","0","0","0","4a","1","narrow","0","1","0","0","0","0"
"Human rubulavirus 4","Rubulavirus","Paramyxoviridae",1,"(-)ssRNA",1960,"0","1","0","0","0","0","0","0","0","4b","1","narrow","1","0","0","0","0","0"
"Mammalian rubulavirus 5","Rubulavirus","Paramyxoviridae",1,"(-)ssRNA",1959,"0","1*","0","0","0","0","0","0","1*","2","0","broad","0","1","1","0","0","0"
"Mumps rubulavirus ","Rubulavirus","Paramyxoviridae",1,"(-)ssRNA",1934,"0","0","0","0","0","0","0","1","1","4b","1","narrow","1","0","0","0","0","0"
"Simian rubulavirus","Rubulavirus","Paramyxoviridae",1,"(-)ssRNA",1968,"0","1*","0","0","0","0","0","0","1*","2","0","unknown","?","?","?","?","?","?"
"Sosuga rubulavirus","Rubulavirus","Paramyxoviridae",1,"(-)ssRNA",2014,"0","1*","0","0","0","0","0","0","1*","2","0","narrow","0","0","1","0","0","0"
"Tioman rubulavirus","Rubulavirus","Paramyxoviridae",1,"(-)ssRNA",2007,"0","1","0","0","0","0","0","0","1","2","0","broad","0","0","1","0","0","0"
"Candiru phlebovirus","Phlebovirus","Phenuviridae",1,"(-)ssRNA",1983,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Punta Toro phlebovirus ","Phlebovirus","Phenuviridae",1,"(-)ssRNA",1970,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"Rift Valley fever phlebovirus ","Phlebovirus","Phenuviridae",1,"(-)ssRNA",1931,"1","1","0","0","0","0","1","0","1","3","1","broad","0","0","1","0","0","0"
"Sandfly fever Naples phlebovirus ","Phlebovirus","Phenuviridae",1,"(-)ssRNA",1944,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","0","0","0"
"SFTS phlebovirus","Phlebovirus","Phenuviridae",1,"(-)ssRNA",2012,"1","0","0","0","1","0","0","0","1","3","1","broad","0","0","1","1","0","0"
"Uukuniemi phlebovirus","Phlebovirus","Phenuviridae",1,"(-)ssRNA",1970,"1","0","0","0","0","0","0","0","0","2","0","broad","0","0","1","1","0","0"
"Primate T-lymphotropic virus 1 ","Deltaretrovirus","Retroviridae",1,"ssRNA-RT",1980,"0","0","0","1","0","0","0","1","1","4a","1","narrow","0","1","0","0","0","0"
"Primate T-lymphotropic virus 2 ","Deltaretrovirus","Retroviridae",1,"ssRNA-RT",1982,"0","0","0","1","0","0","0","1","1","4a","1","narrow","0","1","0","0","0","0"
"Primate T-lymphotropic virus 3","Deltaretrovirus","Retroviridae",1,"ssRNA-RT",2005,"0","0","0","1","0","0","0","1","1","4a","1","narrow","0","1","0","0","0","0"
"Human immunodeficiency virus 1 ","Lentivirus","Retroviridae",1,"ssRNA-RT",1983,"0","0","0","1","1","0","1","1","1","4b","1","narrow","1","0","0","0","0","0"
"Human immunodeficiency virus 2 ","Lentivirus","Retroviridae",1,"ssRNA-RT",1986,"0","0","0","1","1","0","1","1","1","4b","1","narrow","1","0","0","0","0","0"
"Simian immunodeficiency virus","Lentivirus","Retroviridae",1,"ssRNA-RT",1992,"0","0","0","0","1","0","1","0","1","2","0","narrow","0","1","0","0","0","0"
"African green monkey simian foamy virus","Spumavirus","Retroviridae",1,"ssRNA-RT",1997,"0","0","0","0","0","0","1","0","1","2","0","narrow","0","1","0","0","0","0"
"Macaque simian foamy virus","Spumavirus","Retroviridae",1,"ssRNA-RT",2002,"0","0","0","0","0","0","1","0","1","2","0","narrow","0","1","0","0","0","0"
"Simian foamy virus","Spumavirus","Retroviridae",1,"ssRNA-RT",1971,"0","0","0","0","0","0","1","0","1","2","0","narrow","0","1","0","0","0","0"
127 changes: 127 additions & 0 deletions lab/lab_week_02.Rmd
@@ -0,0 +1,127 @@
---
title: "EEEB UN3005/GR5005 \nLab - Week 02 - 03 and 05 February 2020"
author: "USE YOUR NAME HERE"
output: pdf_document
fontsize: 12pt
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(dplyr)
```


# Data Cleaning

To practice data cleaning in this week's lab, we'll use a subset of [published data](https://www.nature.com/articles/sdata201817) on RNA viruses collated by Mark Woolhouse and Liam Brierley. The entire dataset contains trait information gathered from the scientific literature on 214 RNA viruses that are known to infect humans. See the ["Data Records"](https://www.nature.com/articles/sdata201817#data-records) section of the published paper for information on the variables included in the full dataset. I've downloaded the data, converted it to a CSV file for your ease of use, and pulled out only a subset of the data to make it easier to work with. Our data subset contains information on 93 RNA viruses. Find the data subset on the class CourseWorks page as `Woolhouse_and_Brierley_RNA_virus_database_reduced.csv`.
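
As a rough illustration of why this dataset needs cleaning, the sketch below reads three abbreviated rows in the style of the class CSV from an inline string rather than from the file. The object names `csv_text` and `toy` are made up for this example, and only four of the real columns are kept. Some entries in the transmission columns carry an asterisk (e.g., `"1*"`), and a single such entry is enough for R to read the whole column as text.

```{r}
# Three abbreviated rows in the style of the class CSV, supplied inline so
# the example does not depend on any file path.
csv_text <- '"Species","Envelope","Discovery.year","Inhalation"
"Lassa mammarenavirus",1,1970,"1"
"Mobala mammarenavirus",1,1985,"1*"
"Mamastrovirus 1",0,1975,"0"'

# stringsAsFactors = FALSE keeps text columns as character in any R version
toy <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# A column of plain numbers is read as integer
class(toy$Envelope)

# One "1*" entry makes the whole Inhalation column character
class(toy$Inhalation)

summary(toy)
```

In the `summary()` output, numeric columns get a five-number summary while character columns only report their length and class, which is a quick way to spot columns that did not import as numbers.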


## Exercise 1: Data Import

Download the Woolhouse and Brierley data, and import it into R, assigning it to an object named `viruses`. Run `summary()` on this object. You'll get a load of information in return, but this is just to familiarize yourself broadly with the dataset.

```{r}
```


## Exercise 2: Code Translation

For this series of exercises, you'll be given a chunk of code that does some data manipulation in base R. Your goal is to describe what this code is doing (in text below the code) and then translate that data manipulation operation using `dplyr` functions (in the empty code chunks). The `dplyr` solution will hopefully be simpler and more intuitive (which is why I'm encouraging you to learn `dplyr`). However, as an R user, you'll also be seeing lots of code written with base R functions, so it's best to understand the basics of data manipulation with these built-in functions as well.
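
To make the idea of "translation" concrete, here is a minimal sketch on a made-up data frame. The `pets` data and the result names are hypothetical and have nothing to do with the virus dataset. It shows one base R subsetting operation and one possible `dplyr` pipeline that returns the same rows and columns.

```{r}
library(dplyr)

# Hypothetical toy data, not part of the lab dataset
pets <- data.frame(
  name = c("Ada", "Bo", "Cy", "Di"),
  kind = c("cat", "dog", "cat", "dog"),
  age = c(3, 7, 5, 2),
  stringsAsFactors = FALSE
)

# Base R: a logical vector picks the rows, a character vector picks the columns
base_result <- pets[pets$age > 2, c("name", "age")]

# dplyr: the same operation written as a pipeline of verbs
dplyr_result <- pets %>%
  filter(age > 2) %>%
  select(name, age)

base_result
dplyr_result
```

Both versions keep the pets older than 2 and only the `name` and `age` columns. The base R version packs everything into one pair of square brackets, while the `dplyr` version names each step.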

a)

- Base R code:

```{r}
viruses[viruses$Family == "Coronaviridae", ]
```

- `dplyr` equivalent:

```{r}
```

b)

- Base R code:

```{r}
viruses[1:10, c(1, 2, 3, 17)]
```

Hint: Look at the `dplyr` function called `slice()` using `?slice()`.

- `dplyr` equivalent:

```{r}
```

c)

- Base R code:

```{r}
sort(viruses$Species[viruses$Genome == "(+)ssRNA"])
```

- `dplyr` equivalent:

```{r}
```


## Exercise 3: Code Annotation

In the following series of exercises, you will be provided with functioning R code of `dplyr` data manipulation pipelines. Your goal is to comment these code blocks line-by-line, describing what each function is doing to create the final output. Please note, if you're not sure how a given line is functioning within the whole code block, this type of code is easily run in successively larger chunks. In other words, start by running the first line, then the first two lines, then the first three lines, etc. in order to see how the output changes. Additionally, reviewing function help files (e.g., `?some_function()`) may shed light on what's happening.
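
The "successively larger chunks" approach described above might look like the following sketch, again on made-up data. The `scores` data frame and the `step1` to `step3` names are hypothetical. Each step reruns the pipeline with one more line added, so you can compare the outputs and see what the newest line contributed.

```{r}
library(dplyr)

# Hypothetical toy data, not part of the lab dataset
scores <- data.frame(
  student = c("A", "B", "C", "D"),
  section = c("x", "x", "y", "y"),
  points = c(8, 5, 9, 6),
  stringsAsFactors = FALSE
)

# First line only: the untouched data
step1 <- scores

# First two lines: rows with fewer than 6 points are dropped
step2 <- scores %>%
  filter(points >= 6)

# First three lines: the remaining rows are sorted from high to low
step3 <- scores %>%
  filter(points >= 6) %>%
  arrange(desc(points))

step1
step2
step3
```

When running a partial pipeline interactively, make sure the last line you run does not end in `%>%`, or R will wait for more input.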

a)

```{r}
viruses %>%
  mutate(Envelope_mod = ifelse(Envelope == 1, "enveloped", "not enveloped")) %>%
  filter(Discovery.year >= 2000) %>%
  select(Family, Species, Envelope_mod) %>%
  arrange(Family, Species)
```

b)

```{r}
viruses %>%
  group_by(Family) %>%
  summarize(
    n = n(),
    n_enveloped = sum(Envelope),
    proportion_enveloped = (n_enveloped/n)*100
  ) %>%
  arrange(desc(n))
```

What do you notice about the `proportion_enveloped` column?

c)

```{r}
viruses %>%
  group_by(Family) %>%
  summarize(n_genome_types = n_distinct(Genome)) %>%
  arrange(desc(n_genome_types))
```

What do you learn from this data summary about the number of distinct genome types per viral family?

## Bonus Exercise: Install `rethinking`

If you have not yet installed the `rethinking` package, now would be a good time to try to do so, using the instructions at https://github.com/rmcelreath/rethinking.