Adding week 03 lecture, lab, and data
eveskew committed Feb 9, 2020
1 parent 699ab57 commit b2e0e21
Showing 7 changed files with 2,323 additions and 1 deletion.
1 change: 1 addition & 0 deletions .gitignore
@@ -8,6 +8,7 @@ drafts/*
homework/*.pdf
homework_keys/*
lab/*.pdf
lab/working_directory_demo.Rmd
lab_keys/*
paperwork/*
scripts/*
1,705 changes: 1,705 additions & 0 deletions data/gapminder.csv

Large diffs are not rendered by default.

63 changes: 63 additions & 0 deletions data/mammals.csv
@@ -0,0 +1,63 @@
"species","body","brain"
"Arctic fox",3.385,44.5
"Owl monkey",0.48,15.5
"Mountain beaver",1.35,8.1
"Cow",465,423
"Grey wolf",36.33,119.5
"Goat",27.66,115
"Roe deer",14.83,98.2
"Guinea pig",1.04,5.5
"Verbet",4.19,58
"Chinchilla",0.425,6.4
"Ground squirrel",0.101,4
"Arctic ground squirrel",0.92,5.7
"African giant pouched rat",1,6.6
"Lesser short-tailed shrew",0.005,0.14
"Star-nosed mole",0.06,1
"Nine-banded armadillo",3.5,10.8
"Tree hyrax",2,12.3
"N.A. opossum",1.7,6.3
"Asian elephant",2547,4603
"Big brown bat",0.023,0.3
"Donkey",187.1,419
"Horse",521,655
"European hedgehog",0.785,3.5
"Patas monkey",10,115
"Cat",3.3,25.6
"Galago",0.2,5
"Genet",1.41,17.5
"Giraffe",529,680
"Gorilla",207,406
"Grey seal",85,325
"Rock hyrax-a",0.75,12.3
"Human",62,1320
"African elephant",6654,5712
"Water opossum",3.5,3.9
"Rhesus monkey",6.8,179
"Kangaroo",35,56
"Yellow-bellied marmot",4.05,17
"Golden hamster",0.12,1
"Mouse",0.023,0.4
"Little brown bat",0.01,0.25
"Slow loris",1.4,12.5
"Okapi",250,490
"Rabbit",2.5,12.1
"Sheep",55.5,175
"Jaguar",100,157
"Chimpanzee",52.16,440
"Baboon",10.55,179.5
"Desert hedgehog",0.55,2.4
"Giant armadillo",60,81
"Rock hyrax-b",3.6,21
"Raccoon",4.288,39.2
"Rat",0.28,1.9
"E. American mole",0.075,1.2
"Mole rat",0.122,3
"Musk shrew",0.048,0.33
"Pig",192,180
"Echidna",3,25
"Brazilian tapir",160,169
"Tenrec",0.9,2.6
"Phalanger",1.62,11.4
"Tree shrew",0.104,2.5
"Red fox",4.235,50.4
107 changes: 107 additions & 0 deletions lab/lab_week_03.Rmd
@@ -0,0 +1,107 @@
---
title: "EEEB UN3005/GR5005 \nLab - Week 03 - 10 and 12 February 2020"
author: "USE YOUR NAME HERE"
output: pdf_document
fontsize: 12pt
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(dplyr)
library(ggplot2)
```


# Data Processing and Visualization


## Exercise 1: Importing and Cleaning Snake Capture Data

In class you've already seen the `ebay_snake_captures` dataset, which shows snake capture results for approximately a year of sampling at one wetland in South Carolina, [Ellenton Bay](http://archive-srel.uga.edu/set-asides/area1.html). As before, you can find this data as a CSV file (`ebay_snake_captures.csv`) on the class CourseWorks site (if you don't already have it). Import this data into R, assigning it to an object called `e`. Rename the six columns of this data frame as follows: "date", "time", "trap_type", "species", "count", "comments". Use `head()` to confirm your change of column names.
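
As a rough sketch of the import-and-rename pattern, the code below reads a made-up three-column CSV supplied as text rather than the course file. The object name `toy`, its contents, and its column names are all hypothetical.

```r
# Made-up CSV text standing in for a file on disk (not the course data)
csv_text <- "Date,Species,Num\n1-Jan-19,A,2\n2-Jan-19,B,1"

# read.csv() accepts either a file path or, as here, a `text` argument
toy <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Overwrite all column names at once with a character vector of equal length
names(toy) <- c("date", "species", "count")

head(toy)
```

With a real file, you would pass the file path as the first argument to `read.csv()` instead of using `text`.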

```{r}
```


## Exercise 2: Working With Dates

Dates can be very tricky to work with in R. Think about the general issues we might have. In everyday usage, we sometimes refer to dates using the names of days of the week and months of the year. In other cases, we represent the same data using just numbers (e.g., days 1-31, months 1-12). And in different parts of the world, people use different conventions when writing out dates (e.g., some put the month first, others the day).

Use the function `str()` to examine the structure of the `date` column in the `e` data frame. How is R currently representing this data?

```{r}
```

Create a modified date column in `e` called `date_mod` using the following code: `as.Date(as.character(e$date), format = "%d-%b-%y")`. What is the structure of your new `date_mod` column?
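
To see what that format string is doing, here is a small sketch on two made-up date strings (not taken from the snake data). It assumes an English locale, since `%b` matches abbreviated month names in the current locale.

```r
# Made-up date strings written as day-month-year
raw_dates <- c("05-Mar-19", "17-Oct-19")

# %d = day of month, %b = abbreviated month name, %y = two-digit year
parsed <- as.Date(raw_dates, format = "%d-%b-%y")

# The result is no longer text; R now stores it as a Date vector
class(parsed)
str(parsed)
```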

```{r}
```


## Exercise 3: Creating Monthly Summary Capture Counts

Given that the `e` data contains information on snake captures throughout the year, one might naturally be interested in how snake captures vary over time. One way to explore this would be to summarize how many snakes of each species were captured in a given month. However, right now, our data frame represents an even finer scale of data: each row of data represents a specific day and time that a snake species was captured rather than a monthly summary count.

To get data appropriate for downstream use, first, create a new variable in your `e` data frame called `month_of_capture` that indicates the month of the given observation using the following code: `as.numeric(format(e$date_mod, "%m"))`.
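
As a minimal illustration of that conversion, using two made-up dates rather than the lab data: `format()` turns each date into a two-character month string, and `as.numeric()` turns that string into a number.

```r
# Two made-up dates (hypothetical values)
d <- as.Date(c("2019-03-05", "2019-10-17"))

# format(d, "%m") gives "03" and "10"; as.numeric() drops the leading zero
month_num <- as.numeric(format(d, "%m"))
month_num
```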

Next, create a data frame called `e2` that represents each unique month-snake species combination found within `e` and the associated total capture count (call this new variable `monthly_capture_count`).

Should `e2` have more or fewer rows of data than `e`? Can you show this is the case? Use some summary functions to investigate `e2` and ensure you have the appropriate dataset for further analyses.
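
One way to approach a grouped summary like this is with `group_by()` and `summarize()` from `dplyr`. The sketch below uses a made-up data frame; `toy`, `toy_summary`, and their column names are hypothetical, and other approaches would work as well.

```r
library(dplyr)

# Made-up capture records: several rows can share a month and species
toy <- data.frame(
  month = c(1, 1, 1, 2, 2),
  species = c("A", "A", "B", "A", "A"),
  count = c(2, 1, 4, 3, 5)
)

# One output row per unique month-species combination, with counts summed
toy_summary <- toy %>%
  group_by(month, species) %>%
  summarize(total = sum(count)) %>%
  ungroup()

# Compare the number of rows before and after summarizing
nrow(toy)
nrow(toy_summary)
```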

```{r}
```


## Exercise 4: Bar Charts

One way you might want to visualize snake captures over time in the `e2` data is with a bar chart. Generate a bar chart using `ggplot()`. The month of capture should appear on the x-axis and capture counts on the y-axis.
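
For reference, here is a bare-bones bar chart on made-up data. It assumes `geom_col()`, which draws bar heights directly from a y variable; the lab does not name a specific geom, so treat this as one option among several. The data frame `toy_counts` is hypothetical.

```r
library(ggplot2)

# Made-up monthly totals (hypothetical values)
toy_counts <- data.frame(
  month = c(1, 2, 3),
  n = c(4, 9, 6)
)

# geom_col() uses the y aesthetic as the bar height
p <- ggplot(toy_counts, aes(x = month, y = n)) +
  geom_col()
p
```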

```{r}
```

What is (roughly) the maximum bar height you see displayed? How does this compare with the maximum value of `monthly_capture_count` in the `e2` data? Why is this the case?


## Exercise 5: Scatter Plots

Now let's examine the `e2` data with scatter plots. Using `ggplot()`, generate a scatter plot of `monthly_capture_count` against `month_of_capture`. Additionally, build the plot such that the color of the data points corresponds to `species`.
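
The general pattern looks something like the sketch below, shown on a made-up data frame (`toy` and its columns are hypothetical). Mapping a variable to `color` inside `aes()` gives each group its own point color and adds a legend.

```r
library(ggplot2)

# Made-up data with a grouping column
toy <- data.frame(
  month = c(1, 2, 3, 1, 2, 3),
  n = c(4, 9, 6, 1, 2, 5),
  grp = c("A", "A", "A", "B", "B", "B")
)

# color is mapped inside aes(), so it varies with the values of grp
p <- ggplot(toy, aes(x = month, y = n, color = grp)) +
  geom_point()
p
```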

```{r}
```

Notice how the x-axis is rather ugly? Let's work step-by-step to make this look better. First, generate a vector named `my.breaks` that contains the numbers 1, 4, 7, and 10. Next, examine the `month.abb` vector that is built into R. Generate a vector called `my.labels` that contains the first, fourth, seventh, and tenth elements of `month.abb`.
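
If indexing a built-in vector is unfamiliar, here is a short sketch using different positions from the ones requested above. The names `quarter_ends` and `quarter_labels` are hypothetical.

```r
# month.abb is a built-in character vector of the 12 abbreviated month names
month.abb

# A numeric vector of positions to pull out
quarter_ends <- c(3, 6, 9, 12)

# Square brackets select the elements at those positions
quarter_labels <- month.abb[quarter_ends]
quarter_labels
```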

```{r}
```

Now that we have these vectors, let's use them to modify the look of our plot. We can specify where on our plot the x-axis labels should appear and what they should say by adding `scale_x_continuous()` to the plot. The relevant arguments are `breaks` (controlling where the x-axis labels land) and `labels` (controlling what the labels read). Regenerate your previous plot but with `scale_x_continuous()` added, with `breaks` equal to `my.breaks` and `labels` equal to `my.labels`.
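
Here is a sketch of how `breaks` and `labels` fit together, using made-up data and a different set of break points. The objects `toy`, `toy_breaks`, and `toy_labels` are hypothetical. Note that `breaks` and `labels` must be the same length.

```r
library(ggplot2)

# Made-up data spanning months 1 through 12
toy <- data.frame(
  month = 1:12,
  n = c(1, 3, 2, 5, 8, 9, 7, 6, 4, 3, 2, 1)
)

toy_breaks <- c(3, 6, 9, 12)
toy_labels <- month.abb[toy_breaks]

# Tick marks land at toy_breaks and display the matching toy_labels
p <- ggplot(toy, aes(x = month, y = n)) +
  geom_point() +
  scale_x_continuous(breaks = toy_breaks, labels = toy_labels)
p
```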

```{r}
```

Now, instead of distinguishing species based on color, create a faceted plot with one panel per species in the dataset.
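
One common way to facet is `facet_wrap()`, which takes a formula naming the variable to split on. The sketch below assumes that function and uses made-up data; `toy` and `grp` are hypothetical names.

```r
library(ggplot2)

# Made-up data with two groups
toy <- data.frame(
  month = c(1, 2, 3, 1, 2, 3),
  n = c(4, 9, 6, 1, 2, 5),
  grp = c("A", "A", "A", "B", "B", "B")
)

# facet_wrap(~ grp) draws one panel for each unique value of grp
p <- ggplot(toy, aes(x = month, y = n)) +
  geom_point() +
  facet_wrap(~ grp)
p
```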

```{r}
```

Look at the y-axes of the various plots you've produced. By default, `ggplot()` will show all facets with the same y-axis range. However, you can see that one species in the dataset, *Seminatrix pygaea* (check 'em out [here](https://srelherp.uga.edu/snakes/sempyg.htm)), has by far the highest monthly capture count, which means all other species' data is relatively difficult to inspect by comparison. Modify your previous plot to exclude *Seminatrix pygaea* so that any variation in other species' data will be more apparent.
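
Dropping one group before plotting can be sketched with `filter()` from `dplyr`, shown here on made-up data where group "A" dwarfs the others. The names `toy`, `toy_small`, and `grp` are hypothetical, and base R subsetting would work just as well.

```r
library(dplyr)
library(ggplot2)

# Made-up data: group "A" has much larger values than "B" or "C"
toy <- data.frame(
  month = c(1, 2, 1, 2, 1, 2),
  grp = c("A", "A", "B", "B", "C", "C"),
  n = c(50, 60, 2, 3, 1, 4)
)

# Keep every row whose group is not "A"
toy_small <- toy %>%
  filter(grp != "A")

# The shared y-axis now spans only the remaining, smaller values
p <- ggplot(toy_small, aes(x = month, y = n)) +
  geom_point() +
  facet_wrap(~ grp)
p
```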

```{r}
```


## Bonus Exercise: Install `rethinking`

If you have not yet installed the `rethinking` package, now would be a good time to try to do so, using the instructions at https://github.com/rmcelreath/rethinking.