-
Notifications
You must be signed in to change notification settings - Fork 0
/
Week3.rmd
64 lines (55 loc) · 2.62 KB
/
Week3.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
---
title: "Week3"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
###1.
Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either "DATA" or "STATISTICS"
```{r}
data=read.csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/all-ages.csv',header=T)
head(data)
```
```{r}
library(dplyr)
library(tidyverse)
ds1 = data[str_detect(data$Major, regex("DATA",ignore_case = TRUE)) | str_detect(data$Major, regex("STATISTICS",ignore_case = TRUE)) ,]
head(ds1)
```
###2
Write code that transforms the data below:
[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"
Into a format like this:
c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")
```{r}
strng= paste("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry", sep=',')
strng = paste('c("', gsub(pattern = ",", replacement = '\",\"', strng), '")')
strng = gsub(pattern = '\" ', replacement = '\"', strng)
strng = gsub(pattern = ' \"', replacement = '\"', strng)
message(strng)
```
###3
Describe, in words, what these expressions will match:
(.)\1\1 - 1st capturing group - any char, match the same char as 1st group, match the same char as 1st group
"(.)(.)\\2\\1" - 1st capturing group any char, 2nd capturing group any char, match the same char as 2nd group, , match the same char as 1st group
(..)\1 - found all strings that have a repeated pair of letters.
"(.).\\1.\\1" - 1st capturing group any char, any char, repeat the same char twice
"(.)(.)(.).*\\3\\2\\1" - find three charters match in reverse order
###4
Construct regular expressions to match words that:
Start and end with the same character.
```{r}
str_view(c("qwq", "qwe"), "^q.*q$",match = TRUE)
```
Contain a repeated pair of letters (e.g. "church" contains "ch" repeated twice.)
```{r}
str_view(c("chur", "church", "chch"), "(..)\\1",match = TRUE)
```
Contain one letter repeated in at least three places (e.g. "eleven" contains three "e"s.)
```{r}
str_view(c("eleven", "church"), "(..)\\1{3}",match = TRUE)
```