"R/RStudio Beginner Workshop"

INSTALL

If you don't have R and RStudio installed, follow the link below and install both.

https://posit.co/download/rstudio-desktop/

INTRODUCTION TO BASICS

Arithmetic with R


# An addition
10 + 7

# A subtraction
10 - 5 

# A multiplication
3 * 5

# A division
(5 + 5) / 2 

# Exponentiation
2^5

# Modulo
28 %% 6
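
Modulo pairs naturally with integer division. As a small sketch using the same numbers as above, `%/%` (R's integer-division operator, which is not otherwise covered in this workshop) gives the whole-number quotient, while `%%` gives the remainder. The variable names are chosen only for this example.

```r
# Integer division: how many whole times 6 fits into 28
quotient <- 28 %/% 6

# Modulo: what is left over after that division
remainder <- 28 %% 6

quotient
remainder

# Together they rebuild the original number
6 * quotient + remainder
```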

Variables


x <- 17
x

BASIC DATA TYPES IN R

Numerics & Integers

  • Decimal values like 1.7 are called numerics.

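  • Whole numbers like 13 are called integers. Integers are also numerics. Note that R stores a plain 13 as a numeric by default; writing 13L creates a value of class integer.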


a <- 1.7
b <- 13


# We can check the data type with the class() function
class(a)
class(b)
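
One detail worth checking for yourself: a number typed without a suffix is stored as a numeric, even when it is a whole number. The sketch below compares it with a value written with the `L` suffix. `b_int` is a name made up for this example.

```r
b <- 13       # no suffix: stored as numeric
b_int <- 13L  # L suffix: stored as integer

class(b)
class(b_int)

# Both still count as numeric values
is.numeric(b)
is.numeric(b_int)
```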

Logical Operations


z <- a > b
z


class(z)
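
Besides `>`, R has other comparison operators, and logical values can be combined. The following is a minimal sketch with the same `a` and `b` as above. The thresholds 1 and 20 are arbitrary values picked for illustration.

```r
a <- 1.7
b <- 13

# Comparisons return TRUE or FALSE
a < b
a == b
a != b

# Combine conditions with & (and), | (or), and ! (not)
both   <- (a > 1) & (b > 20)
either <- (a > 1) | (b > 20)
negated <- !(a > b)

both
either
negated
```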

Character

  • Text (or string) values are called characters.

k <- "BMG Lab"
k


x <- as.character(x)
x


fname <- "John"
lname <- "Doe"
# paste() joins the values with a space by default
paste(fname, lname)

# Open the help page for paste()
?paste
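
The help page for `paste()` describes a `sep` argument and a related function, `paste0()`. As a short sketch of how they change the result (the separator `", "` is just an example choice):

```r
fname <- "John"
lname <- "Doe"

# Default separator is a single space
full_name <- paste(fname, lname)

# A custom separator
last_first <- paste(lname, fname, sep = ", ")

# paste0() uses no separator at all
joined <- paste0(fname, lname)

full_name
last_first
joined
```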

VECTORS

Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data. In other words, a vector is a simple tool to store data.

In R, you create a vector with the combine function c(). Place the elements between the parentheses, separated by commas.


# Creating vectors
vector1 <- c(7, 13, 17, 20)
vector2 <- c("a", "b", "c", "d")
vector3 <- c(10, 20, 30, 40)

You can find the length of a vector with the length() function.

length(vector1)

# Combining vectors (mixing numbers and text converts every element to character)
c(vector1, vector2)
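
Because a vector holds a single data type, combining a numeric vector with a character vector converts the numbers to text. A minimal sketch to confirm this, where `combined` is a name introduced only for this example:

```r
vector1 <- c(7, 13, 17, 20)
vector2 <- c("a", "b", "c", "d")

combined <- c(vector1, vector2)
combined

# The numbers are now stored as text
class(combined)
length(combined)
```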

Arithmetic with Vectors

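vector1
vector3
# Multiply every element by 3
vector1 * 3
# Subtract 5 from every element
vector1 - 5
# Add two vectors element by element
vector1 + vector3
2 * (vector1 + vector3) / 7

# Round the result to 2 decimal places with round()
round(2 * (vector1 + vector3) / 7, 2)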
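
Arithmetic between two vectors of the same length works element by element, and summary functions reduce a vector to a single value. The sketch below uses the same two vectors. `sum()` and `mean()` are base R functions that this workshop has not introduced yet.

```r
vector1 <- c(7, 13, 17, 20)
vector3 <- c(10, 20, 30, 40)

# Element-by-element operations
added <- vector1 + vector3
multiplied <- vector1 * vector3

added
multiplied

# Summaries of a whole vector
sum(vector1)
mean(vector3)
```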

Vector Indexing

In R, indexing starts at 1, unlike Python, where it starts at 0.


d <- c("A", "B", "C", "D", "E", "F")

d[2]

# Selecting a range of indexes
d[2:4]
# Selecting several specific indexes
d[c(1,3,5)]

# The order of the indexes sets the order of the result
d[c(4,2,5)]
# Selecting values with logical indexing (TRUE keeps the element)
d[c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE)]
# Excluding a value with a negative index (d itself is not changed)
d[-3]
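
In practice, the logical vector used for indexing usually comes from a comparison rather than being typed by hand. A minimal sketch, assuming a made-up numeric vector `scores`:

```r
# Hypothetical values, used only for this example
scores <- c(7, 13, 17, 20)

# The comparison produces one TRUE/FALSE per element
is_large <- scores > 10
is_large

# Use it to keep only the matching elements
large_scores <- scores[is_large]
large_scores

# Conditions can be combined inside the brackets
mid_scores <- scores[scores > 10 & scores < 20]
mid_scores
```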

Naming A Vector

y <- c("Duygu", "KEREMİTÇİ")
y
names(y) <- c("First", "Last")
y
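
Once a vector has names, its elements can be selected by name instead of by position. A short sketch, using the names "First" and "Last":

```r
y <- c("Duygu", "KEREMİTÇİ")
names(y) <- c("First", "Last")

# Select an element by its name
y["First"]

# List all names of the vector
names(y)

# unname() drops the name and returns only the value
first_value <- unname(y["First"])
first_value
```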

MATRIX

In R, a matrix is a collection of elements of the same data type (numeric, character, or logical) arranged into a fixed number of rows and columns.

You can construct a matrix in R with the matrix() function.

A <- matrix(c(5, 26, 13, 6, 7, 17), # the data
            nrow = 2,               # the number of rows
            ncol = 3,               # the number of columns
            byrow = TRUE)           # fill the matrix row by row
A
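
The `byrow` argument decides the order in which the data fills the matrix. As a sketch, the same six values are arranged both ways below. `values`, `by_row`, and `by_col` are names chosen for this example.

```r
values <- c(5, 26, 13, 6, 7, 17)

# Fill row by row: the first row is 5, 26, 13
by_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)

# Fill column by column (the default): the first column is 5, 26
by_col <- matrix(values, nrow = 2, ncol = 3)

by_row
by_col

# dim() returns the number of rows and columns
dim(by_row)
```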

Naming A Matrix

rownames(A) <- c("x","y")
colnames(A) <- c("a","b","c")
A

Selecting Values with Index

# A[row, column]
A[1, 3]
# 1st row
A[1, ]

# 2nd column
A[, 2]
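
Since the matrix has row and column names, the same selections can be written with names instead of positions. A minimal sketch that rebuilds `A` so it runs on its own:

```r
A <- matrix(c(5, 26, 13, 6, 7, 17), nrow = 2, ncol = 3, byrow = TRUE)
rownames(A) <- c("x", "y")
colnames(A) <- c("a", "b", "c")

# Same element as A[1, 3]
A["x", "c"]

# Whole row "y"
A["y", ]

# Several columns at once
A[, c("a", "c")]
```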

Transpose The Matrix

# Creating a new matrix (filled column by column, the default)
B <- matrix(c(2,4,6,8,10,12),
            nrow = 3,
            ncol = 2)
B
t(B)

Combining The Matrices

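# Adding one or more columns with cbind() (both matrices need the same number of rows)
cbind(A, t(B))
# Adding one or more rows with rbind() (both matrices need the same number of columns)
rbind(A, t(B))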
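
Combining only works when the shapes are compatible, so it can help to check the dimensions before and after. A short sketch with the same two matrices, rebuilt here without names so the block runs on its own:

```r
A <- matrix(c(5, 26, 13, 6, 7, 17), nrow = 2, ncol = 3, byrow = TRUE)
B <- matrix(c(2, 4, 6, 8, 10, 12), nrow = 3, ncol = 2)

# A is 2 x 3, and transposing B (3 x 2) also gives 2 x 3
dim(A)
dim(t(B))

# cbind() keeps the rows and adds columns
wide <- cbind(A, t(B))
dim(wide)

# rbind() keeps the columns and adds rows
tall <- rbind(A, t(B))
dim(tall)
```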

Arithmetic With Matrices

A * 2
A + t(B)
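
The operators above work element by element. Matrix multiplication in the linear-algebra sense uses a different operator, `%*%`, which this workshop does not otherwise cover. The sketch below contrasts the two with the same matrices:

```r
A <- matrix(c(5, 26, 13, 6, 7, 17), nrow = 2, ncol = 3, byrow = TRUE)
B <- matrix(c(2, 4, 6, 8, 10, 12), nrow = 3, ncol = 2)

# Element-by-element product: both matrices must have the same shape
elementwise <- A * t(B)
elementwise

# Matrix product: columns of A (3) must match rows of B (3)
product <- A %*% B
product
dim(product)
```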

DATA FRAMES

In basic terms, a data frame is a two-dimensional array-like structure where:

  • Each column represents a variable and can contain data of different types (numeric, character, factor, etc.).
  • Each row represents an observation or a record.

name <- c("John", "Josh", "Mark")
age <- c(32, 29, 25)
n <- c(FALSE, TRUE, FALSE)

# Using the data.frame() function to create a data frame
df <- data.frame(name, age, n)
df
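
A few base R functions give a quick overview of a data frame you have just built. As a minimal sketch with the same three vectors:

```r
name <- c("John", "Josh", "Mark")
age <- c(32, 29, 25)
n <- c(FALSE, TRUE, FALSE)
df <- data.frame(name, age, n)

# Number of rows (observations) and columns (variables)
nrow(df)
ncol(df)

# Column names are taken from the vector names
names(df)

# A single column is an ordinary vector
df$age
mean(df$age)
```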

# mtcars is a dataset that ships with R
mtcars
# First 6 rows
head(mtcars)
# Last 6 rows
tail(mtcars)

Investigating the Structure of a Data Frame

str(mtcars)
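
`str()` is one of several ways to inspect a data frame. As a sketch, the base R functions below report the size, the column names, and basic statistics of `mtcars`:

```r
# Rows and columns
dim(mtcars)

# Column names
names(mtcars)

# Minimum, quartiles, mean, and maximum for one column
summary(mtcars$qsec)
```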

Selecting Values From Data Frame

# Using indexes
# df[row, column]
mtcars[13,7]
# Using the names of rows and columns
mtcars["Merc 450SL", "qsec"]
# Getting all values in a row by name
mtcars["Merc 450SL", ]
# Getting all values in a column by name
mtcars[["qsec"]]
mtcars$qsec
# Selecting the first 5 values of a column
mtcars[1:5, "qsec"]

Sorting Your Data Frame

# Using order() to sort the data frame by qsec (increasing)
mtcars[order(mtcars$qsec), ]
# Using order() with decreasing = TRUE to sort in decreasing order
mtcars[order(mtcars$qsec, decreasing = TRUE), ]
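
`order()` also accepts more than one column, using later columns to break ties in earlier ones. The sketch below sorts by number of cylinders first and then by fuel economy from high to low. Negating a numeric column is one common way to reverse its order. `sorted` is a name chosen for this example.

```r
# Sort by cyl (increasing), then by mpg (decreasing) within each cyl group
sorted <- mtcars[order(mtcars$cyl, -mtcars$mpg), ]

head(sorted[, c("cyl", "mpg")])
```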

Subsetting The Data Frame

# Using the which() function to subset a data frame
index <- which(mtcars$qsec > 17) # the row indexes where the condition is TRUE

mtcars[index,]
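
`which()` is not the only way to subset. As a sketch, a logical condition can go straight inside the brackets, and base R also provides `subset()`. The extra condition on `cyl` is an arbitrary example.

```r
# Same rows, three different ways
by_which   <- mtcars[which(mtcars$qsec > 17), ]
by_logical <- mtcars[mtcars$qsec > 17, ]
by_subset  <- subset(mtcars, qsec > 17)

nrow(by_which)
nrow(by_logical)
nrow(by_subset)

# Combining two conditions with &
slow_four_cyl <- mtcars[mtcars$qsec > 17 & mtcars$cyl == 4, ]
slow_four_cyl
```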

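Running an R Script from the Terminal

Create a script file in the terminal with a text editor such as nano, then paste in the code below:

nano r_workshop.R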


df <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Age = c(25, 30, 35, 40, 22),
  Height = c(5.5, 6.0, 5.8, 5.9, 5.7),
  Score = c(85, 90, 88, 75, 92)
)

# 1. Print the Original Data Frame
print("Original Data Frame:")
print(df)

# 2. Sort Data Frame by Age in Decreasing Order
sorted_by_age <- df[order(df$Age, decreasing = TRUE), ]
print("Data Frame Sorted by Age (Decreasing):")
print(sorted_by_age)

# 3. Sort Data Frame by Score in Increasing Order
sorted_by_score <- df[order(df$Score), ]
print("Data Frame Sorted by Score (Increasing):")
print(sorted_by_score)

# 4. Filter Data Frame to Include Only Rows with Age > 30
filtered_df <- df[df$Age > 30, ]
print("Filtered Data Frame (Age > 30):")
print(filtered_df)

# 5. Add a New Column with a Calculated Value
# Assuming Height is recorded in feet: 1 foot = 30.48 cm
df$Height_in_cm <- df$Height * 30.48
print("Data Frame with New Column (Height in cm):")
print(df)

# 6. Calculate the Mean Score
mean_score <- mean(df$Score)
print(paste("Mean Score:", mean_score))

# 7. Find the Row with the Maximum Score
max_score_row <- df[which.max(df$Score), ]
print("Row with Maximum Score:")
print(max_score_row)


Save the file and run it with the Rscript command (note the lowercase "s"):

Rscript r_workshop.R