"R/RStudio Beginner Workshop"
INSTALL
If you don't have R and RStudio yet, follow the link below to install both.
https://posit.co/download/rstudio-desktop/
INTRODUCTION TO BASICS
Arithmetic with R
# An addition
10 + 7
# A subtraction
10 - 5
# A multiplication
3 * 5
# A division
(5 + 5) / 2
# Exponentiation
2^5
# Modulo
28 %% 6
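As a small sketch of how the modulo operator relates to integer division and how parentheses change the order of operations. The variable names `quotient` and `remainder` are made up for this illustration.

```r
# Hypothetical variable names, used only for this illustration
quotient  <- 28 %/% 6   # integer division: 4
remainder <- 28 %% 6    # remainder: 4

# The two results rebuild the original number
6 * quotient + remainder   # 28

# Multiplication runs before addition unless parentheses say otherwise
2 + 3 * 4     # 14
(2 + 3) * 4   # 20
```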
Variables
x <- 17
x
BASIC DATA TYPES IN R
Numerics & Integers
- Decimal values like 1.7 are called numerics.
- Whole numbers like 13 are called integers. Integers are also numerics. Note that R stores a plain 13 as a numeric; the L suffix (13L) makes it an integer.
a <- 1.7
b <- 13L
# We can check the data type with the class() function
class(a)
class(b)
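To see the difference between the two numeric types side by side, here is a minimal sketch. The names `n_dbl` and `n_int` are hypothetical and only serve this example.

```r
# Hypothetical names, chosen only for this illustration
n_dbl <- 13    # no suffix: stored as a double
n_int <- 13L   # L suffix: stored as an integer

class(n_dbl)        # "numeric"
class(n_int)        # "integer"
is.numeric(n_int)   # TRUE: integers also count as numeric
```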
Logical Operations
z <- a > b
z
class(z)
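Comparisons like the one above return logical values, and these can be combined. A short sketch with the AND, OR, and NOT operators follows; `lhs` and `rhs` are made-up names for this example.

```r
# Hypothetical names, used only for this illustration
lhs <- TRUE
rhs <- FALSE

lhs & rhs   # AND: FALSE
lhs | rhs   # OR: TRUE
!lhs        # NOT: FALSE

# Comparisons can be combined the same way
(3 > 2) & (2 == 2)   # TRUE
```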
Character
- Text (or string) values are called characters.
k <- "BMG Lab"
k
x <- as.character(x)
x
fname <- "John"
lname <- "Doe"
paste(fname, lname)
?paste
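The help page for paste() describes its `sep` argument and the related paste0() function. A minimal sketch of both, plus two other common character helpers; `first_name` and `last_name` are hypothetical names for this example.

```r
# Hypothetical names, used only for this illustration
first_name <- "John"
last_name  <- "Doe"

paste(first_name, last_name, sep = "_")   # "John_Doe"
paste0(first_name, last_name)             # "JohnDoe" (no separator)
nchar(first_name)                         # 4
toupper(last_name)                        # "DOE"
```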
VECTORS
Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data. In other words, a vector is a simple tool to store data.
In R, you create a vector with the combine function c(). You place the vector elements between the parentheses, separated by commas.
# Creating vectors
vector1 <- c(7, 13, 17, 20)
vector2 <- c("a", "b", "c", "d")
vector3 <- c(10, 20, 30, 40)
You can find out the length of a vector with the length() function.
length(vector1)
# Combining vectors
c(vector1, vector2)
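A vector holds a single data type, so combining numbers with text converts everything to character. The sketch below illustrates this with made-up vectors `nums` and `chars`.

```r
# Hypothetical vectors, used only for this illustration
nums  <- c(7, 13)
chars <- c("a", "b")

mixed <- c(nums, chars)
mixed          # "7" "13" "a" "b"
class(mixed)   # "character": the numbers were converted to text
```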
Arithmetic with vectors
vector1
vector3
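# Multiply each element by 3
vector1 * 3
# Subtract 5 from each element
vector1 - 5
# Add two vectors element by element
vector1 + vector3
2 * (vector1 + vector3) / 7
# Round the result to 2 decimal places with round()
round(2 * (vector1 + vector3) / 7, 2)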
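The examples above use vectors of equal length. As a hedged sketch of what happens otherwise: R repeats ("recycles") the shorter vector to match the longer one. The names `long_vec` and `short_vec` are hypothetical.

```r
# Hypothetical vectors, used only for this illustration
long_vec  <- c(1, 2, 3, 4)
short_vec <- c(10, 20)

# short_vec is reused as 10, 20, 10, 20
long_vec + short_vec   # 11 22 13 24

# Summary functions reduce a vector to a single value
sum(long_vec)    # 10
mean(long_vec)   # 2.5
```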
Vector Indexing
In R, indexing starts at 1, unlike Python, where it starts at 0.
d <- c("A", "B", "C", "D", "E", "F")
d[2]
# Selecting a range of indexes
d[2:4]
# Selecting values with multiple indexes
d[c(1,3,5)]
d[c(4,2,5)]
# Selecting values with logical indexing
d[c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE)]
# Excluding a value with a negative index (d itself is not modified)
d[-3]
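Logical indexes are usually produced by a condition rather than typed by hand. The sketch below shows that pattern and also how to drop an element permanently by reassigning. The `scores` vector is made up for this example.

```r
# Hypothetical data, used only for this illustration
scores <- c(55, 82, 91, 47, 76)

scores > 70           # FALSE TRUE TRUE FALSE TRUE
scores[scores > 70]   # 82 91 76

# A negative index returns a copy without that element
scores[-1]            # 82 91 47 76
length(scores)        # 5: the original is unchanged

# Reassign to drop the element for good
scores <- scores[-1]
length(scores)        # 4
```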
Naming A Vector
y <- c("Duygu", "KEREMİTÇİ")
y
names(y) <- c("First", "Last")
y
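Once a vector has names, its elements can be selected by name instead of by position. A minimal sketch, using a made-up `person` vector:

```r
# Hypothetical vector, used only for this illustration
person <- c(first = "Ada", last = "Lovelace")

person["last"]     # keeps the name attached to the value
person[["last"]]   # "Lovelace": the bare value
names(person)      # "first" "last"
```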
MATRIX
In R, a matrix is a collection of elements of the same data type (numeric, character, or logical) arranged into a fixed number of rows and columns.
You can construct a matrix in R with the matrix() function.
A <- matrix(c(5, 26, 13, 6, 7, 17), # the data
            nrow = 2,               # number of rows
            ncol = 3,               # number of columns
            byrow = TRUE)           # fill the matrix by row
A
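The byrow argument decides the order in which the data fills the matrix. As a sketch of the difference, the same values are arranged both ways below; `vals`, `by_col`, and `by_row` are hypothetical names.

```r
# Hypothetical names, used only for this illustration
vals <- c(5, 26, 13, 6, 7, 17)

by_col <- matrix(vals, nrow = 2, ncol = 3)                 # default: fill by column
by_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE)   # fill by row

by_col[1, ]   # 5 13 7
by_row[1, ]   # 5 26 13
dim(by_row)   # 2 3 (rows, columns)
```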
Naming A Matrix
rownames(A) <- c("x","y")
colnames(A) <- c("a","b","c")
A
Selecting Values with Index
#A[row, column]
A[1, 3]
#1st row
A[1, ]
#2nd column
A[,2]
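A named matrix can also be indexed by row and column names. The sketch below builds its own small matrix `m` (a hypothetical name) so it runs on its own, and shows the drop = FALSE argument, which keeps a single row as a matrix instead of a vector.

```r
# Hypothetical matrix, used only for this illustration
m <- matrix(c(5, 26, 13, 6, 7, 17), nrow = 2, byrow = TRUE,
            dimnames = list(c("x", "y"), c("a", "b", "c")))

m["y", "b"]        # 7
m[, c("a", "c")]   # columns a and c, still a matrix

# Selecting one row returns a plain vector by default
dim(m[1, ])                 # NULL
# drop = FALSE keeps the matrix shape
dim(m[1, , drop = FALSE])   # 1 3
```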
Transpose The Matrix
#Creating a new matrix
B <- matrix(c(2,4,6,8,10,12),
nrow = 3,
ncol = 2)
B
t(B)
Combining The Matrices
#Adding a column or multiple columns
cbind(A, t(B))
#Adding a row or multiple rows
rbind(A, t(B))
Arithmetic With Matrices
A * 2
A + t(B)
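The operators above work element by element. For the matrix product from linear algebra, R has a separate operator, %*%. A minimal sketch contrasting the two, with made-up matrices `P` and `Q`:

```r
# Hypothetical matrices, used only for this illustration
P <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
Q <- matrix(c(0, 1, 1, 0), nrow = 2, byrow = TRUE)

# Element-wise multiplication
elementwise <- P * Q     # rows: 0 2 / 3 0

# Matrix multiplication
product <- P %*% Q       # rows: 2 1 / 4 3

elementwise
product
```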
DATA FRAMES
In basic terms, a data frame is a two-dimensional array-like structure where:
- Each column represents a variable and can contain data of different types (numeric, character, factor, etc.).
- Each row represents an observation or a record.
name <- c("John", "Josh", "Mark")
age <- c(32, 29, 25)
n <- c(FALSE, TRUE, FALSE)
# Using the data.frame() function to create a data frame
df <- data.frame(name, age, n)
df
# mtcars is a built-in dataset that ships with R
mtcars
# Show the first and last six rows
head(mtcars)
tail(mtcars)
Investigate the structure of a data frame
str(mtcars)
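Besides str(), a few other base functions give a quick overview of a data frame. A short sketch on the built-in mtcars dataset:

```r
# Number of rows and columns
dim(mtcars)     # 32 11
nrow(mtcars)    # 32
ncol(mtcars)    # 11

# Column names and the class of each column
names(mtcars)
sapply(mtcars, class)

# Minimum, quartiles, mean, and maximum of one column
summary(mtcars$mpg)
```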
Selecting Values From Data Frame
#using indexes
#df[row, column]
mtcars[13,7]
# Using the names of columns and rows
mtcars["Merc 450SL", "qsec"]
# Getting all the values in a row by name
mtcars["Merc 450SL", ]
# Getting all the values in a column by name
mtcars[["qsec"]]
mtcars$qsec
# Selecting the first 5 values of a column
mtcars[1:5, "qsec"]
Sorting your data frame
#using order() to sort the df
mtcars[order(mtcars$qsec), ]
#using order() to sort the df - decreasing
mtcars[order(mtcars$qsec, decreasing = TRUE), ]
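order() also accepts several sort keys, with later keys breaking ties in earlier ones. As a sketch, the code below sorts by cyl ascending and then by mpg descending (a minus sign reverses a numeric key). The name `by_cyl_mpg` is hypothetical.

```r
# Sort by cylinders (ascending), then by mpg (descending) within each group
by_cyl_mpg <- mtcars[order(mtcars$cyl, -mtcars$mpg), ]

# Show only the two sort columns to check the result
head(by_cyl_mpg[, c("cyl", "mpg")])
```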
Subsetting The Data Frame
# Using the which() function to subset a data frame
index <- which(mtcars$qsec > 17) # row indexes where the condition is TRUE
mtcars[index,]
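which() is one option; a logical condition can also go straight into the brackets, and base R's subset() function offers a shorter spelling. A sketch combining two conditions follows; the names `fast_four` and `same_rows` are made up for this example.

```r
# Logical condition directly inside the brackets, combined with &
fast_four <- mtcars[mtcars$qsec > 17 & mtcars$cyl == 4, ]

# subset() lets you write column names without the mtcars$ prefix
same_rows <- subset(mtcars, qsec > 17 & cyl == 4)

# Both approaches select the same cars
nrow(fast_four) == nrow(same_rows)

# select = keeps only the listed columns
subset(mtcars, qsec > 17 & cyl == 4, select = c(mpg, qsec))
```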
Terminal Rscript
Create the script file in a terminal editor such as nano, then paste the code below into it.
nano r_workshop.R
df <- data.frame(
Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
Age = c(25, 30, 35, 40, 22),
Height = c(5.5, 6.0, 5.8, 5.9, 5.7),
Score = c(85, 90, 88, 75, 92)
)
# 1. Print the original data frame
print("Original Data Frame:")
print(df)
# 2. Sort Data Frame by Age in Decreasing Order
sorted_by_age <- df[order(df$Age, decreasing = TRUE), ]
print("Data Frame Sorted by Age (Decreasing):")
print(sorted_by_age)
# 3. Sort Data Frame by Score in Increasing Order
sorted_by_score <- df[order(df$Score), ]
print("Data Frame Sorted by Score (Increasing):")
print(sorted_by_score)
# 4. Filter Data Frame to Include Only Rows with Age > 30
filtered_df <- df[df$Age > 30, ]
print("Filtered Data Frame (Age > 30):")
print(filtered_df)
# 5. Add a New Column with a Calculated Value
# Assumption: Height is in feet, so the conversion factor is 30.48 cm per foot
df$Height_in_cm <- df$Height * 30.48
print("Data Frame with New Column (Height in cm):")
print(df)
# 6. Calculate the Mean Score
mean_score <- mean(df$Score)
print(paste("Mean Score:", mean_score))
# 7. Find the Row with the Maximum Score
max_score_row <- df[which.max(df$Score), ]
print("Row with Maximum Score:")
print(max_score_row)
Rscript r_workshop.R