UC Davis Distributed Computing with Spark SQL (with Databricks) and Databricks Apache Spark SQL for Data Analysts

swilliamc/SparkSQL

SparkSQL

Distributed Computing with Spark SQL (UC Davis and Databricks)

Week 1: 101 Introduction to Spark and Queries in Spark SQL

Week 2: 102 Spark Core Concepts and Spark Internals

Week 3: 103 Engineering Data Pipelines

Week 4: 104 Machine Learning Applications of Spark and Linear Regression/Logistic Regression Classifier

Logistic Regression Classifier Machine Learning Assignment (with Python Sklearn)


Databricks Apache Spark SQL for Data Analysts

W1 Introduction

W2 Big Data and Apache Spark

W3 Spark SQL on Databricks, Data Visualization, and Exploratory Data Analysis

W4 Spark SQL Powered Queries and Spark User Interface

W5 Managing Nested Data Structures, Manipulating Data, and Data Munging

W6 Higher Order Functions, Aggregating and Summarizing, Partitioning Tables, and Sharing Insights

W7 Modern Data Storage and Using Delta Lake

W8 Building and Maintaining Delta Tables, Managing Records in Delta Tables, and Delta Engine Optimization

W9 SQL Coding Challenges
