The maximum score was 12 points, with one point given per subtask. The points were converted to a total score between 0 and 100%, and grades were set according to the grading scale below:
- A = 92 - 100
- B = 77 - 91
- C = 58 - 76
- D = 46 - 57
- E = 40 - 45
- F = 0 - 39
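
As a rough illustration, the conversion from points to a letter grade could be sketched in Python as below. The function names `to_percent` and `to_grade` are hypothetical, and the handling of fractional percentages is an assumption: the scale only lists whole numbers, so this sketch compares the unrounded percentage against the lower bound of each grade. The rounding actually used in the grading is not stated here.

```python
# Lower bound (in percent) of each grade, taken from the grading scale above.
GRADE_LOWER_BOUNDS = [("A", 92), ("B", 77), ("C", 58), ("D", 46), ("E", 40), ("F", 0)]

MAX_POINTS = 12  # one point per subtask


def to_percent(points, max_points=MAX_POINTS):
    """Convert a point score to a percentage between 0 and 100."""
    if not 0 <= points <= max_points:
        raise ValueError(f"points must be between 0 and {max_points}")
    return 100 * points / max_points


def to_grade(percent):
    """Return the first grade whose lower bound the percentage reaches.

    Assumption: the unrounded percentage is compared to the lower bounds,
    so a score between two bands (e.g. 91.67%) gets the lower grade.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for grade, lower_bound in GRADE_LOWER_BOUNDS:
        if percent >= lower_bound:
            return grade


# Print the grade for every possible point score.
for points in range(MAX_POINTS + 1):
    percent = to_percent(points)
    print(f"{points:2d} points = {percent:5.1f}% -> {to_grade(percent)}")
```

Under this assumption, 11 of 12 points (91.67%) falls just below the A band; with rounding to the nearest whole percent it would reach 92 instead, so the choice matters at the band edges.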
Course material for GRA 4157 - (Big) Data Curation, Pipelines, and Management.
04-10-2024 - Mid-term exam (40%) 09:00 - 11:00. Room D3-141. Technical knowledge, concepts from programming with data.
07-11-2024 - The final exam (60%) is a written report based on two group presentations (1 - 3 per group) during the semester.
Lectures will be held every Friday 12:00 - 13:45 between August 23rd and November 8th. You may contact me at [email protected].
https://rl.talis.com/3/binorway/lists/4D39CD33-F47E-E95D-1F5B-0511BBC9B6BF.html
Part 1
- Basic Python lists, dictionaries and operations.
- Reading from and writing to files, flexible solutions.
- Numerical Python with NumPy: arrays and array slicing for vectorized computations.
- Code standards, version control and code-collaboration.
Part 2
- Working with the pandas library
- Reading data from websites
- Data visualisation
Part 3
- Cleaning data, combining data sets
- Machine learning workflows with scikit-learn
- Assessing machine learning models under various assumptions about the data (outliers etc.)
For a given lecture, the reading gives an approximate overview of what you are expected to know after the lecture. I expect you to solve the exercises after the lecture. Each week, we start the lecture with a student presentation of an exercise of choice. Send an email to [email protected] to volunteer for an exercise. For exercises on pandas, we refer to w3resource (W3): https://www.w3resource.com/python-exercises/pandas/index-dataframe.php
Date | Topic | Reading | Exercises | Student presentation |
---|---|---|---|---|
Aug. 23 | Course Introduction. Python recap, lists and dictionaries. Testing. Decorators. | Sundnes: Chap 1,2,3 (and 7) | Sundnes: 2.7, 2.8, 2.9, 2.15, 2.18, 3.3, 3.6, 3.17 | |
Aug. 30 | Reading and writing to file. User input. Exceptions. More on command line arguments | Sundnes: Chap 5 | Sundnes: 4.4, 4.9, 4.10, 4.12, 4.13, 4.17, 4.23 | Yulin Vera: 2.15 |
Sep. 06 | Numerical Python and plotting | Sundnes: Chap 6 | Sundnes: 5.1, 5.2, 5.3, 5.4, 5.10, 5.12, 5.14, 5.28, 5.46, 5.54 | Shan Xu: 4.4 Bohdan: 4.23 |
Sep. 13 | Pandas | McKinney: Chap 5 | W3: DataFrames: 2.-22., 73 | Yurou 5.2 Nhung: 5.46 |
Sep. 20 | Web scraping | McKinney: Chap 6 | W3: Pandas Performance: 1.-20. (select 5-10 exercises) + GitHub Exercises. Note: Some changes were made to the exercises on Sep. 24 | |
Sep. 27 | GitHub, Pipelines, GitHub Actions | | | Selena: 1 Ái Linh, Eirik: 2 Ilia: 3 Narges: 4 Johannes: 4 |
Oct. 1 | Q & A Mid-term 08:00 - 09:45 | Previous lectures | Room C2-055 | |
Oct. 04 | Mid-term 09:00 - 11:00 | Room D3-141 | ||
Oct. 11 | Machine learning part 1 | Project 1 | ||
Oct. 18 | Group presentations | Project 2 | ||
Oct. 25 | Machine learning part 2 | |||
Nov. 01 | Group presentations | |||
Nov. 08 | Final lecture | | | |