Introduction to Machine Learning

This repository contains the slides for the "Introduction to machine learning" course. See also the moodle page.

Contributing

We welcome new contributors. If you contribute something, please feel free to add your name to the team.

Git-Workflow

  • The master branch is protected. Please create your own issue- or task-specific branch off the master branch to work in, and open a pull request once you're done.
  • Make many small, focused, single-issue commits with descriptive commit messages. Each commit message should refer to the issue it addresses or fixes, i.e. include something like addresses #<issuenumber>, closes #<issuenumber> or similar, where applicable.
  • We generally work based on the feature branch workflow.
  • The person who merges a pull request adds a note to the changelog if the changes are substantial.
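
The workflow above can be tried out safely in a throwaway repository. The following is only a sketch: the branch name, the file name and the issue number 42 are hypothetical, and the sandbox stands in for a clone of the real repository.

```shell
# Sketch of the feature-branch workflow in a throwaway sandbox repository.
# Branch name, file name and issue number 42 are made up for illustration.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the main branch "master", as in this repo
git config user.name "Example Contributor"
git config user.email "contributor@example.org"
git commit -q --allow-empty -m "Initial commit"

# 1. Branch off master for one specific issue.
git checkout -q -b fix-typo-issue-42 master

# 2. Make one small, focused change and commit it, referencing the issue.
echo "corrected text" > slides-example.tex
git add slides-example.tex
git commit -q -m "Fix typo in example slides, closes #42"

# 3. In a real clone you would now push the branch and open a pull request:
#    git push -u origin fix-typo-issue-42

current_branch=$(git rev-parse --abbrev-ref HEAD)
last_message=$(git log -1 --format=%s)
echo "$current_branch: $last_message"
```

In a real clone only steps 1 to 3 are needed; everything before step 1 just sets up the sandbox.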

Slides

  • Notation on the slides uses latex-math. Please do read the accompanying ReadMe and clone latex-math into this repo (otherwise you will not be able to render the slides).
  • Use the commands defined there; don't define your own.
  • If you have to introduce new notation/symbols, add them to latex-math after double-checking that
    • it is consistent with what we already have
    • you do not overwrite symbols we have already defined differently
  • We write slides for beginners: keep it simple, keep it short
  • We try to keep slides modular: slidesets should represent about 15-20 minutes of material and be moderately self-contained.
  • Don't put code on the slides: the theory is orthogonal to issues of implementation (... in theory ...). Code is strictly for exercises/practice sessions.
  • Compiling the slides should be done via the Makefile: just type make all in the specific folder and it will render all slidesets in the folder, or make <SLIDES>.pdf to render a specific file <SLIDES>.tex.
  • make will automatically move a copy of the compiled PDFs to the slides-pdf directory. From there, files can be copied into the course website repository in case of a new release. If you use Windows, we recommend that you access make via the Ubuntu bash (take a look at the installation tips).
  • We try to keep a "dependency graph" between slide sets up to date so that it's easier to keep track of what material needs to be understood before what else. Please do add appropriate %! includes:-comments in your slides to keep this up-to-date, see also attic/slide-dependencies.R and slides/slide-dependencies.pdf.
  • We recommend using {tinytex} (install via tinytex::install_tinytex()).
  • Use make install in the slides folder to automagically install all the R packages you'll need for the slides, demos and exercises. See also attic/install.R

Figures Used in the Slides

  • Figures not produced by us are added to the figure_man folder of the respective chapter.
  • R-files which produce figures should be named fig-*.R
    • The basic assumption is that you execute the R-files from the rsrc folder
    • These figure-producing R-files should save their respective figures to ../figure/. From the name of the figure it should be clear which R-file produced it.
    • If you create a new plot or change an existing plot, you need to commit your changes to the R-files as well as the corresponding pdf-files. This means that if you create a new plot, you will have to add the pdf-files with git add -f *.pdf, since pdf-files are ignored in this repo by default.
    • Utility functions used by more than one R-file should be exported to a separate R-file (also located in the respective rsrc folder)
    • Heavy simulations should not be done in the figure-producing R-files. Instead, these files should only load Rdata files that were produced by separate R-files (also located in the rsrc folder).
  • If you replace graphics with new files with a different file name, or if you remove slides with graphics in them, then make sure that you remove unused files. To check if there are unused files in a figure/ or figure_man/-folder, do the following:
    1. Make sure you are in the folder that contains the .tex-files.
    2. Run make most, which re-compiles all .pdf-files while creating a log of what files were used.
    3. Run ../../scripts/check_files_used.sh figure unused slides-*.tex to list all files in the figure/-folder that are unused.
    4. Do the same for the figure_man/-folder: ../../scripts/check_files_used.sh figure_man unused slides-*.tex.
    5. Remove the unused files from git. The easiest way to do this is to use git rm <file>, but you can also delete the file first and then "add the deletion": rm <file> followed by git add <the file that was deleted>. You can then commit.
    6. If you find that you deleted a file that should not have been deleted, you can retrieve it from the git history: through the command line or by browsing the GitHub git history.
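
The git side of the figure rules above (force-adding an ignored PDF, removing an unused file, and recovering it from the history) can be sketched in a throwaway repository. The file names and the one-line .gitignore are assumptions for illustration only; the real repository's ignore rules and the check_files_used.sh script are not reproduced here.

```shell
# Sketch: force-add an ignored PDF, remove it from git, then restore it.
# Runs in a sandbox; file names and the .gitignore rule are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Example Contributor"
git config user.email "contributor@example.org"
echo "*.pdf" > .gitignore                  # assumed rule: PDFs are ignored
mkdir figure rsrc
echo "# would save ../figure/fig-example.pdf" > rsrc/fig-example.R
echo "placeholder plot" > figure/fig-example.pdf

# New figure: the R-file is added normally, the ignored PDF needs -f.
git add .gitignore rsrc/fig-example.R
git add -f figure/fig-example.pdf
git commit -q -m "Add example figure and its R script"

# Step 5: remove the unused file from git and commit the deletion.
git rm -q figure/fig-example.pdf
git commit -q -m "Remove unused figure"

# Step 6: find the commit that deleted the file ...
deleting_commit=$(git rev-list -n 1 HEAD -- figure/fig-example.pdf)
# ... and restore the version from that commit's parent.
git checkout -q "$deleting_commit^" -- figure/fig-example.pdf
restored=$(cat figure/fig-example.pdf)
echo "restored content: $restored"
```

The restored file is staged again right away, so a follow-up commit brings it back into the repository.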

Exercises

  • Exercises are organized chapter-wise. Each folder will contain
    • a subdirectory figure for plots,
    • a subdirectory ex_rnw that contains .Rnw files with single exercises (prefixed with ex_) and associated solutions (prefixed with sol_),
    • one or multiple exercise sheets (prefixed with ex_) and associated solutions (prefixed with sol_), sourcing the single snippets from ex_rnw,
    • a collection file (prefixed with collection_) that assembles all exercises for the given topic (those currently used in the exercise sheets, further existing material, ideas, URLs, ...)
  • Compiling the exercises should be done via the Makefile: just type make all and it will render all exercises, solutions and collection files, or make <FILE>.pdf to render a specific file <FILE>.Rnw.
  • make will automatically move a copy of the compiled ex_ and sol_ PDFs (i.e., those that will appear on the website) to the exercises-pdf directory. From there, files can be copied into the course website repository in case of a new release.
  • When creating new exercise sheets or collection files, please use the setup provided in style/preamble_ueb.Rnw and style/preamble_ueb_coll.Rnw.

Code Snippets

  • Please follow this style guide
  • We write code that is meant to be read/worked on by beginners:
    • simple and legible is better than complex and elegant
    • add a lot of explanatory comments
    • use base-R as much as possible
    • choose variable names and code designs to maximize legibility and comprehension

Code Demos

Code demos are now in /code-demos. The originals are available at this link.

Google Figures

Google Figures are stored in the G-Drive

Creating Lecture Videos

  • Video files should have the same name as the slide set they are narrating.
  • Our videos show the lecturer's head in the bottom right corner
  • Make sure you minimize background noise, have good lighting and do remember to switch off your phone and to sedate or expel your pets / spouses / flatmates / office co-inhabitants for distraction-free recording.
  • Make sure you record in a resolution that's high enough to easily read the slides (at least 1280 x 760, higher is better).
  • We have excellent USB microphones that you can borrow from Bernd's office.
  • Many workflows are possible. Fabian uses:
    • mpv /dev/video0 --framedrop=no --speed=1.01 --window-scale=0.35 --no-border --ontop for a borderless, low latency webcam window and kazam for screen capture.
    • In kazam, don't forget to
      • set preferences to "USB microphone" & set loudness fairly high
      • set the frame rate to 30

Number of slides and length of videos

The number of slides and length of videos can be found here and should be updated regularly (i.e. whenever a new video is published).

Website

The website is updated whenever the master branch is pushed, via the GitHub action Pkgdown. The website is built with pkgdown, configured via _pkgdown.yml; its pages are in /vignettes. The automatic deployment uses a "secret" (see repository settings on GitHub), which is a PAT called DEPLOY_PAT (created by Fabian Scheipl, Jan 30 2020).