
Documents that go into methodological detail regarding various statistical procedures.


m-clark/docs


docs

This repository is essentially a list of other repos, each specific to a document that goes into methodological detail regarding a statistical procedure or that serves as a basis for the workshops I give. It also contains templates, common images, and so forth. The repos themselves house the markdown, code, data, etc. For a bit more description of specific documents, see https://m-clark.github.io/documents.

Some of these documents are treated much like software in development: they are updated at least somewhat regularly, or are in current development. This does not mean they are incomplete; in fact, it's likely they are in a complete state. Other documents are more or less one-offs, or at least there are no current plans to revisit them. If you happen to come across any issues with a doc, feel free to open an issue in the corresponding repository.

Recently updated/created, or in current/continued development

Generalized Additive Models: An introduction to generalized additive models with an emphasis on generalization from familiar linear models and using the mgcv package in R.

Introduction to Machine Learning: A gentle introduction to machine learning concepts with some application in R.

An Introduction to Text Analysis with R: Focuses on handling text in the R environment in a general sense, with specific analytical examples such as sentiment analysis, topic modeling, and more.

Bayesian Basics: A conceptual introduction to Bayesian modeling with examples using R/Stan.

Mixed Models with R: A conceptual introduction to using R for mixed models.

Structural Equation Models: An intro to the approach that attempts to avoid the typical problematic depiction in the social sciences with a notably broader view.

Data Processing and Visualization: Application heavy, covering R structures, I/O, the tidyverse, ggplot2, and more.

FastR: How to make R faster before or irrespective of the machinery used.

BigR: The document is not yet begun, mostly due to ongoing development in the R world in this arena; a repo exists.

Engaging the Web with R: A document on using R for web scraping, extracting data via an API, interactive web-based visualizations, and producing web-ready documents. It serves as an overview of ways one might start to use R for web-based activities, as opposed to a hands-on approach.

No current plans for update except to fix issues

Growth vs. Mixed: Equivalence of the two approaches in a variety of settings.

Topic Modeling: A demonstration of Latent Dirichlet Allocation for topic modeling in R.

Mixed Models Overview: An overview that introduces mixed models for those with varying technical/statistical backgrounds.

Clustered Data Situations: A comparison of standard models, cluster robust standard errors, fixed effect models, mixed models (random effects models), generalized estimating equations (GEE), and latent growth curve models for dealing with clustered data (e.g. longitudinal, hierarchical etc.).

Categorical Regression Models: An overview of regression models for binary, multinomial, and ordinal outcomes, with connections among various types of models.

Latent variables and sum scores: A comparison of two common means of dimension reduction for scale scores, and why the latter can at best be as good as the former, but won't be in practice.

Mixed Models Estimation: Demonstration of mixed models via maximum likelihood and link to additive models.

Mixed Models vs. Growth Curves: A comparison of the mixed model vs. latent variable approach for longitudinal data.

Simulation of the above for small sample size situations.

ANOVA and Mixed Models: A non-technical document to introduce mixed models for those who have used ANOVA.

MCMC algorithms: List of MCMC algorithms with brief descriptions.

Paradoxes: Summary of Pearl’s 2014 and 2013 technical reports on some modeling situations such as Lord’s Paradox and Simpson’s Paradox that lead to surprising results that are initially at odds with our intuition.

Correlation Measures: A summary of relatively recent articles that look at various measures of dependency, including Pearson’s r, Spearman’s rho, and Hoeffding’s D, as well as newer ones such as Distance Correlation and Maximal Information Coefficient.

R for Social Science: I have little to say about this, other than it might help someone new to R.
