Stochastic Average Gradient

Background

Mark Schmidt proposed the Stochastic Average Gradient (SAG) algorithm as a fast solver for smooth convex optimization problems on finite data sets. His C/MATLAB code implements SAG for L2-regularized logistic regression, a convex optimization problem explained in detail in Chapter 4 of Elements of Statistical Learning. Training a conditional random field is also a convex optimization problem; it is explained in detail in Chapter 5 of An Introduction to Conditional Random Fields.

Related work

glmnet(family="binomial", alpha=0) uses coordinate descent to solve L2-regularized logistic regression.
optimx is a general function optimizer which could be used to solve L2-regularized logistic regression.

Project ideas

Fork https://github.com/tdhock/SAG and write the SAG R package:

- Convert Mark's C code with "mex.h" headers to C code with "R.h" headers, for the three SAG methods (SAG, SAGlineSearch, SAG_LipshitzLS).
- Convert Mark's documentation comments in the C code to .Rd files, possibly generated by inlinedocs.
- Write examples/vignettes using Mark's rcv1_train and covtype.libsvm data sets that
  - show how these three solvers can be used,
  - compare with the results of glmnet/optimx.
- Write tests that make sure the R package
  - gets the right answer (gradient with norm close to zero),
  - gets the same answer as glmnet/optimx.
- Write a SAG_CRF.update function for conditional random field training, and
  - show how these solvers can be used in CRF training,
  - compare with the result of CRF::crf.update.

Skills required

R package and C code development.

Mentor

Please get in touch with John Nash and Toby Dylan Hocking as soon as possible.

Tests

After completing your tests, please post a link to your files below.

Easy: use glmnet to fit an L2-regularized logistic regression model. Use the system.time function to record how much time it takes for several data set sizes, and make a plot that shows how execution time depends on the data set size.
Medium: create a simple R package with one function and one documentation file, and upload it to your GitHub account.
Harder: write an R package which uses .C to interface C code.

A solution to the tests

The RGallery package by Eric Xin Zhou is provided as a solution to these three test questions.

A proposed solution to the tests

The gpuClassifieR package by Ishmael B., a multi-class linear classifier with a gradient descent trainer implemented in R, C and CUDA, was developed as a solution to the proposed tests.
