git-STAA-577

Slides, code, cheat sheets, and RStudio lab notebooks for the "Applied Machine Learning" course, Spring 2019


Project maintained by stufield Hosted on GitHub Pages — Theme by mattgraham

GitHub Repository for STAA 577

Overview

RStudio lab notebooks, full R code, cheat sheets, resources, and ad hoc notes from the “Applied Machine Learning” course, Spring 2019.


Why use GitHub?

We have decided to place the course materials in a GitHub repository:

  1. to familiarize you with this widely used collaborative coding tool
  2. so that you will have access to them beyond your tenure at CSU, when you venture into the official job market.

Jenny Bryan and Jim Hester summarize the benefits of GitHub in this fantastic reference:

Happy Git and GitHub for the useR

If you ever plan to use version control with GitHub, I strongly recommend reading it in detail.


Course Lab Content

Datasets for STAA 577

Cheatsheets

Previewing HTML on GitHub

Sad But True

Stu’s Looping Rules for R

  1. Always use a vectorized solution over iteration when possible, otherwise … go to #2.
  2. Use a functional, usually one of the apply() family or a loop-wrapper function, because R is a functional language and functionals are more readable, unless …
    • modifying in place: if you are modifying or transforming certain subsets (columns) of a data frame.
    • recursive problems: whenever an iteration depends on the previous iteration, a loop is better suited because each call made by a functional runs in its own scope and cannot directly modify variables outside it, so the previous result is not readily available.
    • while loops: in problems where it is unknown how many iterations will be performed, while-loops are well suited and preferred over a functional.
  3. If you must use a loop, ensure the following:
    • Initialize new objects: prior to the loop, allocate the necessary space ahead of time. Do NOT “grow” a vector on-the-fly within a loop (this is terribly slow).
    • Optimize operations: do NOT perform operations inside the loop that could be done either up front or applied in a vectorized fashion following the loop. Enter the loop, do the bare minimum, then get out.
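
The rules above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data, the variable names (`x`, `df`, `smoothed`, `steps`), and the choice of exponential smoothing as the recursive problem are all hypothetical and do not come from the course materials.

```r
# Hypothetical data for illustration; none of these names come from the course.
x <- c(1.5, 2, 3.25, 4)

# Rule 1: vectorized solution, no explicit iteration.
sq_vec <- x^2

# Rule 2: a functional. vapply() also checks that each result is one double.
sq_fun <- vapply(x, function(v) v^2, numeric(1))

# Exception, modifying in place: rescale selected columns of a data frame.
df <- data.frame(id = 1:4, a = x, b = rev(x))
for (col in c("a", "b")) {
  df[[col]] <- df[[col]] / max(df[[col]])
}

# Exception, recursive problem: exponential smoothing, where each value
# depends on the previous one.
alpha <- 0.5
decay <- 1 - alpha                # Rule 3: computed once, outside the loop
smoothed <- numeric(length(x))    # Rule 3: preallocate, do not grow
smoothed[1] <- x[1]
for (i in seq_along(x)[-1]) {
  smoothed[i] <- alpha * x[i] + decay * smoothed[i - 1]
}

# Exception, unknown number of iterations: halve until the value drops below 1.
value <- 100
steps <- 0
while (value >= 1) {
  value <- value / 2
  steps <- steps + 1
}
```

Here `sq_vec` and `sq_fun` hold the same values, which is the point of rule 1: when a vectorized form exists, the functional adds nothing. The smoothing loop fills a vector allocated before the loop and keeps `1 - alpha` outside it, following both bullets of rule 3.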

https://github.com/topepo

Modeling Framework (thx Max Kuhn)

Memory Usage and rsample:

The rsample package is smarter than you might think.

Vignettes

What is the Tidyverse?

Information about the:


Created on 2019-01-27 by Rmarkdown (v1.11) and R version 3.5.2 (2018-12-20).