To upload to Leanpub:
http://help.leanpub.com/en/articles/2916385-getting-started-using-git-and-github-writing-mode
CH-01 - Base R
CH-02 - R Studio
CH-03 - Data-Driven Docs
CH-04 - Markdown
CH-05 - Learning to Program
CH-06 - [Getting Help]
CH-07 - Navigating R JAMISON
CH-08 - TI 2.0 - R as Calculator
CH-XX - Objects and Assignment?
CH-09 - Functions and Dysfunctions
CH-10 - [Scripts]
CH-11 - Intro to Vectors
CH-12 - Data Types
CH-13 - Generating Vectors
CH-14 - Modifying Vectors
CH-15 - Logical Statements
CH-16 - Summarizing Vectors
CH-16 - Dataframes
CH-17 - Matrices and Lists
CH-18 - Inhale
CH-19 - Exhale
CH-20 - Choppy chop
CH-21 - Joiny join
what can we do with data? (create, destroy, combine, re-organize)
(groupy group) ?
- group_by + summarize
- group_by + transform + ungroup
CH-22 - Summarizing Columns
CH-23 - Summarizing Groups
CH-XX - Transform with window functions ?
By the end of 080 and 090 they should be able to do most of: https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
Or push some of these to DS II (data wrangling course)?
CH-24 - Principles of Visual Communication
CH-25 - Base R Graphics
CH-26 - Color
CH-27 - Customization
CH-28 - ggplot
CH-29 - groups
CH-30 - themes
CH-31 - web graphics
CH-32 - Shiny
CH-33 - Dashboards
- Observations vs Variables (rows vs columns)
- Vector Types
- Numeric
- Character
- Factors (ordered vs unordered)
- Logical (true/false)
- Checking Vector Types
- Built-In Vectors: e.g. LETTERS
- Generating Vectors
- Missing Values and Non-Numbers
- Empty vectors: NULL
- Defining factors, relevel()
- Recoding Values
- Find and replace
- Variable Transformations
- Vectorized addition
- Defining new vector as function of others: ifelse(), gsub(), [] <-
- Casting
- Implicit Casting (coercion)
- Set theory as categories and membership
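
A minimal sketch of the vector topics above (types, missing values, vectorized math, recoding, factors, coercion). The data values and object names are hypothetical, chosen only for illustration.

```r
# Hypothetical example data; none of these values come from the outline.
x   <- c(10, 20, NA, 40)
grp <- c("low", "high", "low", "high")

# Checking vector types and missing values.
class(x)      # "numeric"
class(grp)    # "character"
is.na(x)      # FALSE FALSE TRUE FALSE

# Vectorized addition: the operation applies to every element at once.
x.plus <- x + 1

# Three ways to define a new vector from an existing one.
size      <- ifelse(x > 15, "big", "small")   # a missing input stays missing
grp.clean <- gsub("low", "lo", grp)           # find and replace
x.fixed   <- x
x.fixed[is.na(x.fixed)] <- 0                  # recode with [] <-

# Factors: set the reference level explicitly with relevel().
f <- relevel(factor(grp), ref = "low")
levels(f)     # "low" "high"

# Implicit casting: mixing types coerces everything to character.
mixed <- c(1, "a", TRUE)
class(mixed)  # "character"
```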
- Logical Operators
- equal
- not equal
- greater than or less than
- opposite of
- Compound Statements: AND and OR
- Casting logical vectors
- Algebra with logical vectors
- Defining groups
- from categorical variables
- from numeric variables
- missing values as a group
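
A short sketch of the logical-statement topics above, using made-up `age` and `city` vectors (both hypothetical). It shows comparison operators, compound statements, arithmetic on logical vectors, and one possible way to treat missing values as their own group.

```r
# Hypothetical example data.
age  <- c(25, 41, NA, 67, 33)
city <- c("A", "B", "A", "B", "A")

is.a   <- city == "A"     # equal
not.a  <- city != "A"     # not equal
over40 <- age > 40        # greater than; NA where age is missing
flip   <- !is.a           # opposite of

# Compound statements.
a.and.over40 <- is.a & over40
a.or.over40  <- is.a | over40

# Casting and algebra: TRUE counts as 1, FALSE as 0.
as.numeric(is.a)
n.a    <- sum(is.a)       # how many cases are in city A
prop.a <- mean(is.a)      # what share of cases are in city A

# Defining groups from a numeric variable, with missing values as a group.
age.group <- ifelse(age > 40, "over 40", "40 or under")
age.group[is.na(age)] <- "missing"
table(age.group)
```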
- Creating data frames from vectors
- the $ operator
- Checking and changing class types
- Filter rows and select columns
- Reorder rows or columns
- CSV vs RDS formats
- Matrix
- Lists
- Building data objects:
- data.frame() vs cbind() and rbind()
- Transformations of Datasets
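
A possible illustration of the data frame topics above, written in base R. The `name` and `score` vectors are hypothetical.

```r
# Hypothetical example vectors.
name  <- c("ann", "bob", "cat")
score <- c(90, 75, 82)

# Creating a data frame from vectors.
dat <- data.frame(name, score, stringsAsFactors = FALSE)

# The $ operator reads or creates a column.
dat$passed <- dat$score >= 80

# Checking and changing class types.
class(dat$score)                     # "numeric"
dat$score <- as.integer(dat$score)

# Filter rows and select columns, then reorder rows.
high   <- dat[dat$score > 80, c("name", "score")]
sorted <- dat[order(dat$score), ]

# cbind() on mixed vectors builds a matrix, so everything is coerced
# to a single type, unlike data.frame(), which keeps one type per column.
m <- cbind(name, score)
typeof(m)                            # "character"
```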
Data wrangling is the process of preparing data for analysis, which includes reading data into R from a variety of formats, cleaning data, tidying datasets, creating subsets and filters, transforming variables, grouping data, and joining datasets.
The goal of data wrangling is to create a rodeo dataset (clean and well-structured) that is ready for modeling and visualization.
CH 09 – Getting Data into R [ tutorial ]
- Read options
- Copy and paste from Excel
- Using rdata format
- Read from csv or tsv
- Read text files
- Import from Excel
- Import from common format (foreign package)
- Import from the web (RCurl)
- Import from GitHub
- Import from DropBox
- APIs
- Census
- Socrata
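
A minimal sketch of reading delimited text, assuming small temporary files created on the fly so the example is self-contained. The file contents are hypothetical, and the URL in the final comment is a placeholder rather than a real dataset.

```r
# Write two tiny files to a temporary location (hypothetical contents).
csv.path <- tempfile(fileext = ".csv")
writeLines(c("id,city", "1,Tempe", "2,Mesa"), csv.path)

tsv.path <- tempfile(fileext = ".tsv")
writeLines(c("id\tcity", "1\tTempe", "2\tMesa"), tsv.path)

# Read from csv or tsv.
d1 <- read.csv(csv.path, stringsAsFactors = FALSE)
d2 <- read.delim(tsv.path, stringsAsFactors = FALSE)

# Read a text file line by line.
raw.lines <- readLines(csv.path)

# read.csv() also accepts a URL in place of a file path, for example:
# read.csv("https://example.com/data.csv")   # placeholder URL, not run
```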
CH 10 - Saving Data [ tutorial ]
- Write options
- CSV
- R Data Sets (RDS)
- CSV vs RDS
- Tables
- RData Format
- SPSS or Stata
- Copy to Clipboard
- Copy to Excel
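
One way to show the CSV vs RDS trade-off is a round trip through both formats. This is a sketch with a hypothetical data frame: a CSV is plain text readable by other programs but does not store R-specific details such as factor level order, while an RDS file restores the object as it was saved.

```r
# Hypothetical data with a factor whose level order matters.
dat <- data.frame(
  id    = 1:3,
  level = factor(c("lo", "hi", "lo"), levels = c("lo", "hi"))
)

csv.path <- tempfile(fileext = ".csv")
rds.path <- tempfile(fileext = ".rds")

write.csv(dat, csv.path, row.names = FALSE)
saveRDS(dat, rds.path)

from.csv <- read.csv(csv.path)
from.rds <- readRDS(rds.path)

# The custom level order is lost in the CSV round trip,
# but preserved in the RDS round trip.
levels(from.csv$level)
levels(from.rds$level)   # "lo" "hi"
```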
- Subset operator
- By index, including order / match
- By logical
- Recycling
- Subset by row -- dplyr::filter()
- Indices
- Selector Vectors
- Subset by column -- dplyr::select()
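
A sketch of the subsetting topics above in base R, with hypothetical data. The dplyr calls named in the outline appear only as comments, since the sketch assumes no packages are installed.

```r
# Hypothetical named vector.
x <- c(a = 5, b = 2, c = 9, d = 4)

# By index.
x[2]
x[c(1, 3)]
by.order <- x[order(x)]                  # sorted smallest to largest
pos      <- match(c("c", "a"), names(x)) # positions of "c" and "a"

# By logical (a selector vector).
big <- x[x > 4]

# Recycling: a short logical vector repeats to the length of x.
every.other <- x[c(TRUE, FALSE)]

# Rows and columns of a data frame.
dat  <- data.frame(id = 1:4, val = c(5, 2, 9, 4))
rows <- dat[dat$val > 4, ]               # like dplyr::filter(dat, val > 4)
cols <- dat[, "val", drop = FALSE]       # like dplyr::select(dat, val)
```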
- merge and match
- join in dplyr
- inner, outer, right, left
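
The four join types can be sketched with base R's `merge()`. The two data frames are hypothetical, and dplyr's join functions would be an alternative.

```r
# Hypothetical tables sharing an id column.
left  <- data.frame(id = c(1, 2, 3), name  = c("ann", "bob", "cat"))
right <- data.frame(id = c(2, 3, 4), score = c(75, 82, 60))

inner      <- merge(left, right, by = "id")                # ids in both
outer      <- merge(left, right, by = "id", all = TRUE)    # ids in either
left.join  <- merge(left, right, by = "id", all.x = TRUE)  # all left ids
right.join <- merge(left, right, by = "id", all.y = TRUE)  # all right ids

# match() gives the row position of each right id in the left table,
# or NA when there is no match.
pos <- match(right$id, left$id)
```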
- Counting things: sum( logical statement )
- Categorical data: tables
- Missing values
- prop.table() and margin.table()
- Numeric data: min, max, mean, summary / quantile
- Missing values
- All at once: summary + data.frame / matrix
- Creating tables of descriptives: factors vs numeric
- table( f1, f2 ), ftable( row.vars=c("f1","f2"), col.vars="f3" )
- Function over groups: tapply( v1, f1 ) or dplyr::group_by() + summarise()
- Functions over levels of numeric data: tapply( v1, cut(v2, breaks) )
- tapply( v1, INDEX=list(f1,f2) ) or dplyr::group_by() + summarise()
- aggregate( dat, by=list(f1), FUN )
- https://cran.r-project.org/web/packages/DescTools/vignettes/DescToolsCompanion.pdf
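
A sketch of the summary tools listed above, using a small hypothetical data frame that borrows the `f1`, `f2`, `v1` names from the outline. Note how each numeric summary has to handle the missing value explicitly.

```r
# Hypothetical data: two grouping variables and one numeric variable.
dat <- data.frame(
  f1 = c("a", "a", "b", "b", "b"),
  f2 = c("x", "y", "x", "y", "y"),
  v1 = c(10, 20, 30, NA, 50)
)

# Counting things: sum of a logical statement.
n.big <- sum(dat$v1 > 15, na.rm = TRUE)

# Categorical data: tables, row proportions, and margins.
t   <- table(dat$f1, dat$f2)
p   <- prop.table(t, margin = 1)
row.totals <- margin.table(t, 1)

# Numeric data.
avg <- mean(dat$v1, na.rm = TRUE)
summary(dat$v1)

# Function over groups, one factor and then two.
by.f1    <- tapply(dat$v1, dat$f1, mean, na.rm = TRUE)
by.f1.f2 <- tapply(dat$v1, list(dat$f1, dat$f2), mean, na.rm = TRUE)

# The same group means as a data frame.
agg <- aggregate(dat["v1"], by = list(f1 = dat$f1), FUN = mean, na.rm = TRUE)
```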
- Ground, figure, narrative (context, subject, action)
- Tufte’s rules
- Visual tragedies
- Defining a canvas: xlim, ylim
- Adding data
- Type (point, line, both)
- Symbols
- Color
- Size
- Adding grids
- Adding axes
- Adding titles / axes labels
- Adding data labels: text()
- Margins
- Colors and color functions
- Custom fonts / math symbols
- Multiple Plots (core graphics)
- Custom graph layouts
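
A rough sketch of building a base R plot layer by layer, following the steps above. The data, colors, and title are hypothetical, and `pdf(NULL)` is an assumption made so the code runs without opening a window; drop that line and `dev.off()` to draw on screen.

```r
pdf(NULL)                        # assumption: draw to a null device

# Hypothetical data.
x <- 1:5
y <- c(2, 4, 3, 6, 5)

# Margins and a one-row, two-column layout.
par(mar = c(4, 4, 2, 1), mfrow = c(1, 2))

# Define an empty canvas, then add each element in turn.
plot(x, y, type = "n", xlim = c(0, 6), ylim = c(0, 7),
     axes = FALSE, xlab = "", ylab = "")
grid()
points(x, y, type = "b", pch = 19, col = "steelblue", cex = 1.5)
axis(side = 1)
axis(side = 2)
title(main = "Built by hand", xlab = "x", ylab = "y")
text(x, y, labels = y, pos = 3)  # data labels above each point

# Second panel: the same data as a plain line.
plot(x, y, type = "l", col = "firebrick")

dev.off()
```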
- Grammar of graphics concept
- ggplot overview
CH 19 - R shiny [ tutorial ]
- What makes documents dynamic?
- Widgets
- input objects
- Render functions
- reactive
CH 20 - flexdashboards [ overview ]
- Principles of good dashboard design
- Layouts
- Sidebars
- Value boxes
- CSS basics
- What is R? [ video ]
- How do Packages Work?
- Tour of R Studio
- Navigation in R Studio
- Style Guides
- Data-Driven Documents [ splainer ]
- The Importance of Reproducibility
- RMD in RStudio
- Headers
- Chunks
- Knitting
- Pimp my RMD
- Assignment
- Mathematical Operators
Functions [ chapter ]
- Input-Output Devices
- Object-Oriented Coding
- Arguments
- Values
- Return
- Scripts
- Navigation (working directories, list objects, create folders)
- Reading Help Files
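
A small sketch covering assignment, mathematical operators, a function with arguments and a return value, and basic navigation. The `rescale` function and the folder name are hypothetical examples, not part of any package.

```r
# Assignment and mathematical operators.
x <- 7
y <- 3
x + y
x / y
x ^ 2
remainder <- x %% y     # modulo
whole     <- x %/% y    # integer division

# A function: arguments go in, a value is returned.
# 'lower' and 'upper' have default values, so they are optional.
rescale <- function(v, lower = 0, upper = 1) {
  scaled <- (v - min(v)) / (max(v) - min(v))
  return(lower + scaled * (upper - lower))
}

r1 <- rescale(c(2, 4, 6))
r2 <- rescale(c(2, 4, 6), upper = 10)

# Navigation: working directory, objects in memory, new folders.
getwd()
ls()
new.dir <- file.path(tempdir(), "demo-folder")
dir.create(new.dir)

# Reading help files (interactive, so shown as comments):
# ?mean
# help("mean")
args(mean)
```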