This repository stores mini data projects 1 and 2 for the STAT 545A class, prepared by Berke UCAR. It mainly covers data exploration, cleaning, tidying, model fitting, and reading and writing external files for the steam_games dataset from the datateachr package. It also contains the dataset selection procedure carried out in Milestone 1 before steam_games was chosen.
In Milestone 1, several datasets were explored and one of them was selected for further analysis. Milestone 1 contains the dataset graphing and exploration steps, and it also sets the direction for Milestone 2.
In Milestone 2, the selected dataset is studied in more depth. Summary statistics and graphs are used to answer the 4 research questions defined when Milestone 1 was submitted. This milestone also includes dataset tidying approaches, and it demonstrates model fitting and prediction for one of the research questions. Finally, it shows how to write a csv file and how to write and read an rds file.
This repository contains 3 folders at its top level:
- mini-project-1: Contains the files for Milestone 1 - mini-project-1.md, mini-project-1.rmd, mini-project-1_files, README.md
- mini-project-2: Contains the files for Milestone 2 - mini-project-2.md, mini-project-2.rmd, mini-project-2_files, README.md
- output: Contains the csv and rds outputs produced by knitting mini-project-2.rmd - output_q41.csv, output_q42.rds, README.md
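The two files in the output folder can be loaded back into R with base functions. The following is a minimal sketch, not code taken from the project: the helper name `read_outputs` is hypothetical, and it assumes that the working directory is the repository root unless another folder is passed in.

```r
# Hypothetical helper (not part of the repository): load both output files.
# `dir` is assumed to be the folder holding output_q41.csv and output_q42.rds.
read_outputs <- function(dir = "output") {
  csv_path <- file.path(dir, "output_q41.csv")
  rds_path <- file.path(dir, "output_q42.rds")

  # Fail early with a clear error if either file is missing.
  stopifnot(file.exists(csv_path), file.exists(rds_path))

  list(
    q41 = read.csv(csv_path),  # base R reader; returns a data.frame
    q42 = readRDS(rds_path)    # restores the single R object saved in the file
  )
}
```

Calling `read_outputs()` from the repository root should return a list whose `q41` element is a data frame and whose `q42` element is whatever object was saved to the rds file.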
To engage with this repository, simply open either mini-project-1/mini-project-1.md or mini-project-2/mini-project-2.md. These files are the GitHub markdown versions of the originally prepared mini-project-1/mini-project-1.rmd and mini-project-2/mini-project-2.rmd files. To see the code underlying mini-project-1/mini-project-1.md, open mini-project-1/mini-project-1.rmd; the same applies to the Milestone 2 files. The repository also contains the mini-project-1/mini-project-1_files and mini-project-2/mini-project-2_files folders, which hold the plots of mini-project-1/mini-project-1.md and mini-project-2/mini-project-2.md respectively. The plots can be inspected in more detail from these folders.
To run mini-project-1/mini-project-1.rmd locally, open the file in RStudio and knit the document using the Knit button at the top of the editor. Each knit generates a new mini-project-1/mini-project-1.md file and a new mini-project-1/mini-project-1_files folder containing that file's plots. To preserve the output of a previous run, temporarily rename the .rmd file before knitting. The same applies to the Milestone 2 .rmd file. In addition, knitting mini-project-2/mini-project-2.rmd overwrites the files in the output folder, so copy the current files from the output folder to another directory if you want to keep them. The output folder contains output_q41.csv, the answer to question 4.1, and output_q42.rds, the answer to question 4.2 of Milestone 2.
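The backup step described above can also be done from the R console. The sketch below is only an illustration under stated assumptions: `backup_outputs` is a hypothetical helper that is not part of the repository, and the timestamped folder name is an arbitrary choice. The commented lines at the end assume that the rmarkdown package is installed and that the paths inside the .rmd file resolve the same way as when knitting from RStudio.

```r
# Hypothetical helper (not part of the repository): copy the current contents
# of the output folder into a timestamped sibling folder, so that the next
# knit of mini-project-2.rmd cannot overwrite them.
backup_outputs <- function(dir = "output") {
  stopifnot(dir.exists(dir))

  # For example "output_backup_20240101-120000"; the naming is an assumption.
  backup_dir <- paste0(dir, "_backup_", format(Sys.time(), "%Y%m%d-%H%M%S"))
  dir.create(backup_dir)

  # Copy everything found in `dir` (files and any sub-folders) into the backup.
  files <- list.files(dir, full.names = TRUE)
  copied <- file.copy(files, backup_dir, recursive = TRUE)
  stopifnot(all(copied))

  backup_dir
}

# Possible usage from the repository root:
# backup_outputs("output")
# rmarkdown::render("mini-project-2/mini-project-2.rmd")
```

Keeping the backup in a separate folder rather than renaming files in place leaves the output folder exactly as the .rmd file expects it.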
Each README.md file describes the directory in which it is located.