In general, I'm trying to keep code, raw data, processed data, results, and images separate. The directories are soft coded, so only two files (the ones in project_logistics) need to be changed to change the project directories and subdirectories.
Smaller bits of analysis that are related (or depend on previous steps) are collected together in a wrapper.
Min-Yang is using RStudio to write Rmd files and its Git version control to commit, push, and pull from GitHub. It works reasonably well. So does GitHub Desktop. You will also need Git installed.
The easiest thing to do is create a new repository using this one as a template. Here's a starting guide. Don't put spaces in the name. This will set up many, but not all, of the folders.
- Open Stata and the do file "/stata_code/project_logistics/folder_setup_globals.do".
- Change the line:
global myprojdir U:/this_project_directory
to your project directory.
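As a rough illustration of why only this one line needs to change, here is a minimal sketch of how a setup do file can build every other path from the project root. The subdirectory global names (my_codedir, my_results, my_images) and the example path are assumptions for illustration; the actual folder_setup_globals.do may use different names and folders.

```stata
* Hypothetical sketch; not the actual contents of folder_setup_globals.do.
* The project root is the only line a new user edits (example path is made up).
global myprojdir "C:/Users/your_name/Documents/my_project"

* Every other location is derived from the root (global names are assumptions).
global my_codedir "${myprojdir}/stata_code"
global my_results "${myprojdir}/results"
global my_images  "${myprojdir}/images"

* Create the output folders; capture suppresses the error if they already exist.
capture mkdir "${my_results}"
capture mkdir "${my_images}"
```

Using forward slashes in the paths works on Windows, Mac, and Linux, which avoids problems with backslashes next to macro names.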
There are two ways to get ready to run the project. The first is a one-time setup:
- Modify or create the profile.do file that Stata runs automatically on startup. I've put mine in c:/ado/profile.do.
Add the following two lines to it:
global user minyangWin
global aceprice full\path\to\folder_setup_globals.do
- Restart Stata.
- Type "do $aceprice".
Everything is set up and ready to go.
The second way is to do this every time you want to work on the project in Stata:
global user <your_user_name>
do "/stata_code/project_logistics/folder_setup_globals.do"
You will have to type the full path in the second line.
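Setting the user global before running the setup file lets one do file serve several people. The sketch below shows one way that could work, assuming the setup file branches on $user. The second user name and both paths are hypothetical, and the real folder_setup_globals.do may handle this differently.

```stata
* Hypothetical sketch: choose the project root based on the user global.
if "$user" == "minyangWin" {
    global myprojdir "U:/this_project_directory"
}
else if "$user" == "another_user" {
    * another_user and this path are placeholders for illustration
    global myprojdir "C:/path/to/project"
}
else {
    display as error "Unknown user: $user. Set the user global first."
    exit 198
}
```

With this pattern, adding a collaborator means adding one more branch rather than editing paths throughout the code.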
As far as I can tell, we need these user-written Stata commands
If you are using RStudio, as long as you use the "Project" feature, you shouldn't have to do much. All you need to do is open R_paths_libraries.R
and change this line of code.
my_projdir<-"path/to/project/directory"
A pair of small do files that set up the folders and then make Stata aware of them.
There is sample code in "data_extraction_processing" that you can use to get deflators. It is run through "/data_extraction_processing/wrapper_external.do". You'll need a FRED API key to use import fred. Extracting OES and QCEW data is really slow.
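For orientation, a deflator pull with Stata's built-in import fred command (Stata 15 or later) might look like the sketch below. The series GDPDEF (the GDP implicit price deflator) is only an example and may not be the series the project code uses; YOUR_API_KEY is a placeholder for your own key.

```stata
* Minimal sketch, not the project's actual extraction code.
* Register your FRED API key for this session (placeholder shown).
set fredkey YOUR_API_KEY

* Download one example series from FRED, replacing the data in memory.
import fred GDPDEF, clear

* Inspect what came back.
describe
```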
The code in here will do a bunch of data exploration. Violin plots take a while to run.
Code for smaller pieces of the project is in individual folders in "stata_code". For the most part, these produce datasets or tables in "/results/" and images in "/images/".