This workshop will help you get data from csv into MotherDuck, and lay out basic patterns for using dbt + MotherDuck in a performant way.
- Create a MotherDuck account.
- Create a database called "stocks_dev" inside of MotherDuck. This can be done with the command
create database stocks_dev;
from the MotherDuck UI. - Fork the
matsonj/stocks
repo in GitHub. - Generate an access token inside of MotherDuck and add it as a codespace secret inside of GitHub.
- Open a codespace on the repo.
- After it loads completely, reload the window in order to make sure the dbt power user extension has access to your md environment.
- Get data from the yahoo finance api by running the following 3 commands:
python3 scripts/get_info.py
python3 scripts/get_options.py
python3 scripts/get_stock_history.py
- Build the data warehouse with
dbt build
in the CLI. - Lastly, plot the results using
python3 viz/line_chart.py
. The webpage will be available at127.0.0.1:8050
. - Alternatively, you can invoke these 3 steps with
make run
.
- Data is extracted from yahoo finance API using python. The scripts run and write out a file to
data
folder with the timestamp in the name for each file.symbols.txt
contains the list of symbols for which to fetch data.get_info.py
gets the company information for each company.get_options.py
gets the currently open options. note: this data is temporal, and thus needs to be snapshotted. This is left as an exercise to the reader.get_stock_history.py
gets the stock price history for the last 30 days.
- dbt creates a list of these files in
files.sql
with the Duckdbglob
function. - for each model -
company_info.sql
options.sql
stock_history.sql
- de-duplicate and load any new files. - for the models in step 3, test to make sure that the primary key is unique.
- create a dataset of closing stock price X outstanding shares over time to estimate Market Cap.
- Plotting is defined in the
viz/line_chart.py
file. It is a set of simple charts usingplotly
anddash
. - You can serve the plots with
python3 viz/line_chart.py
.
- In order to take advantage of the
dbt power user
plugin, you will need to put yourMOTHERDUCK_TOKEN
in your bash profile. Otherwise, all interactions withdbt power user
will hit the login page for MotherDuck.