
Set‐up

Minchan Kim edited this page Apr 23, 2024 · 10 revisions

Repository: https://github.com/dsc-courses/bpd-reference

Before you start

GitHub Desktop is highly recommended. It makes it easier to see changes and to push, fork, pull, and commit to a repository. Resolving changes when your local repository is behind can be difficult because of the troubleshooting steps involved (stashing, etc.), but GitHub Desktop provides a clean, understandable interface for handling them.

In addition, pull the repository every time before you work on it (assuming it has already been cloned). Others may have made changes, even to the file you plan to work on (styling, direct code, etc.), so keep your local repo up to date!

Step 1: Setting up local device for development

  1. Set your global Git username: git config --global user.name "Your Name"
  2. Set your global Git email: git config --global user.email "your_email@example.com"
  3. Install Node.js: https://nodejs.org/en/download
  4. Clone the Repository: git clone https://github.com/dsc-courses/bpd-reference.git
  5. Navigate to the repository folder: cd bpd-reference
  6. Set Up the Repository: npm install
  7. Start the localhost development server: npm run start
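Before running the steps above, it can help to confirm that the command-line tools they rely on are installed. The following is a minimal sketch, not part of the repository: the function name find_tools and the tool list are assumptions made for illustration, and only the Python standard library is used.

```python
import shutil

# Command-line tools that Step 1 relies on (git for cloning, node/npm for the site).
REQUIRED_TOOLS = ("git", "node", "npm")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its location on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Any tool reported as NOT FOUND needs to be installed (or added to PATH) before cloning and running npm install.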

Step 2: Understanding the repository

A simplified file structure of the repository:

bpd-reference/
├── .docusaurus          # Build artifacts, caches; DO NOT TOUCH
├── blog                 # NOT USED
├── build                # Production; DO NOT TOUCH
├── components           # React components for displaying bpd data types
│ ├── DataFrameComponent.jsx     # DataFrame component; reads the json
│ └── SeriesComponent.jsx        # Series component; reads the json
├── docs                 # Documentation markdowns; MOST OF THE WORK IS DONE HERE <-----------------IMPORTANT!
│ ├── arrays-and-numpy   # Category of method/function
│ │ ├── category.json    # json file for category; contains info of label, position, description
│ │ └── arr[].md         # Page for arr[]
├── node_modules         # All node packages that the project depends on; DO NOT TOUCH
├── src                  # Contains additional source code
│ ├── css                # Styling
│ │ ├── function.css             # Styling for functions/methods page 
│ │ └── dataframe-styles.css     # Styling for DataFrames; used in DataFrameComponent.jsx
├── static               # Images and SVGs
└── docusaurus.config.js # Configuration for Docusaurus site

As shown above, most of the work is done in the bpd-reference/docs folder. Every page on the front end is generated from a .md file.
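To get an overview of which pages already exist in each category folder, the docs directory can be walked with a few lines of Python. This is only a sketch under the layout described above (one folder per category, one .md file per page); the function name pages_by_category is hypothetical.

```python
from pathlib import Path


def pages_by_category(docs_dir):
    """Return {category folder name: sorted .md file names} for folders in docs_dir."""
    docs = Path(docs_dir)
    if not docs.is_dir():
        return {}  # nothing to list if the folder does not exist
    result = {}
    for folder in sorted(p for p in docs.iterdir() if p.is_dir()):
        # Only .md files become pages; category.json and other files are skipped.
        result[folder.name] = sorted(f.name for f in folder.glob("*.md"))
    return result


if __name__ == "__main__":
    # Assumes the script is run from the bpd-reference folder.
    for category, pages in pages_by_category("docs").items():
        print(category, pages)
```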

Step 3: Understanding .md file structure

Here's bpd-reference/docs/building-organizing/bpd.read_csv().md as an example:

---
sidebar_position: 2                                                        <- position in this directory on the webpage
---

import DataFrameComponent from '../../components/DataFrameComponent.jsx';  <- goes up two directories, then to the specified path
import '../../src/css/function.css';                                       <- goes up two directories, then to the specified path

<code>bpd.read_csv(filepath)</code>                                        <- function we want to show - include the default parameters

<div className='base'>
    <p><strong>Read a comma-separated values (csv) file into DataFrame.</strong></p>                           <- description of function

    <dl>
        <dt className='term'>Input:</dt>                                                                       <- inputs
        <dd className='parameter'>filepath : <em>string, path object, file-like object.</em></dd>
        <dd className='parameter-description'>Any valid string path is acceptable. The string could also be a URL.</dd>

        <dt className='term'>Returns:</dt>                                                                     <- returns
        <dd>df - DataFrame with read csv file.</dd>

        <dt className='term'>Return Type:</dt>                                                                 <- return type
        <dd>DataFrame</dd>
    </dl>
</div>

---

```python                                                                                                      <- code block
pets = bpd.read_csv('pets.csv')
pets
```

<DataFrameComponent data={'{"columns":["Species","Color","Weight","Age"],"index":[0,1,2,3,4,5,6],"data":[["dog","black",40.0,5.0],["cat","golden",15.0,8.0],["cat","black",20.0,9.0],["dog","white",80.0,2.0],["dog","black",25.0,0.5],["hamster","black",1.0,3.0],["hamster","golden",0.25,0.2]]}'} />
                               ↑
                               | DataFrameComponent object that displays DataFrame from json

For the most part, every .md file deployed to the website follows the same structure. If you need to add pages for more functions/methods, copying and pasting a .md file from the same folder provides a good template.
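When a page is copied as a template, its sidebar_position usually needs a new value. The sketch below reads sidebar_position from the front matter at the top of a page and suggests the next free number. It assumes the front matter looks like the example above (a block between two --- lines with an integer value); the function names are hypothetical and not part of the repository.

```python
def read_sidebar_position(page_text):
    """Return the sidebar_position from a page's front matter, or None if absent."""
    lines = page_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no front matter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        key, sep, value = line.partition(":")
        if sep and key.strip() == "sidebar_position":
            return int(value.strip())
    return None


def next_sidebar_position(page_texts):
    """Return one more than the largest sidebar_position found in the given pages."""
    found = [read_sidebar_position(text) for text in page_texts]
    used = [position for position in found if position is not None]
    return max(used, default=0) + 1
```

For example, passing the text of every .md file in a category folder to next_sidebar_position gives a position that places the new page last in that category.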

⚠️⚠️To see changes on localhost:3000, save the file you edited and reload the page. Refer back to item 7 of Step 1 for starting the localhost server.⚠️⚠️

If you need to insert a new DataFrame:

  1. Make sure you have import DataFrameComponent from '../../components/DataFrameComponent.jsx'; at the top of the .md file.
  2. Turn the BabyPandas DataFrame into a usable json string. (Use the helper method df2json(df) defined in the notebook, or df.to_df().to_json(orient='split').)
  3. Copy and paste it into the DataFrameComponent in the corresponding .md file. (e.g. <DataFrameComponent data={'[INSERT JSON]'} />)
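The last step can be sketched in Python. The function below takes data already in the 'split' layout (columns, index, data) and wraps it in the component tag shown in the example page. It is not the notebook's df2json helper, which is not reproduced here; the name dataframe_tag is hypothetical, and the refusal to handle single quotes or backslashes is a simplifying assumption so that the pasted string stays valid inside the single-quoted attribute.

```python
import json


def dataframe_tag(split_data):
    """Build a <DataFrameComponent /> tag from a dict with columns/index/data keys."""
    # Compact separators give a single-line string with no extra spaces.
    payload = json.dumps(split_data, separators=(",", ":"), ensure_ascii=False)
    if "'" in payload or "\\" in payload:
        # These characters would need escaping inside the single-quoted string.
        raise ValueError("payload needs manual escaping before pasting")
    return "<DataFrameComponent data={'" + payload + "'} />"


if __name__ == "__main__":
    pets = {
        "columns": ["Species", "Color"],
        "index": [0, 1],
        "data": [["dog", "black"], ["cat", "golden"]],
    }
    print(dataframe_tag(pets))
```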

If you need to insert a new Series:

  1. Make sure you have import SeriesComponent from '../../components/SeriesComponent.jsx'; at the top of the .md file.
  2. Turn the BabyPandas Series into a pandas Series and run Series.to_json(orient='split') on the Series you want to display. (Or use the helper method s2json(Series, Series_name) defined in the notebook. Series_name refers to the column name used when getting the Series from a DataFrame with df.get(Series_name).)
  3. Copy and paste it into the SeriesComponent in the corresponding .md file. (e.g. <SeriesComponent data={'[INSERT JSON]'} />)
  4. Manually add an extra key-value pair "dtype":"[INSERT TYPE]" after the 'name' key-value pair. (Refer to the SeriesComponent example in bpd-reference/docs/arrays-and-numpy/arr[].md)
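The manual "dtype" step can also be done programmatically. The sketch below takes the JSON string produced for a Series and inserts a "dtype" entry directly after "name". It assumes the string contains a "name" key, as the step above implies; the function name add_dtype is hypothetical and is not the notebook's s2json helper.

```python
import json


def add_dtype(series_json, dtype):
    """Return the Series JSON with "dtype" inserted right after the "name" key."""
    parsed = json.loads(series_json)
    if "name" not in parsed:
        raise ValueError("expected a 'name' key in the Series JSON")
    with_dtype = {}
    for key, value in parsed.items():
        if key == "dtype":
            continue  # drop any existing dtype so the new one is used
        with_dtype[key] = value
        if key == "name":
            with_dtype["dtype"] = dtype
    return json.dumps(with_dtype, separators=(",", ":"))


if __name__ == "__main__":
    print(add_dtype('{"name":"Age","index":[0,1],"data":[5.0,8.0]}', "float64"))
```

The returned string is what gets pasted between the quotes in <SeriesComponent data={'[INSERT JSON]'} />.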

Step 4: Deploying local changes to GitHub Pages

Tip: Unless you are confident in your abilities with git, please use GitHub Desktop to handle merging, stashing, pulling, pushing, and any conflicts that come up.

  1. Add, commit, and push changes to repository.

⚠️⚠️IF YOU ALREADY HAVE AN SSH KEY, SKIP STEPS 2 AND 3⚠️⚠️

  2. Create SSH Key:
  • In the home directory of Terminal (cd), make a new folder (mkdir .ssh).
  • To generate an SSH key pair: ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
  • When prompted to save the key, press enter.
  • Enter a passphrase you will remember (you'll need it later!)
  3. Save the SSH key to GitHub:
  • After generating the key, add the public key to your GitHub account in the "SSH and GPG keys" section of your account settings.
  • Enter a title, keep the key type as Authentication Key, and paste the key you created in the previous step (you can copy it to your clipboard by typing this into Terminal: pbcopy < ~/.ssh/id_rsa_github.pub or pbcopy < ~/.ssh/id_rsa.pub, depending on where you stored your key).
  4. Go back to the bpd-reference directory through Terminal.

  5. Deploy the site: USE_SSH=true npm run deploy

  • This uses your SSH key to deploy. Enter the passphrase you set when creating the SSH key.
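To decide whether the key-creation steps can be skipped, it helps to check whether a public key already exists. The sketch below lists the .pub files in the SSH folder. It assumes keys live in ~/.ssh, as in the steps above; the function name find_public_keys is hypothetical.

```python
from pathlib import Path


def find_public_keys(ssh_dir=None):
    """Return the sorted names of *.pub files in ssh_dir (default: ~/.ssh)."""
    directory = Path(ssh_dir) if ssh_dir is not None else Path.home() / ".ssh"
    if not directory.is_dir():
        return []  # no SSH folder yet, so no keys
    return sorted(p.name for p in directory.glob("*.pub"))


if __name__ == "__main__":
    keys = find_public_keys()
    print(keys if keys else "No public keys found; create one first.")
```

An existing key still has to be added to your GitHub account before it can be used to deploy.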

Troubleshooting

"address already in use :::3000"

  • This means that port 3000 is already being used by a different process. Before proceeding to the next steps, make sure any work in that other process has been dealt with (saved, etc.).
  1. Find the process using the port:
  • Linux/Mac: lsof -i :3000
  • Windows (Command Prompt as admin): netstat -ano | findstr :3000
  2. Kill the process (replace PID with the process ID found in the previous step):
  • Linux/Mac: kill -9 PID
  • Windows: taskkill /PID PID /F
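A quick, platform-independent way to confirm whether the port is busy (before or after killing the process) is to try connecting to it. This is a minimal sketch using only the Python standard library; the function name port_in_use is hypothetical, and it assumes the development server accepts connections on 127.0.0.1.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Port 3000 in use:", port_in_use(3000))
```

This only reports whether the port is taken. The lsof and netstat commands above are still needed to find the PID of the process holding it.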