Fix broken headings in Markdown files #27

Open
wants to merge 1 commit into
base: master
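
The change in this diff is mechanical: every heading whose leading `#` run is followed directly by text gets a single space inserted after it. The page does not show how the edit was produced, so the following is only a minimal sketch of one way to do it. The function names `fix_heading` and `fix_markdown` are hypothetical and are not part of this repository.

```python
import re

# Matches a run of 1-6 leading '#' characters that is followed directly by
# a character that is neither '#' nor whitespace (the missing-space case).
_HEADING = re.compile(r"^(#{1,6})(?=[^#\s])")


def fix_heading(line: str) -> str:
    """Insert one space after the leading '#' run if it is missing."""
    return _HEADING.sub(r"\1 ", line)


def fix_markdown(text: str) -> str:
    """Apply fix_heading to every line; other lines pass through unchanged."""
    return "\n".join(fix_heading(line) for line in text.split("\n"))


if __name__ == "__main__":
    sample = "#[fit] Databases, SQL, and Pandas\n\n# Rahul Dave\n\n##Lectures"
    print(fix_markdown(sample))
```

This sketch treats every line that starts with `#` as a heading, so it would also rewrite such lines inside fenced code blocks (for example shell comments); a real tool would need to skip those.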
38 changes: 19 additions & 19 deletions Lectures/Lecture4/dbpandasql.md
@@ -1,14 +1,14 @@
autoscale:true

#[fit] Databases, SQL, and Pandas
# [fit] Databases, SQL, and Pandas

#cs109, Fall 2015 (#cs109)
# cs109, Fall 2015 (#cs109)

# Rahul Dave

`[email protected]`, @rahuldave, `[email protected]`

#ANNOUNCEMENTS
# ANNOUNCEMENTS

Class in in Science Center B starting THIS thursday, 17th Sep, 2015!

@@ -27,7 +27,7 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!
![fit](venn.png)

---
#[fit]Data Scientist: Sexiest Job of the 21st Century
# [fit]Data Scientist: Sexiest Job of the 21st Century

>It’s important that our data team wasn’t comprised solely of mathematicians
>and other “data people.” It’s a fully integrated product group that includes
@@ -42,7 +42,7 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!

---

#[fit] DATA ENGINEERING
# [fit] DATA ENGINEERING

- **compute**: code, python, R, julia, spark, hadoop
- **storage/database**: git, SQL, NoSQL, HBase, disk, memory
@@ -53,7 +53,7 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!

---

#What kind of data storage do you need?
# What kind of data storage do you need?

- **memory**
- **disk**: what if we do not fit?
@@ -63,7 +63,7 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!

---

#What kind of data access do you need?
# What kind of data access do you need?

- **relational**: pandas, SQL: Postgres, sqlite, Hbase, VoltDB
- **document oriented**: MongoDB, CouchDB
@@ -72,7 +72,7 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!

---

#Today we'll focus on relational
# Today we'll focus on relational

- What is a relational Database?
- What Grammar of Data does it follow?
@@ -81,16 +81,16 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!

---

#[fit]Relational Database
# [fit]Relational Database


##_Dont say_: seek 20 bytes onto disk and pick up from there. The next row is 50 bytes hence
## _Dont say_: seek 20 bytes onto disk and pick up from there. The next row is 50 bytes hence

##_Say_: select data from a set. I dont care where it is, just get the row to me.
## _Say_: select data from a set. I dont care where it is, just get the row to me.

---

#[fit]Relational Database(contd)
# [fit]Relational Database(contd)

- A collection of tables related to each other through common data values.
- Rows represent attributes of something
@@ -102,15 +102,15 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!

![fit](contributors.png)

##Contributors
## Contributors

![90%, left](candidates.png) Candidates

---

![fit](scales.png)

#[fit]Scales of Measurement
# [fit]Scales of Measurement

- Quantitative (Interval and Ratio)
- Ordinal
@@ -119,7 +119,7 @@ Class in in Science Center B starting THIS thursday, 17th Sep, 2015!
[^3]: S. S. Stevens, Science, New Series, Vol. 103, No. 2684 (Jun. 7, 1946), pp. 677-680

---
#[fit]Grammar of Data
# [fit]Grammar of Data

Been there for a while (SQL, Pandas), formalized in `dplyr`[^4].

@@ -130,7 +130,7 @@ Been there for a while (SQL, Pandas), formalized in `dplyr`[^4].
[^4]: Hadley Wickham: https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html

---
#Why bother
# Why bother

- learn hot to do core data manipulations, no matter what the system
- relational databases critical for mon-memory fits. Big installed base.
@@ -139,13 +139,13 @@ Been there for a while (SQL, Pandas), formalized in `dplyr`[^4].
---
![fit](sqlexecution.png)

#[fit]GO TO NOTEBOOK[^5]
# [fit]GO TO NOTEBOOK[^5]

[^5]: Diagram from 7 databases in 7 weeks, Pragmatic Programmers

---

#RDBMS when:
# RDBMS when:

- data structure regularity is known
- transactions are required
@@ -155,4 +155,4 @@ Been there for a while (SQL, Pandas), formalized in `dplyr`[^4].

---

#[fit]FIN
# [fit]FIN
8 changes: 4 additions & 4 deletions README.md
@@ -1,16 +1,16 @@
# 2015

##Lectures
## Lectures

The Lecture slides up to Lecture 1 are in this repository. Just click on the Lectures Folder.

##Lab 1
## Lab 1

https://github.com/cs109/2015lab1

The git lab can be read [here](https://github.com/cs109/2015lab1/blob/master/Lab1-git.ipynb).

##HW 0
## HW 0

You can read it [here](https://github.com/cs109/2015lab1/blob/master/hw0.ipynb).

@@ -26,7 +26,7 @@ For this reason, we have included HW0 in the lab link above.

At that point, homework repositories will be created for you.

##Initial Workflow
## Initial Workflow

- read hw 0, and do the survey and installations
- then read the git lab in lab 1 linked from within.