Skip to content

Commit

Permalink
fixes to 3.*
Browse files Browse the repository at this point in the history
  • Loading branch information
Giovanni Colavizza authored and Giovanni Colavizza committed Jul 15, 2022
2 parents 53b80c0 + 86606f8 commit d77769c
Show file tree
Hide file tree
Showing 9 changed files with 355 additions and 1,837 deletions.
44 changes: 44 additions & 0 deletions notebooks/1.4_solution.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
This is one possible solution to the exercise contained at the end of notebook "1.4 Skills Pandas":

```python
import pandas as pd
# Read the data
csv_path = "https://dataspace.princeton.edu/bitstream/88435/dsp01jm214s28p/2/SCoData_books_v1.2_2022-01.csv"
# yes, in pandas you can read a csv file directly from a URL without having to
# download it manually first.
scodata = pd.read_csv(csv_path)
# How many records?
print(scodata.shape[0])
# Select columns
scodata_copy = scodata[['uri','format','borrow_count']]
# Drop NaN values in `format` column
scodata_copy = scodata_copy.dropna(subset=['format'])
print(scodata_copy.shape[0])
# Most borrowed
# alternative method is to use `scodata.borrow_count.max()`
times_most_borrowed = sorted(
scodata.borrow_count.value_counts().keys().to_list()
)[-1]
# Format(s) of the most borrowed documents
scodata_copy[
scodata_copy.borrow_count==times_most_borrowed
].format.to_list()
# Least borrowed
# alternative method is to use `scodata.borrow_count.min()`
times_least_borrowed = sorted(
scodata.borrow_count.value_counts().keys().to_list()
)[0]
# Format(s) of the least borrowed documents
scodata_copy[
scodata_copy.borrow_count==times_least_borrowed
].format.unique().tolist()
```
Loading

0 comments on commit d77769c

Please sign in to comment.