cassandra-ETL is an ETL project for an imaginary startup called Sparkify. This project creates a NoSQL keyspace using Apache Cassandra.
For the purposes of this project, the data is inserted from a CSV file that contains all the information we need. The app lets the analytics team find out which songs users are listening to.
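As a rough illustration of the steps described above, here is a minimal sketch of creating a keyspace and reading rows from a CSV file. The keyspace name `sparkify`, the host `127.0.0.1`, the helper names `create_keyspace` and `read_rows`, and the sample column names are all assumptions made for this example; the notebook may do this differently.

```python
import csv
import io

# Hypothetical keyspace name; the notebook may use a different one.
KEYSPACE = "sparkify"

# Single-node replication settings, assumed here for a local setup.
CREATE_KEYSPACE_CQL = (
    f"CREATE KEYSPACE IF NOT EXISTS {KEYSPACE} "
    "WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1}"
)


def create_keyspace(host="127.0.0.1"):
    """Connect to a Cassandra node (assumed local) and create the keyspace."""
    # Imported here so the CSV part of the sketch runs without the driver.
    from cassandra.cluster import Cluster

    cluster = Cluster([host])
    session = cluster.connect()
    session.execute(CREATE_KEYSPACE_CQL)
    session.set_keyspace(KEYSPACE)
    return cluster, session


def read_rows(csv_file):
    """Yield each CSV line as a dict keyed by the header row."""
    yield from csv.DictReader(csv_file)


# Tiny in-memory sample standing in for the real CSV file (made-up values).
sample = io.StringIO("artist,song,sessionId\nArtist A,Song A,1\n")
rows = list(read_rows(sample))
```

With a Cassandra node running, `cluster, session = create_keyspace()` would give you a session bound to the keyspace; call `cluster.shutdown()` when you are done.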
Libraries you need to install (I recommend using pip install):
- cassandra-driver
- json (part of the Python standard library, so no installation is needed)
- pandas
- numpy
This project was created using Jupyter Notebook, so first make sure that you have the tools needed to open .ipynb files.
You can find all the code for the ETL pipeline in the file Project_1B_Cassandra.ipynb.
The purpose of this project is to create tables based on 3 queries that the analytics team needs. In the notebook you will find the CREATE, INSERT, and SELECT statements for each of those 3 queries.
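To show what "a table based on a query" can look like, here is a minimal sketch for one made-up query: "which songs were played in a given session?". The table name `songs_by_session`, its columns, the CSV column names, and the helpers `to_params` and `load_and_query` are assumptions for illustration only, not the actual queries or tables from the notebook. The partition key is chosen to match the column used in the WHERE clause.

```python
# Hypothetical table for the made-up query "songs played in a given session".
# session_id is the partition key because the query filters on it.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS songs_by_session (
    session_id int,
    item_in_session int,
    artist text,
    song text,
    PRIMARY KEY ((session_id), item_in_session)
)
"""

# cassandra-driver uses %s placeholders for non-prepared statements.
INSERT_CQL = (
    "INSERT INTO songs_by_session (session_id, item_in_session, artist, song) "
    "VALUES (%s, %s, %s, %s)"
)

SELECT_CQL = "SELECT artist, song FROM songs_by_session WHERE session_id = %s"


def to_params(row):
    """Convert one CSV row (a dict of strings) into typed INSERT parameters."""
    # The CSV column names used here are assumed, not taken from the real file.
    return (
        int(row["sessionId"]),
        int(row["itemInSession"]),
        row["artist"],
        row["song"],
    )


def load_and_query(session, rows, session_id):
    """Create the table, insert every row, then run the SELECT."""
    session.execute(CREATE_TABLE_CQL)
    for row in rows:
        session.execute(INSERT_CQL, to_params(row))
    return list(session.execute(SELECT_CQL, (session_id,)))


# Made-up sample row, only to show the conversion step.
params = to_params(
    {"sessionId": "1", "itemInSession": "0", "artist": "Artist A", "song": "Song A"}
)
```

Here `session` is expected to be a cassandra-driver session already bound to the keyspace. The same create, insert, select pattern would repeat for each query, with a different primary key per table.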
Feel free to ask questions about the code!