kennycontreras/cassandra-ETL

cassandra-ETL

cassandra-ETL is an ETL project for an imaginary startup called Sparkify. This project creates a NoSQL keyspace using Apache Cassandra.
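
As a rough illustration of the keyspace step, here is a minimal sketch using cassandra-driver. The keyspace name `sparkify`, the `SimpleStrategy` replication settings, and the host `127.0.0.1` are assumptions for a local single-node setup and are not taken from the notebook.

```python
def build_create_keyspace_cql(keyspace: str, replication_factor: int = 1) -> str:
    """Return a CREATE KEYSPACE statement using SimpleStrategy replication."""
    # Keyspace names cannot be bound as query parameters, so validate the name.
    if not keyspace.isidentifier():
        raise ValueError(f"invalid keyspace name: {keyspace!r}")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        "WITH REPLICATION = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}}"
    )


def create_keyspace(keyspace: str, host: str = "127.0.0.1"):
    """Connect to a Cassandra node and create the keyspace (needs a running node)."""
    # Imported here so the statement builder above works without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster([host])
    session = cluster.connect()
    session.execute(build_create_keyspace_cql(keyspace))
    session.set_keyspace(keyspace)
    return cluster, session
```

With a local node running, `cluster, session = create_keyspace("sparkify")` would return an open session bound to the new keyspace. Call `cluster.shutdown()` when you are done.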

For this project, we insert the data from a CSV file that contains all the information we need. The app lets the analytics team find out which songs users are listening to.
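
The CSV-to-Cassandra step could look roughly like the sketch below. The table name `song_plays_by_session`, the column names, and the sample row are all hypothetical. The real CSV header and schema in the notebook may differ.

```python
import csv
import io

# Hypothetical table and column names, used only for illustration.
INSERT_CQL = (
    "INSERT INTO song_plays_by_session (session_id, item_in_session, artist, song) "
    "VALUES (%s, %s, %s, %s)"
)


def rows_from_csv(handle):
    """Yield one parameter tuple per CSV row, casting the numeric columns to int."""
    for row in csv.DictReader(handle):
        yield (
            int(row["session_id"]),
            int(row["item_in_session"]),
            row["artist"],
            row["song"],
        )


def load_csv(session, path):
    """Insert every CSV row through an open cassandra-driver session."""
    with open(path, newline="", encoding="utf-8") as handle:
        for params in rows_from_csv(handle):
            session.execute(INSERT_CQL, params)


# Made-up sample data, parsed in memory so no file or database is needed.
sample = io.StringIO(
    "session_id,item_in_session,artist,song\n"
    "1,0,Example Artist,Example Song\n"
)
params = list(rows_from_csv(sample))
```

Keeping the parsing in `rows_from_csv` separate from the database call makes it easy to check the type conversions before touching Cassandra.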

Prerequisites

The project uses the following libraries. I recommend installing them with pip install.

  • cassandra-driver
  • json (part of the Python standard library, no installation needed)
  • pandas
  • numpy
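
To check whether the libraries above are available before opening the notebook, a small script along these lines may help. It is only a sketch. Note that it uses import names, so cassandra-driver appears as `cassandra`.

```python
from importlib.util import find_spec

# Import names, not pip package names: cassandra-driver is imported as "cassandra".
REQUIRED_MODULES = ["cassandra", "json", "pandas", "numpy"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```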

Getting Started

This project was created with Jupyter Notebook, so first make sure you have the tools needed to open .ipynb files.

You can find all the code for the ETL pipeline in the file Project_1B_Cassandra.ipynb.

The purpose of this project is to create tables based on 3 queries that the analytics team needs. You will find all the CREATE, INSERT, and SELECT statements based on those 3 queries.
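
To illustrate what "tables based on queries" means in Cassandra, here is a sketch for one hypothetical query: "which artist and song were played at a given position in a given session?" The query, the table name `song_plays_by_session`, and the columns are assumptions for illustration, not the actual 3 queries from the notebook. The primary key is chosen so that the WHERE clause filters only on key columns.

```python
# Hypothetical table: session_id is the partition key and item_in_session is a
# clustering column, which matches the WHERE clause of the query below.
CREATE_TABLE_CQL = (
    "CREATE TABLE IF NOT EXISTS song_plays_by_session ("
    "session_id int, item_in_session int, artist text, song text, "
    "PRIMARY KEY ((session_id), item_in_session))"
)

SELECT_CQL = (
    "SELECT artist, song FROM song_plays_by_session "
    "WHERE session_id = %s AND item_in_session = %s"
)


def run_query(session, session_id, item_in_session):
    """Create the table if needed and return the matching rows as a list."""
    session.execute(CREATE_TABLE_CQL)
    return list(session.execute(SELECT_CQL, (session_id, item_in_session)))
```

Here `session` is an open cassandra-driver session that is already bound to a keyspace. A different query would usually get its own table with a primary key shaped around that query's filters.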

Feel free to ask questions about the code!
