Project for the 2020-2021 NTUA ECE class "Advanced Databases".

SkourtsidisGiorgos/Big_Data_NTUA

Big_Data_NTUA

Project for the 2020-2021 NTUA ECE class "Advanced Databases". Team members: Skourtsidis Giorgos, Fivos Kalogiannis

  • Students had to write queries using both of PySpark's interfaces, RDD and SparkSQL. The SQL queries had to be run on both CSV and Parquet files, and the results of all variants compared.

  • In part B, we had to implement two distributed join algorithms (repartition join and broadcast join) and compare their results. We also had to experiment with Spark's optimizer for join queries.
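To illustrate the difference between the two join algorithms named in part B, here is a minimal single-process sketch in plain Python. It is not the project's code: the function names, the `num_partitions` parameter, and the sample `movies`/`ratings` records are all hypothetical, and the "shuffle" is only simulated with in-memory buckets instead of a real Spark cluster.

```python
from collections import defaultdict


def repartition_join(left, right, num_partitions=4):
    """Reduce-side join sketch: route both inputs by key hash, then join per partition."""
    # Simulated shuffle: each partition maps key -> (left values, right values).
    partitions = [defaultdict(lambda: ([], [])) for _ in range(num_partitions)]
    for tag, dataset in ((0, left), (1, right)):
        for key, value in dataset:
            # The tag (0 = left, 1 = right) records which input the value came from.
            partitions[hash(key) % num_partitions][key][tag].append(value)

    # "Reduce" phase: for every key, pair each left value with each right value.
    result = []
    for part in partitions:
        for key, (left_vals, right_vals) in part.items():
            result.extend((key, (l, r)) for l in left_vals for r in right_vals)
    return result


def broadcast_join(large, small):
    """Map-side join sketch: hash the small input once, probe it while scanning the large one."""
    # In a real cluster this table would be shipped to every worker.
    table = defaultdict(list)
    for key, value in small:
        table[key].append(value)
    # No shuffle of the large input is needed; each record is joined where it sits.
    return [(key, (lv, sv)) for key, lv in large for sv in table.get(key, [])]


# Hypothetical sample data: (movie_id, title) and (movie_id, rating).
movies = [(1, "A"), (2, "B"), (3, "C")]
ratings = [(1, 4.5), (1, 3.0), (3, 5.0), (4, 2.0)]

rep = sorted(repartition_join(ratings, movies))
bro = sorted(broadcast_join(ratings, movies))
print(rep == bro)
```

Both functions compute the same inner join; they differ in data movement. The repartition join moves every record of both inputs, while the broadcast join only copies the small input, which is why it tends to pay off when one side fits in memory. In Spark SQL, the optimizer's choice between the two is influenced by settings such as `spark.sql.autoBroadcastJoinThreshold`; whether the project tuned that particular setting is not stated here.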
