>>Click and Go to blogpost on Medium
In this project, I tried to analyze Airbnb Istanbul data and answer some questions about the data.
Project's scope is explaratory data analysis with visuals and maps, applying statistical test to data to answer the questiones correctly.
All the analyses are can be found on Medium Post : https://semihdesticioglu.medium.com/airbnb-istanbul-data-analysis-40f52e781dac
- Data Source
- Installation & Libraries
- About the Project
- File descriptions
- Results
- Example Visuals from Project
Thanks to InsideAirbnb for providing this data.
You can clone the repo with the code below.
- Clone the repo:
https://github.com/semihnykv/airbnb_istanbul.git
To be able to run it properly, these libraries should be installed:
- Pandas
- Scipy
- Numpy
- Seaborn
- Matplotlib
- Geopy
- Folium
- Nltk
- Wordcloud
Airbnb Istanbul/
├── airbnb_istanbul_explaratory_data_analyis.ipynb
└── data/
├── airbnb_istanbul.csv
└── reviews.csv
└── figures/
├── wordcloud.png
├── heatmap.png
└── price_ranges_map.png
└── medium_blogpost_snapshot.png
├── Licence
├── Readme
- airbnb_istanbul_explaratory_data_analyis.ipynb - Explaratory Data Analysis Notebook for Airbnb Istanbul Data.
- data/airbnb_istanbul.csv - Listing information of accommodations in Istanbul.
- data/reviews.csv - Reviews information of accommodations in Istanbul.
- figures/wordcloud.png - Wordcloud image for listing descriptions.
- figures/heatmap.png - Heatmap for listings.
- figures/price_ranges_map.png - Scatter plot on map with price range categories.
- figures/medium_blogpost_snapshot.png - Snapshot from Medium blogpost.
- Beyoglu , Sisli, Fatih, Kadikoy and Besiktas have most listing counts, these districts cover 71% of all listings.
- Fatih is the most touristic district area in Istanbul which have most of the sightseeing places and old mosques. Seeing highest hotel rate in this district is not surprising.
- Basaksehir, Bagcilar and Esenyurt has lowest “shared house rate”. In these districts home sharing seems not popular since these areas are more conservative areas.
- Uskudar, Umraniye, Maltepe and Kadıkoy have highest review value score for districts which have more than 50 listings. 95% of the daily price values distributed between 0 and 7276 TL. 103 outlier value is found with 2 std dev. distance from mean.
- Besiktas, Beyoglu, Sisli, Fatih and Kadikoy are evaluated with ANOVA test according to their price mean. At least one group has different mean price according to statistical ANOVA test.
- Continent has an effect on price according to ANOVA test. Rentals located in Europe tend to have higher daily prices. Besiktas, Beyoglu, Sisli, Fatih and Kadikoy are evaluated with ANOVA test according to their price mean. At least one group has different mean price according to statistical ANOVA test. Kadikoy seem to have lower price range, it’s a good option for tourists:)
- Distances to 8 most touristic places are calculated and minimum value is used for minimum distance feature. Correlation test show no direct correlation. If we assign categories to the locations according to their minimum distance, we can infer that at least one group’s price mean is different than others. Box-plot tells us as the distance to touristic areas increases, mean price decreases.