Dulani/hacky-hour-april-2019

Hacky Hour with Tampa R Users Group

April 2019

Hacky Hour: #TidyTuesday

Join us for an hour (or two) of shared, group hacking on the weekly Twitter phenomenon: #TidyTuesday.

Each week, the @R4DScommunity releases a new data set and data scientists across the globe practice wrangling, visualizing, and modeling data, sharing their results with the community using the #tidytuesday hashtag.

At this week's Tampa R Users Group Meetup, we'll work together in small groups or individually to create and share interesting visualizations using the TidyTuesday Anime dataset released on 2019-04-23.

Meetup Info

🗓 7pm on Tuesday, April 23, 2019
📍 Southern Brewing & Winemaking
🗺 4500 North Nebraska Avenue, Tampa
https://www.meetup.com/Tampa-R-Users-Group/events/260640070/

What to bring?

  • 🤓 Your inquisitive self

  • 💻 A laptop

  • 💾 The data (see below)

  • 📦 The tidyverse.

    install.packages("tidyverse")

Download the data before the meetup

The data set is about 95 MB, so make sure you download it before the meetup. Our venue does have WiFi, but downloading in advance will be much easier.

The TidyTuesday Anime dataset is hosted on the R4DScommunity GitHub page. You can download it from there or use our ready-made RStudio project.

Use our ready-made RStudio project

If you'd like to have a ready-to-go RStudio project, you can create a new project from this repo. In RStudio, click File > New Project... and select Version Control. Then select Git and in the final screen enter the URL

https://github.com/tampausers/hacky-hour-april-2019

and choose where you would like to save the project.

Get the data from rfordatascience/tidytuesday

You can download the data directly from the R4DScommunity GitHub page:

tidy_anime <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-04-23/tidy_anime.csv")
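If you would rather keep a local copy than re-read the file from GitHub every time, one option is a small download-once helper. This is only a sketch: `fetch_once()` and the `data/tidy_anime.csv` location are hypothetical names chosen for illustration, not part of this repo.

```r
# Download `url` to `dest` only if `dest` does not exist yet; returns `dest`.
# `fetch_once` is a hypothetical helper, not something shipped with this project.
fetch_once <- function(url, dest) {
  if (!file.exists(dest)) {
    dir.create(dirname(dest), showWarnings = FALSE, recursive = TRUE)
    download.file(url, dest, mode = "wb")
  }
  invisible(dest)
}

data_url <- "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-04-23/tidy_anime.csv"
local_csv <- file.path("data", "tidy_anime.csv")  # assumed location, change as you like

# Run once before the meetup (this downloads the full CSV), then read the local copy:
# tidy_anime <- readr::read_csv(fetch_once(data_url, local_csv))
```

Later calls skip the download because the file already exists, so the same line works at the venue without relying on the WiFi.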

About The Data

The following information is copied from the R4DScommunity GitHub page; feel free to go there for more information.

Anime Dataset

This week's data comes from Tam Nguyen and MyAnimeList.net via Kaggle. According to Wikipedia - "MyAnimeList, often abbreviated as MAL, is an anime and manga social networking and social cataloging application website. The site provides its users with a list-like system to organize and score anime and manga. It facilitates finding users who share similar tastes and provides a large database on anime and manga. The site claims to have 4.4 million anime and 775,000 manga entries. In 2015, the site received 120 million visitors a month."

Anime without rankings or popularity scores were excluded. Producers, genre, and studio were converted from lists to tidy observations, so there will be repetitions of shows with multiple producers, genres, etc. The raw data is also uploaded.
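Because each show is repeated once per producer/genre/studio combination, counts and averages can be inflated unless you first reduce the data to the level you care about. The sketch below uses a tiny made-up table (the show names, producers, and scores are invented for illustration) with the same column names as the data dictionary.

```r
library(dplyr)

# Toy stand-in for tidy_anime: "Show A" has 2 genres x 2 producers = 4 rows.
# All values here are made up for illustration.
toy_anime <- tibble(
  animeID   = c(1, 1, 1, 1, 5),
  name      = c(rep("Show A", 4), "Show B"),
  genre     = c("Action", "Action", "Sci-Fi", "Sci-Fi", "Action"),
  producers = c("P1", "P2", "P1", "P2", "P3"),
  score     = c(rep(8.8, 4), 8.4)
)

# One row per show: use this for show-level summaries such as mean score.
shows <- toy_anime %>%
  distinct(animeID, name, score)

# One row per show/genre pair: use this to count shows per genre
# without double counting shows that have several producers.
genre_counts <- toy_anime %>%
  distinct(animeID, genre) %>%
  count(genre, sort = TRUE)
```

The same pattern should carry over to the full data set by swapping `toy_anime` for `tidy_anime`.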

Lots of interesting ways to explore the data this week!

Data Dictionary

Heads up: the dataset is about 97 MB. If you want to free up some space, drop synopsis and background (both are long strings), or broadcast, premiered, and related, which are redundant or less useful.
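Dropping those columns is a one-liner with `dplyr::select()`. The sketch below runs on a small made-up table so it works without the download; with the real data you would apply the same `select()` call to `tidy_anime`.

```r
library(dplyr)

# Toy stand-in with the same column names as the real data (values are invented).
toy_anime <- tibble(
  animeID    = c(1, 5),
  name       = c("Show A", "Show B"),
  synopsis   = c("a long synopsis ...", "another long synopsis ..."),
  background = c("production notes ...", NA),
  premiered  = c("Spring 1998", NA),
  broadcast  = c("Saturdays", NA),
  related    = c("{...}", "{...}")
)

# Drop the long-text and redundant columns named in the heads-up above.
slim_anime <- toy_anime %>%
  select(-c(synopsis, background, broadcast, premiered, related))
```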

| variable | class | description |
|---|---|---|
| animeID | double | Anime ID (as in https://myanimelist.net/anime/animeID) |
| name | character | Anime title, extracted from the site |
| title_english | character | Title in English (sometimes different, sometimes missing) |
| title_japanese | character | Title in Japanese (if the anime is Chinese or Korean, the title, if available, in the respective language) |
| title_synonyms | character | Other variants of the title |
| type | character | Anime type (e.g. TV, Movie, OVA) |
| source | character | Source of the anime (e.g. original, manga, game, music, visual novel) |
| producers | character | Producers |
| genre | character | Genre |
| studio | character | Studio |
| episodes | double | Number of episodes |
| status | character | Aired or not aired |
| airing | logical | TRUE/FALSE: whether the anime is still airing |
| start_date | double | Start date (ymd) |
| end_date | double | End date (ymd) |
| duration | character | Per-episode duration or entire duration, as a text string |
| rating | character | Age rating |
| score | double | Score (higher = better) |
| scored_by | double | Number of users that scored the anime |
| rank | double | Rank, weighted according to the MyAnimeList formula |
| popularity | double | Based on how many members/users have the respective anime in their list |
| members | double | Number of members that added this anime to their list |
| favorites | double | Number of members that marked this anime as a favorite in their list |
| synopsis | character | Long string with the anime synopsis |
| background | character | Long string with production background and other things |
| premiered | character | Season/year the anime premiered |
| broadcast | character | When the anime is (regularly) broadcast |
| related | character | Dictionary of related anime, series, games, etc. |
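As one possible starting point for a visualization, the sketch below summarises `score` by `type` and plots the result with ggplot2. It uses a small made-up table (all values invented) with the column names from the dictionary above; the choice of a bar chart is just an example, not a recommendation from the TidyTuesday organizers.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for tidy_anime (values are invented). Shows repeat once per genre.
toy_anime <- tibble(
  animeID = c(1, 1, 5, 6, 6),
  type    = c("TV", "TV", "Movie", "TV", "TV"),
  genre   = c("Action", "Sci-Fi", "Action", "Comedy", "Drama"),
  score   = c(8.8, 8.8, 8.4, 8.3, 8.3)
)

# Reduce to one row per show first, then average the score within each type.
type_scores <- toy_anime %>%
  distinct(animeID, type, score) %>%
  group_by(type) %>%
  summarise(mean_score = mean(score), n_shows = n())

# Simple bar chart of the mean score per anime type.
p <- ggplot(type_scores, aes(x = type, y = mean_score)) +
  geom_col() +
  labs(x = "Anime type", y = "Mean score", title = "Mean score by type (toy data)")
```

Printing `p` (or calling `print(p)` in a script) draws the chart.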
