data
text_analysis_reddit_webapp
CONTRIBUTORS
JobsData_TextAnalysis.ipynb
README.md
Text_Analysis_Reddit.ipynb
requirements.txt
text_analysis_social_services.ipynb
text_analytics_rayid.pdf

README.md

Text Analysis Example

Motivation and Background

Text analysis is used to summarize or extract useful information from large amounts of unstructured text stored in documents. This opens up the opportunity to use text data alongside more conventional data sources (e.g., surveys and administrative data). The goal of text analysis is to take a large corpus of complex, unstructured text and extract important and meaningful messages in a comprehensible, scalable, adaptive, and cost-effective way.

Text Analysis can help with the following tasks:

Searches and information retrieval: Help find relevant information in large databases, such as for a systematic literature review.
Clustering and text categorization: Techniques like topic modeling can summarize a large corpus of text by finding the most important phrases.
Text summarization: Create category-sensitive summaries of a large corpus of text.
Machine Translation: Translate from one language to another.

In this tutorial we analyze Reddit posts from May 2015 with two goals: classifying which subreddit a post originated from, and using topic modeling to categorize posts.
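
To make the classification task concrete, here is a toy multinomial naive Bayes classifier over word counts. It is a minimal standard-library sketch and not necessarily the method used in the notebooks; the example posts, the subreddit labels, and the function names (`tokenize`, `train`, `predict`) are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(posts):
    """Collect per-label word counts and label frequencies from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in posts:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(text, model):
    """Return the label with the highest log-posterior, using add-one smoothing."""
    word_counts, label_counts, vocab = model
    total_posts = sum(label_counts.values())
    scores = {}
    for label, n_posts in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_posts / total_posts)  # log prior
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training posts; real input would come from the Reddit data.
toy_posts = [
    ("the team scored a late goal in the match", "soccer"),
    ("great save by the keeper in the second half", "soccer"),
    ("my python script raises an import error", "learnpython"),
    ("how do I loop over a list in python", "learnpython"),
]
model = train(toy_posts)
print(predict("python list error", model))
```

On real data you would train on many posts per subreddit and evaluate on held-out posts; a library implementation is usually a better choice than this hand-rolled version.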

Dependencies

To run this notebook, you will need the packages listed in requirements.txt. To install them, run

pip install -r requirements.txt

in the command line.

You will also need Jupyter Notebook and Python 3 installed.

Data

The data was filtered from this dataset.

To unzip the data, run gunzip ./data/RC_2015-05.json.gz
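
For a quick look at the data, the file can also be read from Python. The sketch below assumes the file is newline-delimited JSON with one record per line and a `subreddit` field, as in the public Reddit comment dumps; the filtered file in this repository may be laid out differently. The helper names `iter_comments` and `count_by_subreddit` are hypothetical.

```python
import gzip
import json
from collections import Counter


def iter_comments(path):
    """Yield one dict per non-empty line of a JSON-lines file (gzipped or plain)."""
    # Pick the gzip reader for .gz paths so the file need not be unzipped first.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by_subreddit(path, field="subreddit"):
    """Count records per subreddit; the field name is an assumption."""
    return Counter(record.get(field, "<missing>") for record in iter_comments(path))
```

For example, `count_by_subreddit("./data/RC_2015-05.json").most_common(10)` would list the ten most frequent subreddits, provided the assumed layout holds. Reading line by line keeps memory use low even for a large file.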
