Scrap Data

This repository contains Python notebooks for scraping data from Twitter and Reddit (for now), which can help in creating datasets for various deep learning and machine learning models.

FetchTweets

This notebook fetches tweets or the media/memes attached to them. To get started, you need a Twitter developer account (https://developer.twitter.com/en/apply-for-access), which provides the required API credentials: consumer_key, consumer_secret, access_token, and access_token_secret.

Besides the API credentials, it requires tweepy, which can be installed with pip install tweepy.
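As a rough illustration of how the four credentials and tweepy fit together, here is a minimal sketch. It assumes tweepy 4.x (OAuth1UserHandler and API.search_tweets) and a v1.1-style tweet payload. The helper names build_api, extract_media_urls, and fetch_media_urls are hypothetical and are not taken from the notebook, and which endpoints are available depends on the access level of the developer account.

```python
def build_api(consumer_key, consumer_secret, access_token, access_token_secret):
    """Return an authenticated tweepy.API client (assumes tweepy 4.x)."""
    import tweepy  # imported lazily so the parsing helper works without tweepy

    auth = tweepy.OAuth1UserHandler(
        consumer_key, consumer_secret, access_token, access_token_secret
    )
    # wait_on_rate_limit makes tweepy sleep instead of failing on rate limits.
    return tweepy.API(auth, wait_on_rate_limit=True)


def extract_media_urls(tweet_json):
    """Collect media URLs from a v1.1-style tweet payload (a plain dict)."""
    # extended_entities lists every attached media item; fall back to entities.
    entities = tweet_json.get("extended_entities") or tweet_json.get("entities") or {}
    return [
        item["media_url_https"]
        for item in entities.get("media", [])
        if "media_url_https" in item
    ]


def fetch_media_urls(api, query, count=20):
    """Search recent tweets for `query` and return the attached media URLs."""
    urls = []
    for status in api.search_tweets(q=query, count=count):
        urls.extend(extract_media_urls(status._json))
    return urls
```

With valid credentials, fetch_media_urls(build_api(...), "memes") would return a list of image URLs that can then be downloaded into the dataset.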

ScrapRedditMemes

This notebook scrapes the memes from a given subreddit, for a given number of pages of that subreddit.
Scraping data from Reddit works out of the box, i.e. no API credentials are required.
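One way to scrape a subreddit page by page without credentials is Reddit's public JSON listing, sketched below. This is an assumption about the approach and not necessarily what the notebook does. The function names, the User-Agent string, and the image-suffix filter are hypothetical, and Reddit may rate-limit or reject unauthenticated requests.

```python
import json
import urllib.parse
import urllib.request

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif")


def listing_url(subreddit, after=None, limit=25):
    """Build the public JSON listing URL for one page of a subreddit."""
    params = {"limit": limit}
    if after:
        params["after"] = after  # pagination token returned by the previous page
    return f"https://www.reddit.com/r/{subreddit}/.json?{urllib.parse.urlencode(params)}"


def image_urls(listing):
    """Return (image URLs, next-page token) from a decoded listing dict."""
    data = listing.get("data", {})
    urls = [
        child["data"]["url"]
        for child in data.get("children", [])
        if child.get("data", {}).get("url", "").lower().endswith(IMAGE_SUFFIXES)
    ]
    return urls, data.get("after")


def scrape_memes(subreddit, pages=1):
    """Collect image URLs from the first `pages` pages of a subreddit."""
    found, after = [], None
    for _ in range(pages):
        request = urllib.request.Request(
            listing_url(subreddit, after),
            headers={"User-Agent": "meme-dataset-sketch/0.1"},  # hypothetical UA
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            urls, after = image_urls(json.load(response))
        found.extend(urls)
        if after is None:  # no further pages
            break
    return found
```

For example, scrape_memes("memes", pages=2) would walk two pages of the listing and return the image links found on them.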

Later in this notebook, I use pytesseract to extract the text present on the memes (the results are not consistent, but they do the job), which can help in creating a multimodal dataset.
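The OCR step can be sketched as follows, assuming pytesseract and Pillow are installed and the Tesseract binary is available on the system. The helper names clean_ocr_text and meme_text are hypothetical. The whitespace cleanup is an added convenience, since raw OCR output on memes tends to contain stray line breaks.

```python
def clean_ocr_text(raw):
    """Collapse all runs of whitespace in OCR output into single spaces."""
    return " ".join(raw.split())


def meme_text(image_path, lang="eng"):
    """Run Tesseract OCR on one meme image and return the cleaned text."""
    # Imported lazily so clean_ocr_text can be used without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return clean_ocr_text(pytesseract.image_to_string(image, lang=lang))
```

Pairing each downloaded image with the string returned by meme_text gives one (image, text) record of a multimodal dataset.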

NonlinearNimesh/ScrapingData