Groups a company's news articles from Yahoo Finance and Google Finance using non-negative matrix factorization (NMF)

edtechre/Yahoo-Finance-and-Google-Finance-NMF

This is a Python script that fetches news articles from the Yahoo Finance and Google Finance RSS feeds by stock symbol (e.g. MSFT, ORCL, USO) and groups the articles using non-negative matrix factorization (NMF). The groupings are sometimes wonky, but relevant themes across the stories do emerge. At the very least, this can be used for a quick perusal of a company's news articles, FWIW.
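
For readers unfamiliar with NMF, here is a minimal sketch of the idea using NumPy. It assumes the common multiplicative update rules and a toy articles-by-words count matrix; the function name nmf, the seed parameter, and the toy data are illustrative and not taken from gen_features.py, whose actual implementation may differ.

```python
import numpy as np

def nmf(v, num_features, iterations, seed=0):
    """Approximate a non-negative matrix v (articles x words) as w @ h.

    w has shape (articles, num_features): how strongly each article
    belongs to each feature. h has shape (num_features, words): how
    strongly each word belongs to each feature.
    """
    rng = np.random.default_rng(seed)
    n_articles, n_words = v.shape
    # Start from small positive random factors.
    w = rng.random((n_articles, num_features)) + 1e-3
    h = rng.random((num_features, n_words)) + 1e-3
    eps = 1e-9  # guards against division by zero
    for _ in range(iterations):
        # Multiplicative updates keep every entry non-negative.
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
    return w, h

# Toy data (made up): articles 0-1 share one vocabulary, articles 2-3 another.
v = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 3, 2, 2],
], dtype=float)

w, h = nmf(v, num_features=2, iterations=200)
# Assign each article to the feature with the largest weight.
groups = w.argmax(axis=1)
print(groups)
```

In this sketch, num_features and iterations play the roles of the -n and -i options described under Usage.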

Beautiful Soup is used to extract the text from each article.  Stop words are removed and only English words are used in the word matrices for each article.  
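
The filtering step can be sketched as follows, assuming the article text has already been extracted. The STOP_WORDS and ENGLISH_WORDS sets below are tiny hypothetical stand-ins (the script's real stop-word list and English-word check are not shown in this README), and build_word_matrix is an illustrative name.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the script's stop-word list and English dictionary.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "for"}
ENGLISH_WORDS = {
    "microsoft", "shares", "rose", "cloud", "revenue",
    "oil", "prices", "fell", "supply",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_word_matrix(texts):
    """Return (vocab, matrix) where matrix[i][j] counts vocab[j] in texts[i]."""
    counts = [
        Counter(
            word for word in tokenize(text)
            if word not in STOP_WORDS and word in ENGLISH_WORDS
        )
        for text in texts
    ]
    # The vocabulary is every surviving word, in sorted order.
    vocab = sorted(set().union(*counts))
    matrix = [[count[word] for word in vocab] for count in counts]
    return vocab, matrix

texts = [
    "Microsoft shares rose on cloud revenue",
    "Oil prices fell as supply rose, xyzzy",
]
vocab, matrix = build_word_matrix(texts)
print(vocab)
for row in matrix:
    print(row)
```

Each row of the resulting matrix corresponds to one article, which is the shape of input that NMF expects.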

The generated article groupings are rendered as an HTML file via a Cheetah template.  The HTML output is sent to ./output/<symbol_name>.html by default.
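
To give a rough idea of the rendering step, here is a sketch that turns groupings into an HTML page. It uses string.Template from the Python standard library as a stand-in, not Cheetah, and the page layout, the render_groups name, and the (top words, article titles) input shape are all assumptions rather than the script's actual template.

```python
import html
from string import Template

# Stand-in page skeleton; the real Cheetah template is not shown in this README.
PAGE = Template(
    "<html><head><title>$symbol</title></head><body>\n$body</body></html>\n"
)

def render_groups(symbol, groups):
    """Render groups, a list of (top_words, article_titles) pairs, as HTML."""
    parts = []
    for top_words, titles in groups:
        # Use the feature's top words as the heading for its article list.
        parts.append("<h2>%s</h2>\n<ul>\n" % html.escape(", ".join(top_words)))
        for title in titles:
            parts.append("<li>%s</li>\n" % html.escape(title))
        parts.append("</ul>\n")
    return PAGE.substitute(symbol=html.escape(symbol.upper()), body="".join(parts))

page = render_groups("msft", [(["cloud", "revenue"], ["Azure growth & more"])])
print(page)
```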

Usage:
gen_features.py -s <symbol> [-o <outputfile>] [-n <numfeatures>] [-i <iterations>]

Example:
python gen_features.py -s msft -n 10 -i 50
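
For reference, the options above could be parsed along these lines. This is a sketch using argparse, not necessarily how gen_features.py does it; the default values for -n and -i are hypothetical placeholders borrowed from the example invocation, and only the default output path comes from the description above.

```python
import argparse
import os

def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="gen_features.py")
    parser.add_argument("-s", dest="symbol", required=True,
                        help="stock symbol, e.g. msft")
    parser.add_argument("-o", dest="outputfile", default=None,
                        help="path of the HTML file to write")
    # Placeholder defaults; the script's real defaults are not documented here.
    parser.add_argument("-n", dest="numfeatures", type=int, default=10,
                        help="number of NMF features (groups)")
    parser.add_argument("-i", dest="iterations", type=int, default=50,
                        help="number of NMF iterations")
    args = parser.parse_args(argv)
    if args.outputfile is None:
        # Documented default: ./output/<symbol_name>.html
        args.outputfile = os.path.join(".", "output", args.symbol + ".html")
    return args

args = parse_args(["-s", "msft", "-n", "10", "-i", "50"])
print(args.symbol, args.numfeatures, args.iterations, args.outputfile)
```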

Requirements:
Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/)
Cheetah (http://www.cheetahtemplate.org/)
feedparser (http://www.feedparser.org/)
NumPy (http://numpy.scipy.org/)
PyWordNet (http://osteele.com/projects/pywordnet/)
