Identifying emerging and successful technologies with LDA, KWIC, and Log-Likelihood

aleksejhoffaerber/EmergingTechnologies

Goals and Approach

This project applies several text-analytics techniques and algorithms to identify (a) emerging industries and (b) the emerging technologies used in those industries. The approach can be summarized as follows:

  1. Preprocessing using regexpr

  2. Benchmarking different preprocessing assumptions

  3. Corpus creation and boundary testing

  4. LDA training

  5. Identifying emerging topics by emergence analysis

  6. Verification of LDA approach with test/train KWIC analysis

  7. Using bigrams to identify specific complicated industries
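As a rough illustration of steps 1, 6, and 7, the sketch below cleans a text with regular expressions, counts bigrams, and produces keyword-in-context (KWIC) lines. It is a minimal sketch, not the project's own code: the function names `preprocess`, `bigrams`, and `kwic`, the cleaning rules, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

def preprocess(text):
    """Lowercase, strip URLs and non-letters, and return a token list."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs (assumed cleaning rule)
    text = re.sub(r"[^a-z\s]", " ", text)      # keep only letters and whitespace
    return text.split()

def bigrams(tokens):
    """Count adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))

def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for every hit."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

# Hypothetical company description, used only for illustration.
sample = "We develop gene therapy for rare diseases. Gene therapy trials: https://example.com"
tokens = preprocess(sample)
print(bigrams(tokens).most_common(1))
print(kwic(tokens, "therapy", window=2))
```

Bigram counts of this kind can surface multi-word industry terms that single tokens would miss, while the KWIC lines allow a manual check of how a topic's keywords are actually used.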

A results presentation (19 November 2019) with more details can be downloaded here: https://github.com/aleksejhoffaerber/EmergingTechnologies/blob/master/Pitch_LDA_Emerging%20Technologies.pdf

Identification of Emerging Industries

LDA training with 75 topics, followed by assignment of the topics to individual companies by gamma, led to 7 emerging topics:

Topics with a rising trend:

[Figure: Selection of Emerging Topics]

Antibodies

Rare Diseases

Trips & Transport

Fitness

Platforms & Integration

Aviation

Construction

Nutrition Diseases

Material Research

New Marketing Schemes
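One way to picture how topics with a rising trend could be found is sketched below: each company is assigned its highest-gamma topic, the share of companies per topic is computed per year, and a least-squares slope flags topics whose share grows. This is only a sketch under stated assumptions. The README does not specify the emergence criterion or the time variable, so the `year` field, the positive-slope rule, the three-topic toy data, and the helper names `dominant_topic` and `slope` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical input: (company, year, gamma weights for topics 0..2).
companies = [
    ("c1", 2016, [0.8, 0.1, 0.1]),
    ("c2", 2016, [0.6, 0.3, 0.1]),
    ("c3", 2017, [0.7, 0.2, 0.1]),
    ("c4", 2017, [0.2, 0.7, 0.1]),
    ("c5", 2018, [0.1, 0.8, 0.1]),
    ("c6", 2018, [0.3, 0.6, 0.1]),
]
N_TOPICS = 3

def dominant_topic(gamma):
    """Index of the topic with the highest gamma weight."""
    return max(range(len(gamma)), key=gamma.__getitem__)

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Count dominant-topic assignments per year.
counts = defaultdict(Counter)
for _name, year, gamma in companies:
    counts[year][dominant_topic(gamma)] += 1

# Convert counts to yearly shares, then keep topics with a positive slope.
years = sorted(counts)
shares = {
    k: [counts[y][k] / sum(counts[y].values()) for y in years]
    for k in range(N_TOPICS)
}
rising = [k for k, series in shares.items() if slope(years, series) > 0]
print(shares)
print(rising)
```

A simple slope is the crudest possible trend test; a real analysis would likely also need a minimum number of companies per topic and some check that the increase is not noise.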
