Case Study: The Trump Twitter Archive

CQPweb

Worked example

  • goal: understand Donald Trump's terminology, rhetoric and phraseology (in case he comes back …)

  • take a look at a corpus of Trump's tweets from 2009 to January 2021 (when he was finally banned from Twitter) to illustrate corpus-linguistic research

  • simplest use: search for "make america great again", then explain the KWIC (keyword in context) and context displays

  • step 1: select a suitable subcorpus (only original tweets, no retweets etc.)

  • step 2: lemma frequency list for selection of relevant terms ➞ not very interesting

    • option: look at hashtags (prefix #)
    • option: use POS-disambiguated lemmatisation (not available for all corpora) to filter by part of speech (suffix _N, _Z, _J, _V)
    • results are still dominated by very general high-frequency words, which are often not particularly characteristic of Trump
  • step 3: keyword analysis = frequency comparison against reference corpus (➞ English tweets)

    • use default settings, but change the keyness measure to Log-Likelihood, or alternatively to Log Ratio (conservative estimate), and show positive keywords only (there are too many negative ones)
    • compare the tabular view with the visualisation options, click on "thank" to display its concordance
    • focus on the salient "fake" and the very frequent "great"
  • step 4: open the concordance for "great", randomise the order, switch to context view

    • sort + frequency breakdown on 1R (first word to the right of the node) ➞ "great" is used quite generally with many different nouns
  • step 5: click on concordance for fake

    • suspicion that it's mostly "fake news" is confirmed by the frequency breakdown on 1R ("fake news" accounts for 75%)
    • conclusion: fake news as a salient unit of meaning
  • step 6: query fake news (subcorpus: Originals)

    • quick look at concordance ➞ more than 900 hits
    • still need quantitative analysis to get overview
  • step 7: distribution analysis for fake news (the distribution across years is especially interesting)

  • step 8: collocation analysis for fake news reveals usage and phraseology
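
The keyness comparison in step 3 and the 1R frequency breakdowns in steps 4 and 5 can be sketched outside CQPweb in a few lines of Python. This is a minimal illustration, not CQPweb's implementation. The function names (`log_likelihood`, `log_ratio`, `right_neighbours`), the toy tweets and the example counts are all invented for this sketch. The log-likelihood uses the common two-term formulation, the Log Ratio is the plain point estimate (the conservative estimate offered by CQPweb is not reproduced here), and replacing a zero frequency with 0.5 is an assumption.

```python
import math
from collections import Counter


def log_likelihood(a, b, n1, n2):
    """Log-likelihood (G2) keyness for a word occurring a times in a target
    corpus of n1 tokens and b times in a reference corpus of n2 tokens."""
    # Expected frequencies if the word were equally common in both corpora.
    e1 = n1 * (a + b) / (n1 + n2)
    e2 = n2 * (a + b) / (n1 + n2)
    g2 = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def log_ratio(a, b, n1, n2, zero_substitute=0.5):
    """Binary log of the ratio of relative frequencies (target vs. reference)."""
    # Assumption: zero counts are replaced by a small constant so that the
    # ratio stays defined.
    a = a if a > 0 else zero_substitute
    b = b if b > 0 else zero_substitute
    return math.log2((a * n2) / (b * n1))


def right_neighbours(tweets, node):
    """Frequency breakdown on position 1R: count the word directly after node."""
    counts = Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()  # naive whitespace tokenisation
        for i, token in enumerate(tokens[:-1]):
            if token == node:
                counts[tokens[i + 1]] += 1
    return counts


# Invented toy data, not taken from the Trump Twitter Archive.
toy_tweets = [
    "the fake news is at it again",
    "fake news all day",
    "a fake story",
    "great people and a great country",
]

breakdown = right_neighbours(toy_tweets, "fake")
total = sum(breakdown.values())
for word, freq in breakdown.most_common():
    print(f"fake {word}: {freq} ({freq / total:.0%})")

# Hypothetical counts: 30 hits in a 1,000-token target corpus
# against 5 hits in a 1,000-token reference corpus.
print(log_likelihood(30, 5, 1000, 1000), log_ratio(30, 5, 1000, 1000))
```

A positive Log Ratio means the word is relatively more frequent in the target corpus, and each additional point doubles the ratio of relative frequencies. The whitespace tokenisation here ignores punctuation and hashtags, so results on real tweets would differ from those of a properly tokenised corpus.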