Releases: DocNow/twarc
v1.2.0
The big change in this release is that twarc will now emit tweets in extended mode by default when it is fetching tweets from Twitter's REST API. This means the information previously available in .text will now be found in .full_text.
On the other hand, data collected with twarc's filter and sample commands from Twitter's streaming APIs continue to use .text but have the additional .extended_tweet stanza when tweets require it.
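Because the text can now live in one of three places, downstream code may want a single accessor. The following is a minimal sketch, not part of twarc: the helper name tweet_text and the sample dictionaries are made up for illustration, and it assumes the streaming .extended_tweet stanza carries its own full_text field, as Twitter's extended tweet format describes.

```python
def tweet_text(tweet):
    """Return the text of a tweet dict, whichever mode produced it.

    Hypothetical helper for illustration; not part of twarc.
    """
    # REST API in extended mode: the text lives in full_text.
    if "full_text" in tweet:
        return tweet["full_text"]
    # Streaming API: long tweets carry an extended_tweet stanza
    # (assumed here to contain its own full_text field).
    extended = tweet.get("extended_tweet")
    if extended and "full_text" in extended:
        return extended["full_text"]
    # Compat mode, or a short streaming tweet: fall back to text.
    return tweet.get("text")


# Hand-written sample dicts (not real API output).
rest_tweet = {"full_text": "a long tweet fetched via search"}
stream_long = {
    "text": "a long tweet from the...",
    "extended_tweet": {"full_text": "a long tweet from the stream"},
}
stream_short = {"text": "a short tweet"}

for sample in (rest_tweet, stream_long, stream_short):
    print(tweet_text(sample))
```

Checking full_text first means the same function works on data collected before and after upgrading, which is handy when testing an existing pipeline.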
For all the grisly details please see this announcement from Twitter. You might also be interested in reading this post from the Social Feed Manager project.
If you have data pipelines of any kind built with twarc, we highly recommend that you test that things are working properly before installing twarc in a production environment. If you would prefer to get data the old way, please use --tweet_mode compat. Unfortunately there is no way to change the mode for the streaming APIs.
v1.1.3
v1.1.2
v1.0.10
This is a patch release to fix a bug in hydration which was causing the hydration of a single tweet to fail (thanks @toddstoffer).
It also includes a change so that hydration respects --tweet_mode when it is used to fetch tweets longer than 140 characters.
v1.0.9
This release includes support for Twitter's new extended tweet functionality that is documented here. If you collect data from the streaming API you need to know that the text of the tweet now shows up in a new location. The results from searches should still look like what you are used to unless you choose to use the --tweet_mode extended option, which will give you the extended tweet information.
If this sounds confusing, that's unfortunately because it is. The aim is to have twarc mirror Twitter's default behavior unless it is told to do otherwise with --tweet_mode, which is also exposed on the constructor for twarc.Twarc.
v1.0.8
This release updates the behavior of Twarc.follower_ids and Twarc.friend_ids to take a user_id as well as a screen_name.
It also includes an update to the configuration loading that makes programmatic use of Twarc easier. Now, instead of your program needing to figure out what keys to give to twarc.Twarc, the constructor will attempt to load them from the environment or from the default config file $HOME/.twarc. So all you need to do is:
import twarc
t = twarc.Twarc()
for tweet in t.search("obama"):
    print(tweet)
This release also includes a new utility utils/foaf.py which generates a friend-of-a-friend network for a given seed user. It expresses the network as tuples of (user1_id, user2_id) where user1_id is the user id for a user (natch) and user2_id is the user id for their friend (someone they follow). I'm sorry if your hopes were up for some kind of RDF graph... If they aren't completely dashed here's how you can use it:
utils/foaf.py danbri > danbri.csv
Perhaps expressing this with output similar to utils/network.py could be useful at some point, but this satisfied the requirements of the person (/me waves to Ernesto) who needed the data to work with in R.
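If you would rather stay in Python than move to R, the pairs can be folded into an adjacency mapping. This is only a sketch: it assumes each output line is a comma-separated user1_id,user2_id pair (suggested by the .csv filename above, but not confirmed here), and the function name read_friend_edges and the sample ids are invented for illustration.

```python
import csv
import io
from collections import defaultdict


def read_friend_edges(lines):
    """Build {user_id: set of friend ids} from user1_id,user2_id rows.

    Hypothetical helper; assumes two comma-separated columns per row.
    """
    friends = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) != 2:
            # Skip blank or malformed rows rather than guessing.
            continue
        user1_id, user2_id = row
        friends[user1_id].add(user2_id)
    return friends


# Made-up ids standing in for the contents of danbri.csv.
sample = io.StringIO("1,2\n1,3\n2,3\n")
friends = read_friend_edges(sample)
print(sorted(friends["1"]))
```

To run it against real output, pass an open file instead of the StringIO object, for example open("danbri.csv", newline="").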
v1.0.7
This release includes a small change to the setup so that installations on Windows work correctly again. Thanks @yshussain!
v1.0.6
The replies command will now traverse up the conversation tree when used with the --recursive option. It will also follow quotes of other tweets.
Also included in this release are two new utilities:
network.py - reads a set of tweets and generates gexf (for Gephi), dot (for GraphViz) and an html+d3 network visualization
remove_limit.py - a utility to remove warnings from a JSON file, if you happened to run with --warnings and decided that you didn't want the warnings after all. Thanks to @rubeot for that.
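The idea behind removing warnings can be sketched in a few lines. This is not the code of remove_limit.py: it assumes the file holds one JSON object per line and that a warning is a streaming limit notice, meaning an object with a top-level "limit" key. The function name drop_limit_notices and the sample records are hypothetical.

```python
import json


def drop_limit_notices(lines):
    """Yield parsed records, skipping ones with a top-level "limit" key.

    Hypothetical sketch; assumes line-oriented JSON input.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "limit" in record:
            # Assumed shape of a limit notice: {"limit": {...}}
            continue
        yield record


# Hand-written sample lines (not real API output).
sample_lines = [
    '{"id_str": "1", "text": "first"}',
    '{"limit": {"track": 12, "timestamp_ms": "1484700000000"}}',
    '{"id_str": "2", "text": "second"}',
]
kept = list(drop_limit_notices(sample_lines))
print([record["id_str"] for record in kept])
```

Writing it as a generator keeps memory use flat on large collections, since each line is parsed and either dropped or passed along before the next one is read.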
v1.0.5
v1.0.4
Twarc now includes a replies command and API call that attempts to use Twitter's search API to find replies to a given tweet.
% twarc replies 821474942931914752 > replies.json
You can also use it on a file of tweets:
% twarc replies tweets.json > replies.json
If you want, you can also fetch replies to replies by using the --recursive option, although it could be time consuming because of rate limits on the search API:
% twarc replies 821474942931914752 --recursive > replies.json
The logic was borrowed from a standalone script that is described in some detail here.
Finally, there is a new utility included, utils/gexf.py, that reads a file of tweets and creates a GEXF file that can be loaded into Gephi. It is really just a start of something, so if you have ideas for improving it, please send them.
% twarc replies 821439203561115648 > replies.json
% python utils/gexf.py replies.json > replies.gexf
# open replies.gexf in Gephi and you'll see something like this