Generates this spreadsheet, which summarizes the appearances of 3000 actors, each credited either as "self" or as an acting role. This lets you see how often these people appear as themselves versus in a role.
WALT HICKEY (FiveThirtyEight):
Mike Tyson is the undisputed king of playing characters who are also Mike Tyson: He’s appeared in seven films credited as himself, most of them since “The Hangover,” putting the former boxer in third place behind two talk show hosts who probably work for scale, Jay Leno and Larry King.
I was curious to run the data myself and see whether it put the actors into an interesting order.
- Download the "actors" and "actresses" files from the sources below.
- Fix the encoding if necessary.
- Run the calc.awk script to collate appearances.
> awk -f calc.awk data/actors-utf8.list | sort -rn >actors.csv
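For readers who want a feel for what the collation step involves, here is a minimal sketch. It is not the calc.awk from this repository; the function name `collate_appearances` is hypothetical, and the input format is an assumption: an actor's first line starts with the name followed by tabs and a credit, later credits for the same actor start with a tab, a blank line separates actors, the role sits in square brackets, and any file header or footer has already been stripped.

```shell
# Hypothetical sketch, NOT the repository's calc.awk.
# Reads an actors list on stdin and prints: self_count,role_count,"name"
collate_appearances() {
  awk -F'\t' '
    # Assumption: a blank line ends the current actor.
    NF == 0 { name = ""; next }
    # Assumption: a non-empty first field starts a new actor.
    $1 != "" { name = $1 }
    # Ignore credit lines that appear before any actor name.
    name == "" { next }
    {
      # Remember first-seen order so the output is stable.
      if (!(name in total)) { order[++n] = name; self[name] = 0 }
      total[name]++
      # Assumption: a "self" credit has a role starting with one of these words.
      if ($0 ~ /\[(Himself|Herself|Themselves)/) self[name]++
    }
    END {
      for (i = 1; i <= n; i++) {
        a = order[i]
        printf "%d,%d,\"%s\"\n", self[a], total[a] - self[a], a
      }
    }
  '
}

# Usage, mirroring the command above:
# collate_appearances < data/actors-utf8.list | sort -rn > actors.csv
```

Putting the "self" count in the first column means `sort -rn` orders the output by how often each person appeared as themselves. The name is quoted because names in the list may contain commas.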
Raw files downloaded from:
- ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/
- ftp://ftp.fu-berlin.de/pub/misc/movies/database/
The file "actors.list.gz" contains the actor data. Decompress it to get "actors.list".
To convert to utf8, try:
iconv -c -f cp1255 -t utf8 actors.list >actors-utf8.list
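To decide whether the conversion is necessary at all, one option is to ask iconv to decode the file as UTF-8 and see whether it succeeds. This is only a sketch: the helper name `is_utf8` is hypothetical, and it assumes an iconv that exits with a non-zero status on invalid input (as it does when run without `-c`).

```shell
# Hypothetical helper: succeeds only if the file already decodes as UTF-8.
# Assumes iconv returns a non-zero status when it meets an invalid sequence.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Usage (file name taken from the steps above):
# if is_utf8 actors.list; then echo "already UTF-8"; else echo "needs conversion"; fi
```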