IMDB has a nice feature whereby users can add keywords and genres to listed shows. They have generously made these databases available here. Can we use this to find out what movies are really about?
First off, I tried to restrict the data to movies instead of related video material. I excluded every title marked with the genres Adult, News, Reality-TV, Game-Show, Talk-Show, and Short. That leaves 409,048 shows, starting in 1894 (“Miss Jerry”, Romance) and going up to 2019 (“Timeliner”, Action, Sci-Fi). Here is the proportion of movies in each genre:
This seems to scale roughly with the size of the display cases in the now-extinct video stores, although it seems Sci-Fi ought to be bigger.
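For anyone who wants to try the filtering step themselves, here is a minimal sketch in Python. It assumes each title has already been loaded as a dict with a `genres` list; that record layout, the function names, and the sample rows are all made up for illustration and are not the actual IMDB file format.

```python
from collections import Counter

# Genres excluded in the post to keep only movies.
EXCLUDED = {"Adult", "News", "Reality-TV", "Game-Show", "Talk-Show", "Short"}


def keep_movies(titles):
    """Drop any title that carries at least one excluded genre."""
    return [t for t in titles if not (set(t["genres"]) & EXCLUDED)]


def genre_proportions(titles):
    """Fraction of titles tagged with each genre.

    A title with several genres counts once for each of them,
    so the fractions can add up to more than 1.
    """
    total = len(titles)
    if total == 0:
        return {}
    counts = Counter(g for t in titles for g in set(t["genres"]))
    return {g: n / total for g, n in counts.items()}


# Hypothetical sample records (assumed layout, not real IMDB rows).
sample = [
    {"title": "Miss Jerry", "year": 1894, "genres": ["Romance"]},
    {"title": "Timeliner", "year": 2019, "genres": ["Action", "Sci-Fi"]},
    {"title": "Some Short", "year": 1950, "genres": ["Short", "Comedy"]},
    {"title": "Some Talk Show", "year": 2001, "genres": ["Talk-Show"]},
]

movies = keep_movies(sample)
proportions = genre_proportions(movies)
```

Note that a title is dropped if any one of its genres is on the excluded list, so a comedy short goes out along with the pure shorts.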
How has this changed over time? Here are the top 10 genres by year, starting with the advent of talkies in 1930:
The data has been smoothed with a 3-year rolling average. The 1930s were good for comedy and romance, suitable for dark economic times. Documentaries have certainly surged in recent years, perhaps because of cheaper cameras, and they also appear to be displacing drama. Action seems to have had its heyday in the 70s through 90s.
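The smoothing can be sketched roughly as follows. This is an illustration, not the exact procedure behind the chart: it assumes a centered 3-year window (the year itself plus one on each side) that shrinks at the ends of the series, and the function names and sample data are hypothetical.

```python
from collections import Counter


def yearly_share(titles, genre):
    """Share of each year's titles that carry the given genre."""
    totals, hits = Counter(), Counter()
    for t in titles:
        totals[t["year"]] += 1
        if genre in t["genres"]:
            hits[t["year"]] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


def rolling_mean(series, window=3):
    """Centered rolling mean over a {year: value} mapping.

    Assumption: years missing from the mapping are skipped rather than
    treated as zero, so the window is narrower at the edges.
    """
    half = window // 2
    smoothed = {}
    for year in sorted(series):
        nearby = [
            series[y]
            for y in range(year - half, year + half + 1)
            if y in series
        ]
        smoothed[year] = sum(nearby) / len(nearby)
    return smoothed


# Hypothetical sample records (assumed layout, not real IMDB rows).
sample = [
    {"year": 1930, "genres": ["Comedy"]},
    {"year": 1930, "genres": ["Drama"]},
    {"year": 1931, "genres": ["Comedy", "Romance"]},
    {"year": 1932, "genres": ["Drama"]},
]

comedy_share = yearly_share(sample, "Comedy")
comedy_smoothed = rolling_mean(comedy_share, window=3)
```

A trailing window (the year plus the two before it) would be an equally reasonable reading of "3-year rolling average"; it just shifts the peaks slightly later.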
These genres cover most movies. What keywords are popular for them? I took all the titles in the keyword list that were marked with one of the above genres, and then stripped out the boring keywords like “name-in-title”. Some really obsessive person entered tens of thousands of those. Here are the remaining top 25:
So movies are pretty much about biology: death, sex, and family relationships. That does seem to cover a good chunk of existence. But we also want to see police, cigarette-smoking, dogs, and new-york-city in the movies. Hmmm.
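The keyword count itself is simple. Here is a rough sketch, assuming the keyword list has been read into (title, keyword) pairs and that the titles surviving the genre filter are available as a set. The stop-list holds only "name-in-title", since that is the one boring keyword named above; the rest of the names and the sample rows are invented for illustration.

```python
from collections import Counter

# Keywords treated as uninformative. Only "name-in-title" is named in the
# post; anything else added here would be a judgment call.
BORING = {"name-in-title"}


def top_keywords(keyword_rows, kept_titles, n=25):
    """Most common keywords among kept titles, ignoring the boring ones."""
    counts = Counter(
        keyword
        for title, keyword in keyword_rows
        if title in kept_titles and keyword not in BORING
    )
    return counts.most_common(n)


# Hypothetical (title, keyword) pairs, not real IMDB rows.
rows = [
    ("Movie A", "murder"),
    ("Movie A", "name-in-title"),
    ("Movie B", "murder"),
    ("Movie B", "death"),
    ("Movie C", "murder"),
    ("Movie C", "death"),
    ("Movie C", "dog"),
    ("Talk Show D", "murder"),  # not a kept title, so it is ignored
]
kept = {"Movie A", "Movie B", "Movie C"}

top_two = top_keywords(rows, kept, n=2)
```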
How have these changed over time? Looking at the top ten keywords:
They sure liked them some murder in the 30s and 40s. That was also the heyday of police and husband-wife-relationships. Can’t imagine what the connection is. Then it looks like censorship got loosened in the mid-60s, and female-nudity, sex, and nudity took over. That lasted until about 1980, when I guess the Boomers started thinking about money instead.
What else can we compare? New-york-city made it into the top 25 – what other cities rate?
New York is in a class by itself, especially in the 30s, when more than 10% of all movies were set there, which is why the chart uses a log scale. Then it’s Paris, London and LA, then a cluster of SF, Chicago, Rome, Berlin, and Las Vegas.
The original inspiration for all this was a comment by Ken Restivo: “Vampires were so 80s. Aliens were very 90s. Zombies are right now, but, I think, rapidly becoming passe. What’s next, I asked a friend and his 10-year-old son? “Dragons”, came the answer. It could be. A trend is waiting to be set. It just needs someone to set it.” Dragons, sadly, didn’t make it into the top 1000 keywords, but here are some others:
Zombies did trend up in the Zips, and are currently the leading form of monster. Ghosts were the favorite in the 30s, and then straight-up monsters in the 50s and 80s, along with aliens. If you find this graph too busy, Ken took some of this data and turned it into an interactive chart here.
OK, one could play with this all day. I’ll try one more – what do people do in the movies?
Doctors were big in the 30s and Zips, spies in the 60s, and prostitutes in the 30s and 70s. People seemed to do more in the 30s, but that’s perhaps an artifact of how those movies have been marked up with keywords.
Google should really do something like its N-gram engine for books for this stuff. Maybe IMDB could license it from them! People need more on-line time sinks.