In 2008, I compared the AFI Top 100 movies lists of 1998 and 2007 to the IMDb Top 250 movies list as of September 2008 and found the industry (AFI) list skewed towards older movies compared to the fan list (IMDb) and that the contents of the AFI list had only advanced by 5 years in the 9 years between the two editions.
In 2009 and 2010 I returned to the IMDb lists by taking snapshots of the list at around the same time of year as the initial analysis (late September/early October) and attempted to extrapolate some meaning from the changes in the lists’ composition over the years.
And now, in 2011, I’m adding a fourth year of IMDb data to the analysis! Will any trends emerge? Can we predict the future of the IMDb Top 250 list? Read on to find out.
Before we begin, I should state the obvious limitations to using the IMDb list as a tool for analyzing changes in movie tastes over time. The IMDb list is far from authoritative, and the potentially skewed demographics of its voters cast even more doubt on the validity of the list. Furthermore, the very exercise of assigning a single numerical rating to a movie is more than reductive; it’s borderline absurd. But put all of that aside for a moment. The IMDb list, for all its flaws, is well-known, and its ratings are accepted as being decent rough indicators of movie quality.
With that being said, let’s pick up where we left off last year, when I asserted that the IMDb list’s perceived bias towards newer movies was real, and getting worse over time. From last year’s article:
When I fired up Excel to do this analysis, I was pretty sure that I would find that the overall shift of the dataset in terms of median year would be greater than the concurrent shift in time. And I was right:
- Median Year of IMDb Top 250 List as of 9/30/2008: 1975
- Median Year of IMDb Top 250 List as of 10/18/2009: 1977 (jumps 2 years after 1 year)
- Median Year of IMDb Top 250 List as of 9/26/2010: 1981.5 (jumps 4.5 years after 1 year)
Yup, you read that right. Over the course of the last year, the median year of the IMDb Top 250 movies list increased by 4.5 years, which suggests that not only is the list’s bias towards newer movies still present, it’s intensifying over time.
So how does the most recent sampling fit in with this trend? The list as of 10/3/2011 had a median year of 1983.5, or a jump of 2 years after 1 year. So the tilt towards newer movies wasn’t as severe as it was from 2009-2010, but it was more than what you’d expect if the movies were evenly distributed from the minimum to the maximum for each year.
Let’s see how that plays out in chart and graph form:
Year | Min | Max | Actual Median | Theoretical Median |
2008 | 1920 | 2008 | 1975 | 1964 |
2009 | 1921 | 2009 | 1977 | 1965 |
2010 | 1921 | 2010 | 1981.5 | 1965.5 |
2011 | 1921 | 2011 | 1983.5 | 1966 |
(Note: the “theoretical median” = what the median year of the list would be if the movies were evenly distributed between the oldest movie on the list and the newest.)
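For anyone who wants to check the "theoretical median" column, here is a minimal Python sketch. It assumes the definition in the note above: for an even spread of years between the oldest and newest movie, the median is just the midpoint of the two. The function and variable names are mine, not part of the original spreadsheet.

```python
# Rows copied from the table above:
# (list year, oldest release year, newest release year, actual median year)
snapshots = [
    (2008, 1920, 2008, 1975.0),
    (2009, 1921, 2009, 1977.0),
    (2010, 1921, 2010, 1981.5),
    (2011, 1921, 2011, 1983.5),
]


def theoretical_median(oldest, newest):
    """Median of release years spread evenly between oldest and newest (the midpoint)."""
    return (oldest + newest) / 2


for list_year, oldest, newest, actual in snapshots:
    theoretical = theoretical_median(oldest, newest)
    # A positive skew means the actual list leans newer than an even spread would.
    skew = actual - theoretical
    print(f"{list_year}: theoretical {theoretical}, actual {actual}, skew {skew:+.1f} years")
```

The skew column is an extra I added for illustration: it is simply the actual median minus the theoretical one.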
I thought it would be fun to make a rudimentary projection as to what would happen to this list based on the trend of the last 4 years. So I looked at how the gap between the list year and the median year changed: from 2008 to 2009, the gap shrank from 33 to 32; from 2009 to 2010, it shrank from 32 to 28.5; and from 2010 to 2011, it shrank from 28.5 to 27.5. Each change can also be stated as a reduction factor (e.g., 32/33 = .9697). Averaging these three factors produces a single average reduction factor of .9417. Apply that factor to future years, and you get the shocking prediction that the list’s median year will essentially equal the year of the list in about 70 years:
OK, calm down folks. You don’t have to be a statistician to see the problems with this approach. First, intuitively, it makes no sense. Can you imagine a top movies list in the year 2070 that has as many movies on it from all years prior to 2069 as it does for the years 2069 and 2070? Second, and more importantly, a sample size of 4 is way too small to make this sort of projection. (Also, my little trick of averaging reduction factors probably isn’t sound math, but it worked in that it produced the nice, albeit erroneous, graph you see above.)
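If you'd like to replay the fuzzy math yourself, here is a rough Python sketch of the reduction-factor projection described above. The gaps and the averaging step come straight from the article; the 0.5-year cutoff for "essentially equal" is my own assumption (a gap that shrinks by a constant factor never actually reaches zero), and the function name is made up for illustration.

```python
# Gap between list year and median year for 2008, 2009, 2010, 2011 (from the article)
gaps = [33.0, 32.0, 28.5, 27.5]

# Year-over-year reduction factors, e.g. 32 / 33 = 0.9697
factors = [later / earlier for earlier, later in zip(gaps, gaps[1:])]

# The "fuzzy math" step: a plain average of the three factors
avg_factor = sum(factors) / len(factors)


def years_until_gap_below(gap, factor, threshold=0.5):
    """Count how many yearly reductions it takes for the gap to fall under threshold.

    The threshold is an assumed stand-in for "essentially equal".
    """
    years = 0
    while gap >= threshold:
        gap *= factor
        years += 1
    return years


print(f"average reduction factor: {avg_factor:.4f}")
print(f"years until the gap drops below half a year: {years_until_gap_below(gaps[-1], avg_factor)}")
```

Changing the threshold moves the answer by a few years in either direction, which is one more reason not to take the projection seriously.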
About that sample size. Unfortunately, I’ve only been doing this for four years. Now, if only there were some way to go back in time and capture the status of the list from previous years. If only…
But wait! Such a thing exists. Some fortuitous Google searches led me to the “IMDB Top 250 History” website, which has snapshots of lists going all the way back to April 1996.
Jackpot! I took additional snapshots from the same time frame, added them to the analysis, and…
…was disappointed to find that the trend totally disappeared with the expanded dataset:
Year | Min | Max | Actual Median | Theoretical Median |
1996 | 1925 | 1996 | 1986 | 1960.5 |
1997 | 1927 | 1997 | 1986.5 | 1962 |
1998 | 1925 | 1998 | 1983 | 1961.5 |
1999 | 1922 | 1999 | 1975.5 | 1960.5 |
2000 | 1922 | 2000 | 1975.5 | 1961 |
2001 | 1922 | 2001 | 1978.5 | 1961.5 |
2002 | 1922 | 2002 | 1976 | 1962 |
2003 | 1922 | 2003 | 1975 | 1962.5 |
2004 | 1922 | 2004 | 1976 | 1963 |
2005 | 1920 | 2005 | 1978.5 | 1962.5 |
2006 | 1920 | 2006 | 1974.5 | 1963 |
2007 | 1920 | 2007 | 1976.5 | 1963.5 |
2008 | 1920 | 2008 | 1975 | 1964 |
2009 | 1921 | 2009 | 1977 | 1965 |
2010 | 1921 | 2010 | 1981.5 | 1965.5 |
2011 | 1921 | 2011 | 1983.5 | 1966 |
Turns out that in its earlier days, the IMDb Top 250 list was even more skewed towards newer movies than it is today, both in relative and absolute terms. And even with the larger sample size, the swings in median year make any sort of projection unfeasible, whether it’s with my fuzzy math method or a more formal linear regression analysis. And we haven’t even factored in IMDb’s changes and tweaks to its ranking algorithm over the years.
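For the curious, here is one way the "more formal" linear regression could be sketched in Python using the medians from the table above. This is not the spreadsheet analysis behind this article, just an ordinary least-squares fit written out by hand; the function name and the choice to report R-squared are my own.

```python
# List year -> median release year, copied from the 1996-2011 table above
medians = {
    1996: 1986.0, 1997: 1986.5, 1998: 1983.0, 1999: 1975.5,
    2000: 1975.5, 2001: 1978.5, 2002: 1976.0, 2003: 1975.0,
    2004: 1976.0, 2005: 1978.5, 2006: 1974.5, 2007: 1976.5,
    2008: 1975.0, 2009: 1977.0, 2010: 1981.5, 2011: 1983.5,
}


def least_squares(points):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Share of the variance in median year that the straight line accounts for
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = least_squares(sorted(medians.items()))
print(f"slope: {slope:+.3f} median years per list year")
print(f"R-squared: {r_squared:.3f}")
```

An R-squared near zero would mean the list year explains very little of the movement in the median, which is another way of saying that the swings defeat any straight-line projection.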
Sorry, folks, we can’t predict the future of the IMDb Top 250 Movies List through statistics. Or even make an educated guess. But we can still have a lot of fun analyzing the changes that have occurred to the list. Read on for more:
Here’s the master distribution graph for the 2011 list, and the ones for the previous three years for comparison. The solid lines in the graphs are all 4th order polynomial trend lines.
2011
2010
2009
2008
Now, for specific changes between the 2010 and 2011 lists:
New movies on the 2011 list
Title | Year | Rank | Rating |
Drive | 2011 | 114 | 8.2 |
Harry Potter and the Deathly Hallows: Part 2 | 2011 | 133 | 8.2 |
A Separation | 2011 | 207 | 8 |
Black Swan | 2010 | 113 | 8.2 |
The King’s Speech | 2010 | 125 | 8.2 |
The Social Network | 2010 | 220 | 8 |
Shutter Island | 2010 | 238 | 8 |
Elite Squad: The Enemy Within | 2010 | 246 | 8 |
Mary and Max | 2009 | 195 | 8 |
Ip Man | 2008 | 241 | 8 |
Spring, Summer, Fall, Winter… and Spring | 2003 | 247 | 8 |
A Beautiful Mind | 2001 | 242 | 8 |
The Celebration | 1998 | 233 | 8 |
Beauty and the Beast | 1991 | 249 | 8 |
Fanny and Alexander | 1982 | 210 | 8 |
Stalker | 1979 | 243 | 8 |
Persona | 1966 | 200 | 8 |
The Man Who Shot Liberty Valance | 1962 | 237 | 8 |
Tokyo Story | 1953 | 248 | 8 |
His Girl Friday | 1940 | 250 | 8 |
The Passion of Joan of Arc | 1928 | 212 | 8 |
Sherlock Jr. | 1924 | 221 | 8 |
Where are the recent summer movies? Many of the big winners of the 2010 Academy Awards have made their way onto the 2011 list, but none of the 2010 summer fan favorites, and only one from the 2011 summer season. Compare that to the 2010 summer hits that made it onto the 2010 list: Inception, Toy Story 3, How to Train Your Dragon, and Kick-Ass. Conclusion: the 2011 summer movie season was weak. But you didn’t need statistics or the IMDb list to tell you that.
Movies on the 2010 list that dropped off the 2011 list
Title | Year | Rank | Rating |
Kick-Ass | 2010 | 195 | 8 |
Letters from Iwo Jima | 2006 | 214 | 8 |
Little Miss Sunshine | 2006 | 244 | 7.9 |
Crash | 2004 | 225 | 8 |
Mulholland Dr. | 2001 | 247 | 7.9 |
Wo hu cang long | 2000 | 240 | 7.9 |
Toy Story 2 | 1999 | 229 | 8 |
The Nightmare Before Christmas | 1993 | 241 | 7.9 |
Edward Scissorhands | 1990 | 250 | 7.9 |
The Conversation | 1974 | 233 | 8 |
Planet of the Apes | 1968 | 230 | 8 |
Bonnie and Clyde | 1967 | 223 | 8 |
Spartacus | 1960 | 242 | 7.9 |
Anatomy of a Murder | 1959 | 236 | 8 |
The African Queen | 1951 | 221 | 8 |
Harvey | 1950 | 202 | 8 |
Brief Encounter | 1945 | 217 | 8 |
Arsenic and Old Lace | 1944 | 249 | 7.9 |
Shadow of a Doubt | 1943 | 215 | 8 |
The Adventures of Robin Hood | 1938 | 246 | 7.9 |
King Kong | 1933 | 208 | 8 |
Duck Soup | 1933 | 224 | 8 |
So much for Kick-Ass. Great movie, but…Top 250 great? (Duke it out in the comments.)
Not much else to say here, except to repeat the annual ritual of lamenting the cruelty of the IMDb algorithm that pushes out classic movies to make room for new “classics.” Notably, the original Planet of the Apes has left the list in the same year that a new Planet of the Apes movie received high critical and audience praise. Fellow gorilla-themed classic King Kong is also gone, meaning that these Damn Dirty Apes have gotten their paws off of the IMDb Top 250 List…for now.
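One quick way to see how this churn nudges the overall median is to compare the median release year of the movies that joined the list with that of the movies that left. This comparison isn't part of my original spreadsheet; it's a small Python sketch using only the years from the two tables above, and the variable names are mine.

```python
from statistics import median

# Release years of the movies new to the 2011 list (from the first table above)
added_years = [
    2011, 2011, 2011, 2010, 2010, 2010, 2010, 2010, 2009, 2008, 2003,
    2001, 1998, 1991, 1982, 1979, 1966, 1962, 1953, 1940, 1928, 1924,
]

# Release years of the movies that dropped off after 2010 (from the second table above)
dropped_years = [
    2010, 2006, 2006, 2004, 2001, 2000, 1999, 1993, 1990, 1974, 1968,
    1967, 1960, 1959, 1951, 1950, 1945, 1944, 1943, 1938, 1933, 1933,
]

# With an even number of entries, median() averages the two middle values,
# which is also why the list's overall median can land on a .5
print("median year of additions:", median(added_years))
print("median year of drop-offs:", median(dropped_years))
```

If the additions skew newer than the drop-offs, the list's overall median gets pulled forward, though the size of that pull depends on where each movie sits relative to the existing median.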
Conclusion
Looking back at the comments on these articles over the past few years, it’s clear that these lists and their content get people really worked up, in spite of all of the caveats about how little authority the IMDb list, or any ranking built on subjective grounds, can really carry.
So rather than try to tell people not to read into this too much, I’m going to give the opposite advice for those wishing to comment on this article: read into it all you want. If you think the data says something about the changing tastes of our times, the intensifying fanboy effect on IMDb rankings, the downfall of the art of film, or the downfall of Western Civilization, let it be known. If you think this is a pointless exercise and that I should be using Excel for more interesting pop culture analysis, let it be known.
Oh, and speaking of Excel, if you want to do your own analysis, you can either download my datasets for 1996-2011 or grab some for yourself at the IMDB Top 250 History site.