May 21, 2018

Analyzing my movie preferences

Knowing that IMDb offers a CSV export of my movie ratings history, I decided to build a tool that analyzes this export to see whether any patterns emerge from my movie watching.

That tool is cinestat and my ratings are analyzed here.

The first stat that cinestat provides is how much data the dataset contains and the total duration of the movies watched. It was interesting to see that I haven’t even spent half a year of my life on movies, but we’re not counting TV shows here.
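To make the total-duration stat concrete, here is a minimal sketch of how such a number could be computed from a ratings CSV. It is not cinestat's actual code: the column name "Runtime (mins)" and the sample rows are assumptions made for illustration, so check them against your own export.

```python
import csv
import io


def total_runtime_minutes(csv_text, runtime_column="Runtime (mins)"):
    """Sum the runtime column, skipping rows where it is missing or not a number.

    The default column name is an assumption about the export's header.
    """
    total = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(runtime_column) or "").strip()
        if value.isdigit():
            total += int(value)
    return total


def minutes_to_days(minutes):
    """Convert a number of minutes to (fractional) days."""
    return minutes / (60 * 24)


# Hypothetical sample data, not taken from a real export.
sample = """Title,Runtime (mins)
Movie A,120
Movie B,95
Movie C,
"""

minutes = total_runtime_minutes(sample)
print(minutes, "minutes is about", round(minutes_to_days(minutes), 2), "days")
```

Rows without a runtime are skipped rather than counted as zero-length movies, so the total is a lower bound whenever the export has gaps.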

Next, it appears I tend to rate movies higher than the average IMDb user.
That is interesting, since I consider myself pretty picky and critical, but apparently not as much as I thought.

Speaking of the rating difference, cinestat makes it easy to see which movies I found overrated or underrated, that is, where the difference between my rating and the average IMDb rating was strongly negative or strongly positive.
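The overrated/underrated idea boils down to subtracting the average IMDb rating from my own rating and sorting by the result. The sketch below shows one way to do that; it is an illustration, not cinestat's implementation. The dictionary keys and the two "Hypothetical" entries are made up, while the Transylmania numbers are the ones mentioned in this post.

```python
def rating_differences(movies):
    """Return (title, my_rating - imdb_rating) pairs, most overrated first.

    A strongly negative difference means I rated the movie far below the
    IMDb average (overrated); a strongly positive one means underrated.
    """
    diffs = [(m["title"], round(m["mine"] - m["imdb"], 1)) for m in movies]
    return sorted(diffs, key=lambda pair: pair[1])


# The "Hypothetical" rows are invented for illustration.
movies = [
    {"title": "Transylmania", "mine": 9, "imdb": 3.9},
    {"title": "Hypothetical Movie A", "mine": 3, "imdb": 7.0},
    {"title": "Hypothetical Movie B", "mine": 7, "imdb": 7.2},
]

ranked = rating_differences(movies)
print("Most overrated:", ranked[0])
print("Most underrated:", ranked[-1])
```

The head of the sorted list is the "overrated" end and the tail is the "underrated" end, so both widgets can be read off the same list.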

Quite interestingly, I found Underworld to be the most overrated movie that I watched. I don’t know if it’s me being Romanian and feeling insulted by its representation of vampires, but I apparently really didn’t like that movie. Looking at the rating I gave, I think I might have been too harsh on it and will need to rewatch it with fresh eyes.

The rest of the list doesn’t offer any large surprises other than Mrs. Doubtfire. I do like Robin Williams quite a lot, but I did find that one too over the top. Another potential surprise is Tombstone, which is apparently pretty well liked in the IMDb community, but I found it rather dull - sorry! I do like modern westerns quite a lot, I promise.

On a more positive note, let’s talk about the movies I found most underrated. First place on the list goes to Transylmania, where my personal bias of being Romanian and caring about the vampire myth is most obvious. The average IMDb rating is 3.9, I rated it a 9 and it’s ok to sue me for that if you want. But there’s just no way I’m not going to enjoy a comedy which claims Romanians use blue jeans as currency while being shot in Corvin Castle.

A special shout-out on this list goes to the Spanish thriller Secuestrados, which is one of the most tense thrillers I’ve seen and is also technically impeccable, with some amazing editing throughout the film.

Longest movies watched is a rather simple widget on cinestat and is probably mostly irrelevant. The thing is, most of the longer movies are pretty popular and had a pretty high budget. It’s very unusual for a movie that reaches 3 hours in length to have a small budget, so longer movies tend to come with higher budgets.

The list of the longest movies I watched is full of popular titles, which seem to fall mostly into 2 categories:

  • Biography / History: Schindler’s List, Titanic, Pearl Harbor, The Wolf of Wall Street
  • Fantasy: The Lord of the Rings: The Two Towers, The Lord of the Rings: The Return of the King, The Hateful Eight, The Green Mile

There’s also Grindhouse which should, in my humble opinion, be considered as 2 separate titles.

Last, but not least, the list of directors of the movies I watched doesn’t provide too many surprises. Spielberg tops the list, with Tarantino, Nolan and Scorsese coming in 2nd, 3rd and 4th. The rest of the top 10 is filled with very popular names, with the only surprise being Renny Harlin, who I had shamefully not even heard of before building cinestat.

The year a movie was released does tend to weigh quite a bit in my choice of what to watch. I do appreciate older flicks and their influence on the evolution of the film industry, but I feel few of them have aged well. My biggest issue with older films is the sound mixing, which usually feels pretty bad and unimmersive.

Being born in 1990, I lean towards watching films made within my lifetime, as can be seen in the following chart generated through cinestat.

The number of movies watched each year really depends on having rated a movie immediately after watching it, which I unfortunately only started doing in mid-2007. The spike of movies “watched” in 2007 is caused by my decision in March of that year to start religiously keeping track of what I watch. I spent a couple of days that month going through IMDb and rating movies that I knew for sure I had previously seen.

The inaccuracy of the ratings given in that particular month is pretty annoying, as it can skew the data considerably. That’s because of the abnormally large number of movies marked as watched in that month, but also because I didn’t rate those movies right after seeing them. This is therefore both a quantity and a quality problem that can’t be fixed.

It’s also worth noting that there are a couple of movies that I’ve seen twice but whose original rating I never updated.

The ratings I give are usually on the optimistic side, with the bell curve peaking at a rating of 8. It’s likely the overall IMDb average rating of films is lower, but I’m very picky about what I watch, so it feels pretty normal for the bell curve to sit in the higher range of the rating scale.

My favorite insight that I could determine from the ratings export that IMDb provides is what the best duration of a movie is.

I noticed that most of the movies I watch are between 90 and 110 minutes long, which seems to be a pretty typical duration. However, the ratings trend upward: the longer a movie is, the more likely I am to have rated it highly.

It appears the sweet spot for a movie to be to my liking is between 160 and 169 minutes (2h40m - 2h50m). Anything longer and I become impatient.
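One way to arrive at a range like 160 to 169 minutes is to group movies into 10-minute runtime buckets and compare the average rating of each bucket. The sketch below illustrates that approach under my own assumptions: the bucket size, the function names and the sample (runtime, rating) pairs are all hypothetical, and this is not necessarily how cinestat does it.

```python
from collections import defaultdict


def average_rating_by_duration(movies, bucket_size=10):
    """Group (runtime_minutes, rating) pairs into runtime buckets.

    Returns a dict mapping each bucket's starting minute to the average
    rating of the movies that fall into it.
    """
    buckets = defaultdict(list)
    for runtime, rating in movies:
        start = (runtime // bucket_size) * bucket_size
        buckets[start].append(rating)
    return {start: sum(r) / len(r) for start, r in sorted(buckets.items())}


def best_bucket(averages, bucket_size=10):
    """Return the (first_minute, last_minute) range with the highest average."""
    start = max(averages, key=averages.get)
    return start, start + bucket_size - 1


# Hypothetical (runtime, rating) pairs, invented for illustration.
movies = [(95, 7), (98, 8), (104, 7), (162, 9), (168, 9), (181, 8)]

averages = average_rating_by_duration(movies)
print(averages)
print("Best range:", best_bucket(averages))
```

Buckets that contain only a handful of movies can produce a misleadingly high or low average, so a result like this is best read next to the number of movies in each bucket.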

cinestat is free to use, open source and waiting to be used!

Looking forward to hearing your thoughts, feedback and comments.

© Victor Avasiloaei 2018