How The Boxoffice Company Recommends Movies and TV Shows to 14 Million Fans on AlloCiné

The Boxoffice Company
5 min read · Oct 22, 2021

At AlloCiné, one of The Boxoffice Company’s media sites and the biggest movie website in France, we offer movie and TV recommendations to our 14 million monthly visitors. We refer to these recommendations as Affinity Scores, which are generated by an algorithm that estimates how much someone will like a particular movie or TV show.

For example: how likely is each movie fan to want to see Dune?

Sounds simple enough, but a lot goes into making such an algorithm possible. To calculate an affinity score, we used what data scientists call collaborative filtering, a technique that groups people together based on the ratings they provide. Simply put, the affinity score displayed to an AlloCiné user for a given movie or show is calculated from the ratings of people who share similar tastes and preferences.
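To make the idea concrete, here is a minimal sketch of one classic form of collaborative filtering: predicting a score as a similarity-weighted average of other people's ratings. It is an illustration only, not AlloCiné's production code. The toy users, the rating values, and the function names `similarity` and `affinity` are all assumptions made for this example.

```python
from math import sqrt

# Hypothetical toy data: user -> {title: rating}. Not real AlloCiné ratings.
ratings = {
    "ana":   {"Dune": 5.0, "Arrival": 4.5, "Titane": 2.0},
    "bruno": {"Dune": 4.5, "Arrival": 5.0, "Amélie": 3.0},
    "chloe": {"Titane": 4.5, "Amélie": 4.0, "Arrival": 2.0},
    "david": {"Arrival": 4.5, "Titane": 1.5},
}

def similarity(a, b):
    """Cosine similarity over the titles both users rated (0.0 if none)."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][t] * ratings[b][t] for t in shared)
    norm_a = sqrt(sum(ratings[a][t] ** 2 for t in shared))
    norm_b = sqrt(sum(ratings[b][t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)

def affinity(user, title):
    """Similarity-weighted average of other users' ratings for `title`.

    Returns None when nobody comparable has rated the title.
    """
    weighted, total = 0.0, 0.0
    for other, their_ratings in ratings.items():
        if other == user or title not in their_ratings:
            continue
        weight = similarity(user, other)
        weighted += weight * their_ratings[title]
        total += weight
    return weighted / total if total else None

# David has not rated Dune; the estimate leans on Ana and Bruno,
# whose ratings of the titles they share with David look similar.
print(affinity("david", "Dune"))
```

In this toy data, David's estimate for Dune lands between the two ratings Dune actually received, because both of the people who rated it resemble him. A real system would typically also correct for each person's rating habits (some people rate everything high), which this sketch leaves out.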

With AlloCiné’s vast reserve of data, we wanted to account for a wide range of tastes, make very personalized recommendations, and propose niche content we thought our users might like. Our algorithm, which is based on reviews of more than 12,000 movies and shows, turned out to be very effective.

Why did we choose collaborative filtering?

Consider the visualization below.

If we were to create an affinity matrix in which every row represented an individual person and every column a movie or a show, we’d end up with billions of cells. 99% of them, however, would be empty, because each person has rated only a tiny fraction of the catalogue. To account for this, we’d have to fill in all those empty cells using the small amount of data we already had (in our case, ratings provided by our users), which is exactly what we did.

But this created two challenges: first, how to manage thousands of variables in order to calculate each user’s affinity for every single movie or show; and second, how to limit the time it would take to run these calculations. We couldn’t store billions of pieces of information, or else it would take longer than 30 seconds to load a single recommendation, which is longer than most people are willing to wait.

Our collaborative filtering algorithm allowed us to determine someone’s taste with a smaller, more manageable number of characteristics, a technique known as dimensionality reduction.
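One common way to get this kind of dimensionality reduction is matrix factorization: each person and each title is summarized by a short vector of learned characteristics, and a predicted rating is the dot product of the two. The sketch below assumes that approach and trains it with plain stochastic gradient descent. The article does not say which method was actually used, so the function names, the hyperparameters, and the toy ratings here are all hypothetical.

```python
import random
from math import sqrt

# Hypothetical (user, title, rating) triples. Not real AlloCiné data.
TOY_RATINGS = [
    ("ana", "Dune", 5.0), ("ana", "Arrival", 4.5), ("ana", "Titane", 2.0),
    ("bruno", "Dune", 4.5), ("bruno", "Arrival", 5.0), ("bruno", "Amélie", 3.0),
    ("chloe", "Titane", 4.5), ("chloe", "Amélie", 4.0), ("chloe", "Arrival", 2.0),
    ("david", "Arrival", 4.5), ("david", "Titane", 1.5),
]

def factorize(triples, n_factors=2, epochs=300, lr=0.02, reg=0.02, seed=0):
    """Learn `n_factors` numbers per user and per title from known ratings."""
    rng = random.Random(seed)
    mean = sum(r for _, _, r in triples) / len(triples)
    user_vecs = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)]
                 for u, _, _ in triples}
    item_vecs = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)]
                 for _, i, _ in triples}
    for _ in range(epochs):
        for u, i, r in triples:
            p, q = user_vecs[u], item_vecs[i]
            err = r - (mean + sum(a * b for a, b in zip(p, q)))
            for k in range(n_factors):
                pk, qk = p[k], q[k]
                # Gradient step with a small L2 penalty on both vectors.
                p[k] += lr * (err * qk - reg * pk)
                q[k] += lr * (err * pk - reg * qk)
    return mean, user_vecs, item_vecs

def predict(model, user, title):
    """Predicted rating: global mean plus the user/title dot product."""
    mean, user_vecs, item_vecs = model
    return mean + sum(a * b for a, b in zip(user_vecs[user], item_vecs[title]))

def training_rmse(model, triples):
    """Root-mean-square error of the model on the given ratings."""
    return sqrt(sum((predict(model, u, i) - r) ** 2 for u, i, r in triples)
                / len(triples))

model = factorize(TOY_RATINGS)
# David never rated Dune; the model fills in that empty cell.
print(predict(model, "david", "Dune"))
```

The point of the design is storage and speed: instead of keeping one cell per person and title, the model keeps only a short vector for each person and each title, and any cell can be recomputed on demand from two of those vectors.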

Over the course of the algorithm’s development, we evaluated its performance against preexisting ratings. These ratings not only fed the model (in other words, the piece of code trained to recognize the patterns required to make predictions), but also helped us tune the algorithm itself. Movie and TV ratings are very subjective, however, and most tend to be positive, making it difficult to determine which movies and shows are not to someone’s liking. Biases like these had the potential to hurt the algorithm’s accuracy. To mitigate this, we reached out to a group of people who were very active on AlloCiné and could provide us with solid feedback.
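Evaluating against preexisting ratings usually means hiding some known ratings, predicting them, and measuring the error. The sketch below shows that general pattern with a holdout split and root-mean-square error (RMSE). The metric, the 20% split, and the helper names `split_ratings` and `rmse` are assumptions for illustration; the article does not specify how the evaluation was run.

```python
import random
from math import sqrt

def split_ratings(triples, test_fraction=0.2, seed=0):
    """Shuffle (user, title, rating) triples and hold some out for testing."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def rmse(predict, triples):
    """Root-mean-square error of `predict(user, title)` on known ratings."""
    squared = sum((predict(u, t) - r) ** 2 for u, t, r in triples)
    return sqrt(squared / len(triples))

# Hypothetical ratings, skewed positive like the ratings described above.
known = [("u%d" % n, "t%d" % (n % 3), 4.0 + 0.5 * (n % 2)) for n in range(10)]
train, test = split_ratings(known)

# A deliberately naive baseline: always predict the training average.
train_mean = sum(r for _, _, r in train) / len(train)
print(rmse(lambda user, title: train_mean, test))
```

A baseline like this is also a useful warning sign for the positivity problem: when most ratings are high, always predicting the average already scores a low error, so a model has to beat that baseline to show it has learned anything about dislikes.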

Turning our focus to this group of active users allowed us to prioritize quality over quantity. After gathering viewer feedback and running myriad tests to optimize our functions and costs, we set the results side by side to determine the minimum number of ratings necessary for a viewer to see affinity scores.

We decided to have our users rate at least 20 movies and shows so we could get a sense of what they liked. Of course, at such a low number, the algorithm isn’t completely accurate, but as people provide more ratings, the algorithm becomes more precise. Having settled the required minimum of user ratings, we were then left with another question: How many ratings must a movie or series have received before its affinity score can be displayed?

Our studies showed a tendency for ratings to stabilize after a movie or series had received at least one hundred ratings. At that point, a new rating would have a negligible impact on its score. So, we decided to display affinity scores solely for movies and series that had received at least one hundred ratings.
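The two thresholds translate into a very small display rule. The sketch below encodes them and adds a back-of-the-envelope check of why one hundred ratings is a plausible stabilization point. It assumes, purely for illustration, that a title's score behaves like a plain average; the function names and the example values are hypothetical.

```python
MIN_USER_RATINGS = 20     # threshold stated in the article
MIN_TITLE_RATINGS = 100   # threshold stated in the article

def can_show_affinity(user_rating_count, title_rating_count):
    """True when both the user and the title have enough ratings."""
    return (user_rating_count >= MIN_USER_RATINGS
            and title_rating_count >= MIN_TITLE_RATINGS)

def mean_shift(n_ratings, current_mean, new_rating):
    """How far one extra rating moves a plain average of `n_ratings` ratings."""
    return (new_rating - current_mean) / (n_ratings + 1)

print(can_show_affinity(25, 340))   # enough ratings on both sides
print(can_show_affinity(25, 40))    # title not rated often enough yet
print(mean_shift(100, 3.0, 5.0))    # one more rating on top of 100
print(mean_shift(5, 3.0, 5.0))      # the same rating on top of only 5
```

Under the plain-average assumption, a single rating two points above the mean moves a 100-rating average by about 0.02, but moves a 5-rating average by about 0.33, which matches the stabilization described above.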

How did we develop the algorithm?

We weren’t looking to reinvent the wheel, so we used an open-source framework that is well known in the data community: Apache Spark.

We also used Google Cloud Platform to prototype it. Cloud services like Google’s are great for speeding up the development of Big Data products like our Affinity Score. But once we entered the production phase (i.e. began working on the final algorithm), we didn’t rely entirely on Google Cloud, since hosting it on our own proved to be cheaper. A key aspect of the project was to take advantage of cloud services to speed up the process while controlling our costs.

Our last challenge was to decide how often we wanted to refresh our algorithm, since our collaborative filtering model couldn’t be updated in real time. We settled on collecting new ratings from our users in real time while refreshing the algorithm only every 10 minutes. Despite this slight delay, once an AlloCiné user has rated at least 20 movies or shows, they’ll receive recommendation scores for any content that has received at least one hundred ratings.
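The collect-continuously, refresh-periodically pattern can be sketched in a few lines. This is a simplified, single-process illustration, not a description of the actual AlloCiné pipeline: the `RatingBuffer` class, its `train` callback, and the injectable `clock` are hypothetical names introduced for this example.

```python
import time

REFRESH_SECONDS = 10 * 60  # refresh interval stated in the article

class RatingBuffer:
    """Accepts ratings immediately, retrains only when the interval has passed."""

    def __init__(self, train, clock=time.monotonic):
        self._train = train          # function: list of ratings -> model
        self._clock = clock          # injectable so the timing can be tested
        self._ratings = []
        self._last_refresh = clock()
        self.model = None            # no model until the first refresh

    def add(self, user, title, rating):
        """Store a new rating right away, then refresh if it is due."""
        self._ratings.append((user, title, rating))
        self.maybe_refresh()

    def maybe_refresh(self):
        """Retrain on everything collected so far if 10 minutes have elapsed."""
        now = self._clock()
        if now - self._last_refresh < REFRESH_SECONDS:
            return False
        self.model = self._train(list(self._ratings))
        self._last_refresh = now
        return True

# Stand-in "training" function that just counts ratings.
buffer = RatingBuffer(train=len)
buffer.add("ana", "Dune", 5.0)
print(buffer.model)  # still None: the first refresh is not due yet
```

The trade-off is the one described above: a rating is never lost, but it can take up to one refresh interval before it influences the scores people see.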

Where we’re headed

We’re now working to design new personalized functionalities that, with recommendation algorithms and filtering systems, will make navigating the movie catalogue a lot simpler. In the future, we hope to offer AlloCiné users more detailed insights for the recommendations they receive as well as improved review displays as we gather more data.

Many thanks to everyone who made this possible 👏, particularly Marylise Oger, Valentin Strach, Rui Teixeira, Sebastien Caumes, and Mohamed Belmaaza.

Subscribe to this channel to keep an eye out for future thought pieces by other members of the team at The Boxoffice Company.
