Leveraging AI in OTT and CTV Content Discovery
Today’s streaming landscape contains no shortage of content, but for many viewers, finding something to watch is harder than ever. This so-called “paradox of choice” leaves many people stuck in an endless scroll through a provider’s electronic programme guide (EPG), overwhelmed by an abundance of options. However, emerging AI technologies now have the power to harness user data in ways that can make content discovery a much more personally relevant experience.
While generative AI-based applications are removing much of the friction from content discovery, not all of the bugs have been ironed out of this advancing technology. For instance, ethical questions persist around taking user data to enhance content discovery.
In this article, several experts from key EU streaming vendors weigh in on both the benefits and drawbacks of leveraging AI in OTT and CTV content discovery. They also speculate on how it will impact the future of personalisation.
How AI Is Already Transforming the Way Users Discover Content
Maria Ingold, AI and innovation leader at the Media League, breaks down the ways that AI-based recommendation systems are evolving into specific categories and helping to rapidly transform how users discover content. “Netflix’s 2021 paper on the use of deep learning puts it most succinctly: ‘the value of a recommender system can be measured by the increase in member retention,’ ” she says. “This is because minimising search time and maximising watch time improve viewer satisfaction. AI-based recommender systems have evolved in three eras: traditional, deep learning-, and LLM-based. Traditional typically includes content-based, collaborative filtering, and hybrid approaches using machine learning.
Deep learning brought significant improvements and benefits from multimodal supplementary data but can be less interpretable. Research into LLM-based recommender systems addresses the issues in pretrained language model-based systems.”
Leading the way in AI and data-driven solutions, Glasgow-based ThinkAnalytics is transforming viewer insights, content discovery, and targeted advertising tactics. CTO and cofounder Peter Docherty highlights how the company utilises these emerging elements with its ThinkMediaAI product: “Our ThinkMediaAI platform goes beyond basic recommendations by using deep metadata, behavioural signals, and over 26 AI-driven strategies to surface content that truly resonates with the viewer. This means viewers spend less time searching and more time watching, while platforms see higher engagement and reduced churn. We have already added the ability to use generative AI and explainable recommendations to further enhance transparency and trust in the discovery process.”

ThinkAnalytics’ ThinkMediaAI product
Amsterdam-based Media Distillery uses its AI technology to help TV, IPTV, and cable operators; OTT platforms; and broadcasters create better viewing experiences by facilitating quicker access to pertinent information and optimised advertising that increases viewer engagement.
“AI is transforming content discovery, making it more intelligent, efficient, and tailored to individual users,” says Martin Prins, the company’s head of product. “Things that were extremely difficult or costly to do automatically a few years back, like automatically generating high-quality synopses, titles, [or] content tags, or translating metadata to other languages, are now relatively easy to do with technologies such as LLMs, which, in turn, can greatly enhance content discovery.”
The Most Critical Types of Data for Building Effective, AI-Driven Recommendation Engines
Industry experts say that high-quality metadata is key to building effective, AI-driven recommendation engines. How that data is harnessed and segmented is equally essential for streamlining the content discovery process.
The Berlin-based company Dataxis covers more than 50 markets, 200 nations, and over 4,000 participants, including Tier 2 and Tier 3 corporations, by tracking the TV, OTT, telecom, media, and digital industries on a quarterly basis. Ophélie Boucaud, Dataxis principal analyst, emphasises the critical importance of data quality. “The main challenge when building a strong recommendation engine is having quality data behind it,” she says. “Making sure that your content is well-tagged, with precise and clean data, is a key factor in deploying advanced discovery algorithms, and it’s the first challenge for content owners. Content tagging needs to be understood as a key aspect for the entire supply chain of content, as it can be leveraged way beyond recommendation in the gen AI era. It can improve content accessibility, enable producers to create new content thanks to tagged inputs, and help distributors promote older titles by bringing archived content within the scope of recommendation algorithms.”
Media Distillery’s Prins agrees that high-quality metadata is a key element for AI-driven recommendations, and he discusses how its platform optimises data. “High-quality, detailed metadata is essential, as recommendation systems are only as good as the quality of the data you feed them,” he says. “Traditional metadata often only covers entire programmes, missing the nuances of specific segments or topics, especially in multi-topic programmes. By analysing content inside our Deep Content Understanding Platform, we automatically generate rich, segment-level descriptions and tags that help viewers discover exactly the moments and topics they care about. We know exactly what part of a news or sports programme is about LeBron James making 50,000 career points or a discussion about Max Verstappen’s last F1 race performance in a popular talk show. And as such, viewers can finally also discover parts of programmes inside the video services that use our solutions. As such, our customers have better utilisation of their existing content catalogues.”

Media Distillery’s Topic Distillery
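Segment-level metadata of the kind Prins describes can be modelled as a simple nested structure. The sketch below is purely illustrative: the `Segment` and `Programme` classes, their field names, and the example data are hypothetical, not Media Distillery's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """One topical segment inside a longer programme (hypothetical schema)."""
    start_s: float                 # offset from programme start, in seconds
    end_s: float
    title: str
    tags: list = field(default_factory=list)

@dataclass
class Programme:
    programme_id: str
    title: str
    segments: list = field(default_factory=list)

    def find_segments(self, tag: str):
        """Return the segments whose tags include the query tag."""
        return [s for s in self.segments if tag in s.tags]

# A multi-topic talk show: viewers can jump straight to the part they want.
talk_show = Programme(
    "ep-102", "Late Sports Talk",
    segments=[
        Segment(0, 540, "Opening headlines", ["news"]),
        Segment(540, 1260, "F1 recap", ["motorsport", "Max Verstappen"]),
        Segment(1260, 2100, "NBA records", ["basketball", "LeBron James"]),
    ],
)

hits = talk_show.find_segments("LeBron James")
print([s.title for s in hits])  # ['NBA records']
```

With programme-level metadata only, the whole episode would match (or miss) a query; segment-level tags let discovery point at the exact moment.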
“We believe the most critical data types are deep content metadata, real-time behavioural signals, entitlement, and first-party viewer data,” says ThinkAnalytics’ Docherty. “Our AI-driven platform uses enriched metadata—including mood, theme, and narrative structure—alongside user actions like search, clicks, and watch history to build dynamic, personalised profiles. This fusion enables us to deliver highly relevant recommendations, solve the cold-start problem, and drive measurable engagement and retention.”
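The fusion Docherty describes, enriched metadata combined with behavioural signals, can be sketched minimally as a tag-frequency profile built from watch history. Everything here (the `CATALOGUE` data, the tag vocabulary, the overlap score) is an illustrative assumption, not ThinkAnalytics' actual method.

```python
from collections import Counter

# Catalogue items with enriched metadata (mood/theme tags); hypothetical data.
CATALOGUE = {
    "m1": {"mood": "tense", "theme": "heist"},
    "m2": {"mood": "tense", "theme": "spy"},
    "m3": {"mood": "light", "theme": "romance"},
    "m4": {"mood": "tense", "theme": "spy"},
}

def build_profile(watch_history):
    """Fuse a behavioural signal (watch history) with item metadata
    into a dynamic tag-frequency profile for the viewer."""
    profile = Counter()
    for item_id in watch_history:
        for tag in CATALOGUE[item_id].values():
            profile[tag] += 1
    return profile

def score(item_id, profile):
    """Score an unseen item by its tag overlap with the viewer's profile."""
    return sum(profile.get(tag, 0) for tag in CATALOGUE[item_id].values())

profile = build_profile(["m1", "m2"])   # viewer likes tense heist/spy films
unseen = [i for i in CATALOGUE if i not in ("m1", "m2")]
ranked = sorted(unseen, key=lambda i: score(i, profile), reverse=True)
print(ranked)  # ['m4', 'm3'] — the tense spy film outranks the light romance
```

A production system would weight signals (search, clicks, completion rate) differently and learn the representation rather than count tags, but the profile-then-score shape is the same.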
How Collaborative Filtering and Content-Based Filtering Differ
Recommendation systems filter data in two distinct ways: collaboratively and based on content. Each approach has different benefits.
Media League’s Ingold breaks down the basics of both. “Content-based filtering uses traditional machine learning to recommend items similar to those the user explicitly rated or implicitly chose in the past by creating profiles for both the item and user,” she explains. “Collaborative filtering recommends items similar users have liked via a user-item matrix and uses either memory-based or model-based approaches.”
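A toy, memory-based collaborative filter along the lines Ingold describes: build a user-item matrix, find a user's nearest neighbours by cosine similarity, and recommend what those neighbours liked. The viewers, titles, and ratings are invented for illustration.

```python
import math

# User-item matrix (1 = watched and liked, 0 = not); hypothetical data.
RATINGS = {
    "alice": {"Heat": 1, "Ronin": 1, "Amélie": 0},
    "bob":   {"Heat": 1, "Ronin": 1, "Amélie": 1},
    "carol": {"Heat": 0, "Ronin": 0, "Amélie": 1},
}

def cosine(u, v):
    """Cosine similarity between two rating vectors (dicts over the same titles)."""
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def collaborative_recs(user, k=1):
    """Memory-based collaborative filtering: recommend what the k most
    similar users liked that this user has not yet watched."""
    me = RATINGS[user]
    neighbours = sorted(
        (cosine(me, RATINGS[other]), other)
        for other in RATINGS if other != user
    )[-k:]                                   # top-k most similar users
    recs = set()
    for _, other in neighbours:
        recs |= {title for title, liked in RATINGS[other].items()
                 if liked and not me.get(title)}
    return recs

print(collaborative_recs("alice"))  # {'Amélie'} — bob is alice's nearest neighbour
```

Content-based filtering, by contrast, would ignore other users entirely and match item attributes (genre, mood, cast) against a profile built from alice's own history.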
Prins maintains that mixing filtering approaches is the best way to harness the benefits of each method. “The most effective streaming platforms do not pick one solution, but combine multiple approaches. In a video app, one row could feature all unseen movies with Tom Cruise, while a second row may show action-oriented movies without Tom Cruise that other viewers of Tom Cruise movies liked. A third row may show trending items (most watched) or [that are] tied to current affairs (everything related to the World Cup, Roland Garros, or the Academy Awards). There may also be curated rows handled by an editorial team.”
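Prins's mix of strategies can be sketched as independent "rail" builders whose outputs are merged into one home screen. The catalogue, rail names, and strategy functions below are hypothetical stand-ins, not any platform's real API.

```python
# Minimal catalogue with cast metadata and a popularity signal; invented data.
CATALOGUE = {
    "Top Gun":             {"cast": ["Tom Cruise"], "views": 900},
    "Mission: Impossible": {"cast": ["Tom Cruise"], "views": 800},
    "John Wick":           {"cast": ["Keanu Reeves"], "views": 700},
    "Gran Turismo":        {"cast": [], "views": 950},
}

def cast_rail(actor, seen):
    """Content-based rail: unseen titles featuring a favourite actor."""
    return [t for t, m in CATALOGUE.items()
            if actor in m["cast"] and t not in seen]

def trending_rail(n=2):
    """Popularity rail: the n most-watched titles overall."""
    return sorted(CATALOGUE, key=lambda t: CATALOGUE[t]["views"],
                  reverse=True)[:n]

def build_home_screen(seen):
    """Combine several strategies into named rails, as described above."""
    return {
        "Because you watch Tom Cruise": cast_rail("Tom Cruise", seen),
        "Trending now": trending_rail(),
        # An editorial rail would simply be a hand-curated list merged in here.
    }

screen = build_home_screen(seen={"Top Gun"})
print(screen["Because you watch Tom Cruise"])  # ['Mission: Impossible']
```

The point of the mixture is robustness: when one strategy has nothing fresh to say about a viewer, another rail still surfaces something worth watching.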
Docherty also sees collaborative filtering and content-based filtering as complementary techniques for an effective recommendation platform. “Collaborative filtering draws on the behaviour of similar users—for example, ‘People who watched X also watched Y.’ Content-based filtering, on the other hand, recommends content similar to what a user has already enjoyed using deep metadata like mood, theme, subject.”

Personalised recommendations for three different users on ThinkAnalytics’ platform
The ‘Cold-Start’ Conundrum
In content discovery, “cold start” refers to the difficulty recommendation systems encounter when they don’t have enough information about new users or fresh content to make precise and tailored suggestions.
Prins describes how Media Distillery works to reduce this issue through data analysis. “For new users, you could first of all highlight popular content,” he says. “Another approach we have built is offering trending topics: because we analyse thousands of TV channels in real time, we can detect what emerging topics are trending and show these to the user. For instance, related to emerging news or a sports event. For new content, we would analyse and tag it with deep metadata, so it can be recommended based on its attributes from Day 1, even if no metadata is available.”
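The two fallbacks Prins describes, popularity for new users and metadata attributes for new titles, can be sketched as follows. All data and function names are illustrative assumptions.

```python
# Hypothetical popularity counts and metadata tags for a small catalogue.
POPULARITY = {"Heat": 900, "Ronin": 700, "Amélie": 850}
ATTRIBUTES = {
    "Heat":   {"crime", "thriller"},
    "Ronin":  {"crime", "action"},
    "Amélie": {"romance", "comedy"},
}

def recs_for_new_user(n=2):
    """Cold-start user: no history yet, so fall back to most-watched titles."""
    return sorted(POPULARITY, key=POPULARITY.get, reverse=True)[:n]

def audience_for_new_title(tags, user_profiles):
    """Cold-start title: no interactions yet, so match its metadata tags
    against existing taste profiles so it can surface from Day 1."""
    return [user for user, liked_tags in user_profiles.items()
            if tags & liked_tags]

print(recs_for_new_user())  # ['Heat', 'Amélie']

profiles = {"alice": {"crime", "action"}, "carol": {"romance"}}
print(audience_for_new_title({"action", "spy"}, profiles))  # ['alice']
```

A trending-topics feed, as Media Distillery offers, is effectively the first fallback computed in real time over what is being broadcast rather than over historical views.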
Docherty also notes that AI can help solve the cold-start problem and unify discovery with targeted advertising for a seamless, monetised experience.
Ethical Concerns of Designing AI Algorithms to Leverage User Data for Personalised Content
The ways that AI uses vast amounts of personal data have raised ethical concerns, necessitating specific guidelines and approaches for vendors to ensure that privacy protections are upheld.
Dataxis’ Boucaud highlights the various complexities around this issue. “Video platforms and TV broadcasters more broadly have been complying with harsher privacy and data protection laws than digital-only platforms, as they fall under different regulations,” she says. “But beyond user protection considerations, other ethical challenges have arisen, notably around the IPs that LLMs are trained on. This is especially important for free-to-air and publicly funded media houses, whose content is often freely available and could be used by competitors to train their recommendation algorithms.”
“At Media Distillery, we prioritise privacy by focusing on content analysis rather than user data wherever possible,” Prins says. “For most of our products, we generate metadata and videos by analysing premium broadcast and on-demand content on behalf of our customers, without tracking individual user data, ensuring privacy is respected. First and foremost [we do that] by having a strong legal framework to protect user privacy and data, like the GDPR in Europe. We also explore privacy-friendly personalisation, such as enabling contextual ad placement based on content analysis rather than user data. Video services can still monetise their content with ads and, as an additional benefit, can offer new value to advertisers, who can now tie their advertisements based on programme context, such as location, setting, or brands seen in the video.”
ThinkAnalytics also has systems in place to protect user data. “We prioritise ethical AI by focusing on three core principles: privacy, transparency, and fairness,” Docherty says. “Our systems are designed to avoid storing personally identifiable information, relying instead on anonymised user IDs and behavioural signals. We ensure that data use is purpose-limited, access-controlled, and time-bound when necessary. We also believe in explainable AI—giving users clarity on why content is recommended—and regularly audit our algorithms to minimise bias and promote inclusive discovery experiences.”
Prins also shares how Media Distillery makes AI-driven recommendations more transparent or explainable to users. “In 2022, we started collaborating with our customer NLZIET to provide them with automatically generated chapter markers and chapter titles for certain TV programmes that help viewers navigate to topics of interest in the video player during catch-up, similar to the content chapters feature popularised by YouTube,” he says. “When the feature was introduced in their video apps, all users were informed that this new feature was powered by AI and could contain mistakes. Viewers had (and still have) the opportunity to give their feedback on the chapters. We advise and help all our customers to raise awareness and educate their viewers, especially when data is used in a customer-facing setting and in an automated way.”
Keeping Personalisation From Becoming an Echo Chamber
Ironically, AI-driven personalisation in content discovery can work almost too well, walling users off and cutting the flow of fresh content into their recommendations.
Boucaud explains the complicated nuances of highly optimised personalisation. “It is bound to hit a ceiling at a certain point, especially as all the video platforms we navigate on also have competing interests when it comes to pushing certain pieces of content to their viewers. They have to figure out the right balance between highlighting their original content, including promoted titles, and keeping viewers engaged by giving them rapid access to the most relevant titles for them. Specific title launches and marketing interests will keep most platforms from becoming a full echo chamber (outside of massive content libraries like YouTube).
“Navigation is an aspect of personalisation that remains fairly untouched at this point,” Boucaud adds. “Platforms now offer a mix of linear content and VOD titles, sometimes across multiple different apps and environments. Thus, we could imagine there would be incentives to automatically adjust the UX of the platform based on the type of formats that users watch the most (and not just based on the screen they watch it from), but most platforms still prefer to manage navigation editorially. It seems that AI hasn’t brought a satisfying solution to UX personalisation so far, and people remain cautious, preferring to keep humans in the loop to avoid bothering viewers.”
Boucaud is also keeping an eye on emergent approaches to content discovery that are more dynamic than relying on user data alone. “I’ve seen European broadcasters experimenting with ‘affinity modeling,’ ” she says. “As Broadcast Video on Demand [BVOD] platforms tend to focus their efforts on developing their presence on connected TV screens to keep their relevance in the living room and their status as generalist media, recommendations can’t solely be based on individual targeting and have to cater to several people behind the screen. Working around the ‘mood’ of content pieces is a path to explore. Layering subjective aspects of the content can substantially improve the quality of the discovery experience. Analysing what is inside the content itself can help further enrich contextual recommendations, for example, by integrating subtitles into the process.”
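One common way to cater to the "several people behind the screen" that Boucaud mentions is group recommendation. The sketch below uses the least-misery strategy, where a title's group score is the lowest score any household member gives it; this is a generic technique assumed here for illustration, not one Boucaud specifies, and the names and scores are invented.

```python
# Per-member affinity scores for candidate titles; hypothetical data.
member_scores = {
    "parent": {"Nature Doc": 0.9, "Action Film": 0.4, "Quiz Show": 0.7},
    "teen":   {"Nature Doc": 0.5, "Action Film": 0.9, "Quiz Show": 0.6},
}

def group_rank(scores):
    """Least-misery group ranking: order titles by the minimum score
    any member assigns, so nobody in the room is actively unhappy."""
    titles = next(iter(scores.values())).keys()
    return sorted(titles,
                  key=lambda t: min(member[t] for member in scores.values()),
                  reverse=True)

print(group_rank(member_scores))  # ['Quiz Show', 'Nature Doc', 'Action Film']
```

Averaging the scores instead would favour titles one member loves and another dislikes; least-misery trades peak enthusiasm for consensus, which suits the shared living-room screen.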
Features and Bugs of Gen AI and LLMs in Content Discovery
Gen AI—in particular, the utilisation of LLMs—is already playing a central role in the present and future of content discovery, although Ingold argues that there are still limitations with the use of LLMs. She outlines ways these limitations are being addressed.
“Deep learning architectures use task-specific data and domain knowledge, so fail to generalise well,” Ingold notes. “Furthermore, no one model solves all personalisation needs, so maintenance and innovation become complex. While there were moves into pretrained transformer models like BERT by 2021, these early models lacked the natural-language understanding to sufficiently capture user and item data. Videoland’s early research in 2023 into recommender systems using large language models increased visibility of items not normally recommended. However, it also introduced recommendations that did not exist on the platform.
“To address these issues,” Ingold explains, “in 2025, Netflix announced its development of a foundation recommendation model. LLMs can perform a variety of tasks and use data rather than feature-engineering, thereby reducing specialised model maintenance and increasing scalability. However, predictions require millisecond latency, which requires balancing volume of events with computational efficiency. Furthermore, the cold-start problem returns with the regular addition of new titles, which requires incremental training and use of metadata, not just interactions.” Ingold concludes, “While there are challenges to address, using a foundation recommendation model provides a data-centric scalable approach for the future of personalisation.”

Media Distillery’s Preview Distillery
Still, gen AI and LLMs already serve most vendors’ content discovery needs well. Prins is enthusiastic about the ways Media Distillery leverages gen AI across multiple products that its customers now use daily to improve content discovery in a fully automated manner.
“We use LLMs and multimodal language models to create video previews out of broadcast and on-demand content,” he says. “These spoiler-free clips help users quickly understand what a programme is about, making it easier to decide what to watch in our customers’ video apps. We believe that video previews will be an essential feature for many streaming services that want to improve content discovery and are looking to increase engagement. We also use LLMs for deep metadata generation, including programme descriptions, topic and ad tags, as well as chapter titles, which allows viewers to quickly navigate to their news topic of interest in a news broadcast or see their favourite sports team in a sports show. We also use LLM-powered semantic search to enable users to find content using natural language. In the future, we will also see generative AI touching other aspects of content discovery, for instance, content owners using generative AI to generate promotional assets and advertisements.”
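Semantic search of the kind Prins mentions typically embeds queries and programme synopses into a shared vector space and ranks titles by similarity. In this toy sketch the embedding model is skipped entirely: the vectors are hand-made stand-ins, and only the retrieval step is shown.

```python
import math

# Hand-made "embeddings" standing in for a real embedding model's output.
EMBEDDINGS = {
    "F1 season review":    [0.9, 0.2, 0.0],
    "Motorsport legends":  [0.8, 0.3, 0.1],
    "Baking championship": [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query_vec, n=2):
    """Rank titles by similarity between the query and title embeddings.
    In production, query_vec would come from embedding a natural-language
    query such as 'shows about racing'."""
    return sorted(EMBEDDINGS,
                  key=lambda t: cosine(query_vec, EMBEDDINGS[t]),
                  reverse=True)[:n]

# A query vector that (by construction) sits near the motorsport titles.
print(semantic_search([0.85, 0.2, 0.05]))
```

Because matching happens in vector space rather than on keywords, a query phrased as "racing" can still retrieve a title whose synopsis says "Formula 1", which is what makes natural-language discovery work.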