For years, web search engines have made it easy to find content of interest. Unfortunately, the search functionality of television and video service providers rarely lives up to the same standard.
Due to a lack of metadata and user data, and the more difficult modality of video, search in most video services goes no further than matching the title and description. This yields program-level results rather than the specific part of the content a viewer is after. With the rise of short-form video services and ever-higher viewer expectations, separating the relevant content from the rest has become essential: viewers don't want to watch an entire program when they are looking for a specific topic, so finding where a topic starts and how long it runs is key. Doing this manually is time-consuming, and since the format of the content varies from program to program, a flexible segmentation algorithm is vital.
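To make the idea concrete, here is a minimal sketch of one common family of segmentation approaches: cutting a transcript wherever the embedding similarity between adjacent sentences drops. The model name and threshold below are illustrative assumptions, not the method presented in the talk.

```python
# Sketch: embedding-based topic segmentation of a transcript.
# Model choice and threshold are illustrative assumptions.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

def segment(sentences, threshold=0.55):
    """Split a list of transcript sentences into topical segments by
    cutting wherever the cosine similarity between adjacent sentences
    drops below `threshold` (a hypothetical value; tune per catalogue)."""
    embeddings = model.encode(sentences, normalize_embeddings=True)
    segments, current = [], [sentences[0]]
    for prev, nxt, sent in zip(embeddings, embeddings[1:], sentences[1:]):
        if float(np.dot(prev, nxt)) < threshold:  # likely topic boundary
            segments.append(current)
            current = []
        current.append(sent)
    segments.append(current)
    return segments
```

In practice, the window of comparison, the threshold, and the embedding model all need tuning to the catalogue at hand, which is exactly why a flexible segmentation algorithm matters.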
Our Machine Learning Engineer, Bram Zijlstra, presented 'The right place and the right time: combining chaptering and semantic search for a novel short form UX', explaining how we built a search service that adapts to the content of any video service.
By combining speech recognition, text segmentation, topic modeling, face recognition, and text detection, we showed how to extract rich metadata from raw audio and video. We also explained how we implemented a custom Elasticsearch plugin and a semantic search service that finds not only the right video for a search query but also the exact moment where a topic starts and ends, so that viewers can jump straight to the part they're interested in.
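As a rough illustration of the retrieval side: the talk describes a custom Elasticsearch plugin, but stock dense-vector kNN search in Elasticsearch can sketch the same idea of indexing timed segments and returning the exact timestamps to jump to. The index name, mapping, and embedding model below are assumptions for the sake of the example.

```python
# Sketch: index transcript segments with timestamps and embeddings,
# then retrieve the exact moments matching a query. This uses stock
# Elasticsearch dense_vector kNN search as a stand-in for the custom
# plugin described in the talk; names and mapping are assumptions.
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer

es = Elasticsearch("http://localhost:9200")
model = SentenceTransformer("all-MiniLM-L6-v2")

es.indices.create(index="segments", mappings={
    "properties": {
        "video_id":  {"type": "keyword"},
        "start_sec": {"type": "float"},
        "end_sec":   {"type": "float"},
        "text":      {"type": "text"},
        "embedding": {"type": "dense_vector", "dims": 384,
                      "index": True, "similarity": "cosine"},
    }
})

def index_segment(video_id, start_sec, end_sec, text):
    # One document per topical segment, not per program.
    es.index(index="segments", document={
        "video_id": video_id, "start_sec": start_sec,
        "end_sec": end_sec, "text": text,
        "embedding": model.encode(text).tolist(),
    })

def search(query, k=5):
    # Return the k segments closest to the query embedding, each with
    # the timestamps needed to jump straight to the topic.
    hits = es.search(index="segments", knn={
        "field": "embedding",
        "query_vector": model.encode(query).tolist(),
        "k": k, "num_candidates": 100,
    })["hits"]["hits"]
    return [(h["_source"]["video_id"], h["_source"]["start_sec"],
             h["_source"]["end_sec"]) for h in hits]
```

Because each document carries start and end timestamps, a hit maps directly to a seekable position in the player rather than to a whole program.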
Did you miss our session? Contact us to learn more about the subject!
April 9, 2022