This is the best summary I could come up with:
Twelve Labs, a San Francisco-based startup, trains AI models to, in co-founder and CEO Jae Lee's words, “solve complex video-language alignment problems.”
Lee says that Twelve Labs’ technology can drive things like ad insertion and content moderation — for instance, figuring out which videos showing knives are violent versus instructional.
It can also be used for media analytics, Lee added, and to automatically generate highlight reels — or blog post headlines and tags — from videos.
Beyond MUM, its multimodal search model, Google, along with Microsoft and Amazon, offers API-level, AI-powered services that recognize objects, places and actions in videos and extract rich metadata at the frame level.
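For a sense of what those API-level services look like in practice, here is a minimal sketch of a label-detection call against Google's Video Intelligence API; the bucket path is a placeholder, and it assumes the google-cloud-videointelligence client library and GCP credentials are set up:

```python
# Illustrative sketch: ask the Video Intelligence API to label the objects,
# places and actions it recognizes in a video stored in Cloud Storage.
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.LABEL_DETECTION],
        "input_uri": "gs://your-bucket/your-video.mp4",  # placeholder path
    }
)
result = operation.result(timeout=300)  # annotation runs asynchronously

# Print each label the service detected across the video's segments.
for label in result.annotation_results[0].segment_label_annotations:
    print(label.entity.description)
```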
Since Twelve Labs launched in private beta in early May, its user base has grown to 17,000 developers, Lee claims.
“It’s fuel for ongoing innovation, based on our lab’s research, in the field of video understanding so that we can continue to bring the most powerful models to customers, whatever their use cases may be … We’re moving the industry forward in ways that free companies up to do incredible things,” Lee says.
The original article contains 731 words, the summary contains 178 words. Saved 76%. I'm a bot and I'm open source!