How to make AI Startup worth over $30M | Twelve Labs Jae Lee

6 Mar 2024 · 12:34

Summary

TLDR: Jae, the co-founder and CEO of Twelve Labs, shares the inspiring journey of building an AI research and product company focused on developing video foundation models. He narrates how the team overcame challenges, participated in a prestigious competition, and secured partnerships with industry giants like Nvidia. Twelve Labs aims to empower developers and enterprises with cutting-edge AI models that can deeply understand video content, enabling applications like semantic search, classification, and summarization. With a 'video-first' ethos, the company tackles the massive problem of making sense of the vast video data in the world, impacting sectors like law enforcement and media.

Takeaways

😄 Twelve Labs is an AI research and product company building video foundation models that can understand videos like humans, serving developers and enterprises via APIs.
🔍 Their models aim to map human language to video content, enabling capabilities like semantic search, classification, and summarization of videos.
🏆 Participating in and winning a major video understanding competition helped Twelve Labs gain exposure and attract customers and investors.
💻 The founders started the company while still in the military, working on laptops at a bagel shop during their free time.
🌟 The company's 'secret sauce' is going head-on with the video understanding problem instead of reframing it as language or image understanding.
📈 Twelve Labs has over 20,000 developers actively using their search API, with millions of monthly API calls and rapid growth in enterprise adoption.
🎯 Setting an incredibly ambitious goal and having the determination to solve hard problems is crucial for founders to achieve impactful outcomes.
🛡️ Building a moat by gathering unique data and fine-tuning smaller models, instead of relying solely on foundation models, is important for long-term success.
📢 Effective communication and the ability to explain technical products to non-technical audiences are vital for widespread adoption and impact.
🔭 The founders believe AI will present amazing opportunities for tech and humanity, and all products will be impacted by AI in the future.

Q & A

What is the main goal of Twelve Labs?
-Twelve Labs aims to build massive AI models that can understand videos like humans and provide video understanding capabilities to developers and enterprises through APIs for tasks like semantic search, classification, and summarization.
How did Twelve Labs start, and what was the founding story?
-Twelve Labs was started by three co-founders who were serving in the Korean Cyber Command. They met on weekends to work on their ideas until they were all discharged. Coordinating their efforts while still in the military made the early days challenging.
What was the significance of Twelve Labs participating in the ICCV competition?
-Participating in the ICCV (International Conference on Computer Vision) competition for video understanding helped Twelve Labs gain exposure and recognition from potential customers and investors interested in multimodal AI and video understanding.
How does the Twelve Labs AI model work, and what data is used for training?
-The Twelve Labs model aims to map human language to video content, enabling capabilities like search, classification, and summarization. It is trained on large amounts of video data, with the help of data partners who provide labeled data and licensed content in a copyright-friendly manner.
What is the current status of Twelve Labs' product adoption and usage?
-As of June 2023, Twelve Labs had soft-launched its search API, which was actively used by over 20,000 developers and had crossed a couple of million monthly API calls. The company was also seeing adoption from enterprise customers, including large creators, media/entertainment organizations, sports organizations, and law enforcement agencies.
What advice does the CEO of Twelve Labs give to founders and technical product builders?
-The CEO advises founders to be patient and able to explain their technology and its impact to different audiences. He also emphasizes the importance of building a moat and not relying too heavily on foundation models, as well as setting ambitious goals and having the determination to solve incredibly hard problems.
What was the approach taken by Twelve Labs in building their video understanding technology?
-Twelve Labs took a "video-first" approach, building their machine learning pipeline and systems specifically for handling videos from the ground up, instead of reframing the problem into other domains like language or image understanding.
How did the partnership with NVIDIA come about for Twelve Labs?
-Jensen Huang, the CEO of NVIDIA, seemed to have a special interest in computer vision and video understanding, which was one of the first use cases for NVIDIA chips. NVIDIA's venture team reached out to Twelve Labs, seeing a perfect match between their vision and what Twelve Labs was doing in video understanding.
What are some of the mission-critical use cases mentioned for Twelve Labs' products?
-One use case mentioned is digital evidence management for police departments, where Twelve Labs' technology can help search for specific evidence in body cam footage quickly and generate police reports more efficiently, reducing time spent by up to 40%.
What is the CEO's perspective on the impact of AI and the importance of staying ahead of the technology curve?
-The CEO believes that AI will present amazing opportunities not only for tech but also for humanity. He emphasizes the importance of learning more about the technology and discerning trends to build a moat, as technology advancements like foundation models can potentially impact businesses relying too heavily on them.