
Twelve Labs Earns $50 Million Series A to Build the Future of Multimodal AI and Video Understanding

2024-06-05 · 3 min read


Twelve Labs, the video understanding company, raised $50 million in Series A funding to fuel the ongoing development of its industry-leading foundation models dedicated to all aspects of video. The round was co-led by new investor New Enterprise Associates (NEA) and NVentures, NVIDIA’s venture capital arm, which recently participated in Twelve Labs’ strategic round. Previous investors, including Index Ventures, Radical Ventures, WndrCo, and Korea Investment Partners, also joined the round. In addition to R&D, funds will be used to nearly double headcount.

Twelve Labs has integrated a number of NVIDIA technologies and services within its platform, including NVIDIA H100 Tensor Core and NVIDIA L40S GPUs, as well as inference frameworks such as NVIDIA Triton Inference Server and NVIDIA TensorRT. Twelve Labs is also exploring product and research collaborations with NVIDIA to bring best-in-class multimodal foundation models and enabling frameworks to market.

Twelve Labs was created specifically for multimodal video understanding. Its newly released Marengo 2.6, a state-of-the-art multimodal embedding model, is unlike anything currently available to companies. Marengo 2.6 offers a pioneering approach to multimodal representation tasks, extending beyond video to image and audio and performing any-to-any search tasks, including Text-To-Video, Text-To-Image, Text-To-Audio, Audio-To-Video, Image-To-Video, and more. This model represents a significant leap in video understanding technology, enabling more intuitive and comprehensive search capabilities across various media types.
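For intuition on what any-to-any search means in practice: once every modality is embedded into one shared vector space, retrieval reduces to nearest-neighbor lookup. The minimal Python sketch below illustrates the idea with random stand-in vectors; the 1024-dimensional size and asset names are assumptions for illustration, not Twelve Labs' actual output format.

```python
import numpy as np

# Stand-in embeddings in one shared multimodal space; the dimensions and
# asset names are illustrative assumptions, not Twelve Labs' output.
rng = np.random.default_rng(0)
corpus = {
    "clip_001.mp4": rng.random(1024),    # video embedding
    "photo_17.jpg": rng.random(1024),    # image embedding
    "podcast_03.wav": rng.random(1024),  # audio embedding
}
query = rng.random(1024)  # e.g. the embedding of the text "a goal celebration"

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Any-to-any search is nearest-neighbor lookup in the shared space: rank
# every asset, whatever its modality, by similarity to the query vector.
for asset, vec in sorted(corpus.items(), key=lambda kv: cosine(query, kv[1]), reverse=True):
    print(asset, round(cosine(query, vec), 3))
```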

Twelve Labs also opened the beta of Pegasus-1, which sets a new standard in video-language modeling. Pegasus-1 is designed to understand and articulate complex video content, transforming how we interact with and analyze multimedia. It can process and generate language from video input with exceptional accuracy and detail. To get there, the Twelve Labs team drastically reduced the model’s size from 80 billion parameters to 17 billion, with three components trained jointly: a video encoder, a video-language alignment model, and a language decoder. Twelve Labs will release additional flagship Pegasus models in the coming months for organizations that can support larger models.
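As a rough illustration of that three-component layout, here is a minimal PyTorch-style sketch: the dimensions, layer counts, and interfaces are assumptions chosen for clarity, not Pegasus-1's actual architecture.

```python
import torch
import torch.nn as nn

class VideoLanguageModel(nn.Module):
    """Toy sketch of the three jointly trained components the article names."""

    def __init__(self, video_dim=768, text_dim=512, vocab_size=32000):
        super().__init__()
        # 1) Video encoder: contextualizes per-frame features into video tokens.
        self.video_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=video_dim, nhead=8, batch_first=True),
            num_layers=2,
        )
        # 2) Video-language alignment model: projects video tokens into the
        #    language decoder's embedding space (a single linear map here).
        self.alignment = nn.Linear(video_dim, text_dim)
        # 3) Language decoder: generates text conditioned on aligned video tokens.
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model=text_dim, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.token_embed = nn.Embedding(vocab_size, text_dim)
        self.lm_head = nn.Linear(text_dim, vocab_size)

    def forward(self, frame_feats, text_ids):
        video_tokens = self.alignment(self.video_encoder(frame_feats))
        hidden = self.decoder(tgt=self.token_embed(text_ids), memory=video_tokens)
        return self.lm_head(hidden)  # next-token logits over the vocabulary

model = VideoLanguageModel()
logits = model(torch.randn(1, 16, 768), torch.randint(0, 32000, (1, 8)))
print(logits.shape)  # torch.Size([1, 8, 32000])
```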

Twelve Labs introduced its Embeddings API, which gives users direct access to the raw multimodal embeddings that power the existing Video Search API and Classify API. This first-of-its-kind API supports all data modalities (image, text, audio, and video), turning data into vectors in a single shared space without relying on siloed solutions for each modality.
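A hypothetical usage sketch of such an endpoint follows; the URL, parameter names, and response shape are placeholders for illustration and not Twelve Labs' published API surface.

```python
import requests

# Hypothetical multimodal embeddings call. The endpoint, field names, and
# response format below are assumptions, not Twelve Labs' documented API.
API_KEY = "your-api-key"
BASE = "https://api.example.com/v1/embeddings"  # placeholder endpoint

def embed(modality: str, payload: dict) -> list[float]:
    resp = requests.post(
        BASE,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"modality": modality, **payload},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]  # one vector in the shared space

# Because every modality lands in the same space, one call pattern covers
# text, image, audio, and video inputs alike.
text_vec = embed("text", {"text": "sunset over a harbor"})
video_vec = embed("video", {"video_url": "https://example.com/clip.mp4"})
```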

Its new Embeddings API is powered by Twelve Labs’ video foundation model and inference infrastructure, which are fundamentally different from approaches that process images one by one and stitch them together. By providing native support for multimodality in a single API, Twelve Labs can handle the large volume of assets its models need to understand while keeping latency low.
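To make that distinction concrete, the toy sketch below contrasts the frame-stitching pattern with a single video-native call; both embedding functions are random stand-ins named hypothetically, not real model calls.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_image(frame):        # stand-in for a per-frame image model
    return rng.random(1024)

def embed_video_native(clip):  # stand-in for a video-native model
    return rng.random(1024)    # one call, temporal context preserved

frames = rng.random((16, 224, 224, 3))  # 16 sampled frames from one clip

# Stitched: N separate image-model calls plus a pooling heuristic that
# discards temporal structure.
stitched = np.mean([embed_image(f) for f in frames], axis=0)

# Native: a single call over the whole clip, which is what allows one API
# to cover large asset volumes at lower per-asset latency.
native = embed_video_native(frames)
```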

“Through our work, particularly our perceptual-reasoning research, we are solving the problems associated with multimodal AI. We seek to become the semantic encoder for all future AI agents that need to understand the world as humans do,” said Jae Lee, co-founder and CEO of Twelve Labs. “With our Series A funding, we can invest in further research and development, hire aggressively across all roles, and extend our reach, continuing to build partnerships with the most innovative, forward-thinking companies in existence to eliminate the boundaries of video understanding.”

Since debuting its platform, Twelve Labs has attracted 30,000 users who rely on its APIs for tasks such as semantic video search and summarization across notable organizations in sports, media and entertainment, advertising, automotive, and security. In doing so, the company has begun establishing deep industry partnerships and integrations with companies like Vidispine, EMAM, Blackbird, and more.
