Dialogue with Hedra founder Michael Lingelbach: How can generative video leverage memes to create the next trend?

PANews

Original Title: Why AI Characters & Virtual Influencers Are the Next Frontier in Video ft Hedra's Michael Lingelbach

Host: Justine Moore, Matt Bornstein, a16z

Guest: Michael Lingelbach

Compiled & Edited by: Janna, ChainCatcher

Editor's Note

Michael Lingelbach is the founder and CEO of Hedra. He was a PhD student in computer science at Stanford University and also a stage actor, combining his passion for technology and performance to lead Hedra in developing industry-leading generative audio and video models. Hedra is a company focused on full-body expression and dialogue-driven video generation, with technology that supports a wide range of applications from virtual influencers to educational content, significantly lowering the barriers to content creation. This article is compiled from the a16z podcast, focusing on how AI technology is transitioning from viral meme content to enterprise-level applications, showcasing the innovative potential of generative audio and video technology.

The following is the dialogue content, compiled and edited by ChainCatcher (with omissions).

TL;DR

  • AI is bridging consumer and enterprise scenarios: a Hedra-generated talking-baby ad promoting enterprise software shows how eagerly companies are embracing the technology.
  • Viral meme content has become a powerful tool for startups, such as the "baby podcast" rapidly increasing brand awareness, showcasing clever market strategies.
  • Full-body expression and dialogue-driven video generation technology fills a creative gap, significantly reducing the time and cost of content production.
  • Actors like Jon Lajoie shape unique digital characters through projects like the "Moses Podcast," giving content a distinct personality and appeal.
  • Content creators like "mom bloggers" leverage technology to quickly produce videos, easily maintaining brand engagement and audience connection.
  • Real-time interactive video models open up two-way dialogue with virtual characters, bringing immersive experiences to education and entertainment.
  • Character-centric video generation technology emphasizes personal expression and multi-agent control, meeting the demands of dynamic content creation.
  • A platform strategy that integrates dialogue, action, and rendering creates a smooth generative media experience, catering to the demand for high-quality content.
  • Interactive avatar models support dynamic adjustments to video emotions and elements, signaling the next wave of innovation in content creation.

(1) The AI Fusion from Meme to Enterprise Application

Justine: We find the cross-application of AI in consumer and enterprise scenarios very interesting. A few days ago, I saw a Hedra-generated ad in Forbes featuring a talking baby promoting enterprise software. It signals a new era in which companies are embracing AI technology quickly and enthusiastically.

Michael: As a startup, our responsibility is to draw inspiration from consumer user signals and turn them into next-generation content production tools that enterprise users can rely on. In recent months, viral content generated with Hedra has drawn widespread attention, from early anime-style characters to the "baby podcast," and even this week's trending topic, whatever it turns out to be. Memes are a very effective marketing strategy: they reach a large audience quickly and capture mindshare. The approach is becoming increasingly common among startups; Cluely, another a16z-backed company, built significant brand recognition through viral spread on Twitter. The essence of a meme is that technology gives people a vehicle to unleash their creativity quickly, and short video content has come to dominate cultural consciousness. Hedra's generative video technology lets users turn any idea into content in seconds.

(2) Why Creators and Influencers Choose Hedra

Justine: Can you explain why people use Hedra to create memes, how they use it, and how this relates to your target market?

Michael: Hedra is the first company to deploy full-body expression, dialogue-driven generative video models at scale. Our users have created millions of pieces of content, and our rapid popularity stems from filling a critical gap in the content creation tech stack. Previously, creating generative podcasts, animated character dialogue scenes, or singing videos was very difficult: too costly, too inflexible, or too time-consuming. Our model is fast and cost-effective, and that has fostered the rise of virtual influencers.

Justine: Recently, CNBC published an article about Hedra-driven virtual influencers. Can you provide some specific examples of how influencers are using Hedra?

Michael: For instance, the actor Jon Lajoie (who played Taco in "The League") has used Hedra to create a series of content ranging from the "Moses Podcast" to the "Baby Podcast," and those characters now have identities of their own. Another example is Neural Viz, which has built a character-centric "metaverse" on Hedra. Generative performance differs from pure media models: it requires injecting personality, consistency, and control into the model, which matters especially for video performance. That is why the distinct personalities of these virtual characters catch on even though they are not real people.

(3) Virtual Influencers and Digital Avatars

Matt: I've seen many Hedra videos on Instagram Reels, featuring both newly created characters like the aliens in the Neural Viz series—previously only achievable by Hollywood blockbusters—and real people using these tools to expand their digital presence. Many influencers or content creators do not want to spend time dressing up, adjusting lighting, or applying makeup every time. Hedra allows groups like "mom bloggers" to quickly generate videos to convey messages without spending a lot of time preparing. For example, they can directly generate content talking to the camera using Hedra.

Michael: That's an important observation. Maintaining a personal brand is crucial for content creators, but staying online 24/7 is very challenging. If a creator pauses updates for a week, they may lose followers. Hedra's automation technology significantly lowers the barriers to creation. Users can generate scripts using tools like Deep Research, then create audio and video content through Hedra, automatically publishing it to their channels. We see an increasing number of workflows around autonomous digital identities, serving not only real people but also entirely fictional characters.
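To make that workflow concrete, here is a minimal Python sketch of the kind of daily automation loop Michael describes: draft a script, render a character video, publish. Every function below is a hypothetical stub for illustration only; Hedra's actual API may look entirely different.

```python
# Minimal sketch of the automated-creator workflow described above.
# Every function here is a hypothetical stub for illustration;
# Hedra's real API surface may look entirely different.

import datetime


def draft_script(topic: str) -> str:
    """Stub for the LLM / research step that writes the day's script."""
    return f"Hello everyone, here is your {topic} for {datetime.date.today()}."


def render_character_video(character_id: str, script: str) -> str:
    """Stub for the generative-video call; returns a placeholder clip URL."""
    return f"https://example.com/clips/{character_id}/latest.mp4"


def publish_to_channel(clip_url: str) -> None:
    """Stub for pushing the finished clip to a social channel."""
    print(f"published {clip_url}")


def run_daily_post(character_id: str) -> None:
    script = draft_script(topic="daily update")          # 1. write the script
    clip = render_character_video(character_id, script)  # 2. render the video
    publish_to_channel(clip)                             # 3. auto-publish


if __name__ == "__main__":
    run_daily_post(character_id="mom-blogger-persona")
```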

(4) The Potential and Challenges of Interactive Video

Justine: Many historical videos are trending on Reels now. In the past, we gained knowledge through reading history books, but that can be a bit dull. If history could be narrated through characters and showcased in generative video scenes, the experience would be much more engaging.

Michael: While we do not directly target the education sector, many educational companies are developing applications based on our API. The engagement of video interactivity is far higher than that of text. We recently launched a real-time interactive video model, the first product to achieve low-latency audio and video experiences. From language learning to personal development applications, when the cost of technology is low enough, it will fundamentally change how users interact with large language models (LLMs). One of my favorite projects is "Chat with Your Favorite Book or Movie Character." For example, you can ask, "Why did you walk into that dark room knowing there was a killer?" This interactive experience is richer than traditional audiobooks because users can ask questions and revisit content, making the experience more vivid.
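A rough sketch of the "chat with a character" idea follows. It is a blocking, turn-based toy with stubbed model calls (all names invented); a real low-latency system would stream audio and video continuously rather than render a clip per turn.

```python
# Toy, turn-based sketch of "chat with your favorite character."
# Both model calls are stubs with invented names; a real low-latency
# system would stream audio/video frames instead of blocking per turn.

def character_reply(persona: str, question: str) -> str:
    """Stub for an LLM constrained to stay in a fixed persona."""
    return f"[{persona}] That's a fair question about my choices..."


def render_reply_clip(persona: str, line: str) -> str:
    """Stub for rendering the reply as a talking-character clip."""
    return f"https://example.com/{persona}/reply.mp4"


def chat_loop(persona: str) -> None:
    while True:
        question = input("you> ").strip()
        if not question:            # an empty line ends the session
            break
        line = character_reply(persona, question)
        print(line)
        print(f"(video: {render_reply_clip(persona, line)})")


if __name__ == "__main__":
    chat_loop(persona="detective-novel-hero")
```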

Justine: The search space for video models is vast. Generating a single frame image is already complex, but generating 120 frames of continuous video is even more challenging. Hedra focuses on a unique and meaningful problem, distinguishing itself from other video models. Please describe the definition of this problem and where your inspiration comes from.

Michael: That's a great question. We see specialization emerging in the foundational model layer, just as Claude has become the benchmark for programming models, OpenAI provides a general assistant, and Gemini serves enterprise scenarios due to cost-effectiveness and speed. Hedra has a similar positioning in the video model space. Our foundational model performs exceptionally well, especially the next-generation model, providing great flexibility for content creation. However, we are more focused on how to make content "come alive," encouraging users to interact with it and feel a consistent personality and appeal. The core lies in how to combine the intelligence of characters in videos with the rendering experience. My vision is for users to communicate bidirectionally with characters in videos, where characters possess programmable unique personalities. This requires vertical integration, not only optimizing the core model but also rethinking the future experience of user interaction.

(5) Character-Centric Video Models and Subject Control

Michael: I come from a theatrical background; although I am not a professional actor, I am passionate about character performance. Video is central to our daily interactions, whether in advertising, online courses, or Hedra-driven faceless channels, and the sense of connection is crucial. By lowering the barriers to creation and speeding up the process, we enable ordinary users to easily generate content. In the future, the boundaries between model intelligence and rendering will gradually blur, allowing users to converse with systems that understand their intentions. We view characters as the core unit of control, not just video. This requires collecting user feedback to optimize the realism and expressiveness of characters while providing control levers for multiple agents.

Matt: I have spent a lot of time creating characters for different videos, and the strength of Hedra lies in its integrated character creation tools. You can create or upload character images, save them for later use, and even change contexts or clone voices. Many of my YouTube videos and tutorials use a cloned version of my voice generated by Hedra for their openings. This integrated experience is particularly valuable in the fragmented generative media market.

(6) Building an Integrated Generative Media Platform

Justine: Many companies like Black Forest Labs have made technological breakthroughs but still need partners like Hedra to deliver experiences to consumers and enterprise users. How did you decide to build an integrated platform rather than focusing on a single technology?

Michael: It's about focus and user needs. When I founded Hedra, I found it very difficult to integrate dialogue into media. In the past, users creating short videos had to overlay lip sync, which lacked any sense of wholeness. Our technical insight was to unify signals like breathing and gestures with dialogue in a single, more natural video model. From a market perspective, we observed differences in users' willingness to pay across applications. Some popular applications may have low willingness to pay, but certain niche segments (like content creators) have a strong demand for high-quality experiences. We choose to integrate the best technologies, whether from Hedra or partners like ElevenLabs, to make sure users get the best experience.

Matt: In the future, will AI characters generate text, scripts, voice, and visuals from a single model?

Michael: I believe the industry is moving towards a multimodal input-output paradigm. The challenge of a single model lies in control. Users need to precisely adjust details like voice, tone, or rhythm. Decoupling inputs can provide more control, but the future may trend towards fully multimodal models, where users can adjust the fit of each modality through guiding signals.
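The "decoupled inputs" point can be illustrated with a request shape in which each modality carries its own explicit controls instead of one opaque multimodal prompt. All field names below are invented for the example; this is not a real Hedra payload.

```python
# Illustration of decoupled modality controls: each channel gets its own
# explicit knob rather than one opaque multimodal prompt. All field names
# are invented for this example and are not a real Hedra payload.

generation_request = {
    "script": "Welcome back to the show!",   # text modality
    "voice": {
        "voice_id": "cloned-host-voice",     # which voice to render with
        "pace": 0.9,                         # slightly slower than default
        "tone": "warm",
    },
    "performance": {
        "emotion": "excited",                # drives facial expression
        "gesture": "open-handed",            # drives body language
    },
    "render": {
        "character_id": "studio-anchor",
        "resolution": "1080p",
    },
}
```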

(7) The Future of Interactive Video

Justine: I am impressed by Hedra's long video generation capabilities. You can upload a few minutes of audio to generate character dialogue videos, adjusting images and voices separately to avoid wasting resources on one-time generation. This level of control makes me very excited about the future of interactive video.

Michael: I am excited about the interactive avatar model we just launched. In the future, users will be able to shape video elements like on a fluid canvas, for example, pausing the video and asking the character to be more sad during a certain line. This two-way communication will bring about the next generation of experiences, and it will be realized soon.
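As a hypothetical sketch of that interaction, the snippet below targets a single line of dialogue for a re-take with a new emotion rather than regenerating the whole clip. The function and its parameters are invented for illustration.

```python
# Hypothetical sketch of an interactive re-take: target one line of
# dialogue and change its emotion instead of regenerating the whole clip.
# The function and its parameters are invented for illustration.

def retake_line(clip_id: str, line_index: int, emotion: str) -> str:
    """Stub: re-render a single line with a new emotion, leaving the
    rest of the clip untouched; returns the new clip version's URL."""
    return f"https://example.com/clips/{clip_id}/v2.mp4"


# "Pause the video and ask the character to be more sad on line 3."
print(retake_line(clip_id="moses-podcast-ep1", line_index=3, emotion="sad"))
```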

Matt: Is a true AI actor possible? Users interact in real-time with the created characters and give instructions.

Michael: Absolutely possible. But the current limitation lies not in the video model, but in the authenticity of personality in large language models. Existing AI companions (like Character AI) still bear obvious traces of the model. To achieve truly interactive digital characters, more research needs to be invested in configurable personalities.

(8) Hedra's Audio Generation and AI-Native Applications

Justine: Hedra's videos are stunning, but the audio sometimes falls short. The latest model from ElevenLabs has improved audio quality, but the expressiveness of the content still needs work.

Michael: Audio generation is an underexplored field. Currently, generative speech is mostly used for narration or voiceovers, but generating natural conversations in scenarios like a noisy café remains challenging. We need audio models that can control ambient sounds and multi-turn dialogues to enhance the naturalness of video creation. Video AI is still in its early stages. Just like early CGI effects seemed realistic at the time but now look cartoonish, our first-generation models amazed me, but now they seem rough. Achieving highly controllable, cost-effective, and real-time performance models still requires effort.

Matt: Would users prefer to interact with real humans, quasi-human characters, or cartoon characters?

Michael: We have generated many fluffy little balls and cat characters. Hedra's unified model can handle various characters, whether they are rocks or robots, allowing users to experiment freely and create unprecedented content. We built a unified model, rather than traditional video with lip sync, to avoid limiting users by technology. Users can try "talking rocks" or "podcasts with robots and humans," and the model can automatically handle dialogue and personality. This flexibility inspires revolutionary consumer scenarios.

Justine: The cross-application of AI is exciting. Consumers create content like the "baby podcast," inspiring enterprise applications. I was amazed to see a baby ad generated by Hedra promoting enterprise software in Forbes. This indicates that companies are rapidly embracing AI, and we need to transform consumer signals into enterprise-level solutions.

Michael: Enterprise is our fastest-growing segment. Generative AI has shortened content creation from weeks to real time. Automated news anchors, for example, are changing how information is disseminated. Local news used to disappear because of high costs, but now one person can run a news channel. This "medium-scale personalization" serves the needs of specific audiences, such as targeted advertising for local restaurants or theme parks, and can prove more effective than hyper-personalized models like Google's.

(9) The Founder’s Journey: Challenges, Passion, and Collaborative Innovation

Justine: What has your experience been as a founder? What challenges and rewards have you encountered?

Michael: In San Francisco, the life of a founder is often romanticized, like a journey of building groundbreaking technology. I come from a small town in Florida and never imagined I would take this path. But 99% of being a founder is tough. You have to keep pushing; the problems never decrease—from invisible development to facing a flood of support emails. Physically, it can be exhausting, but the inner satisfaction is unparalleled. I love my users and my team; I can't imagine doing anything else. It's a kind of "second-order fun"—like climbing a snowy mountain, getting hurt in the process, but still wanting to come back after reaching the summit. I go into the office at 7:30 AM and leave at 10 PM, sometimes still discussing features at 2 AM. It requires giving up the boundaries between work and life, but my passion keeps me going.

Matt: Why do you still code personally? Is it to express creativity or to communicate with the team?

Michael: Both. Prototyping helps me quickly validate ideas and clearly communicate expectations. As a leader, clear communication is crucial. I discuss edge cases with designers to ensure the system is scalable. Coding keeps me connected with the team, understanding their challenges while quickly exploring product directions.
