Tripo CEO on Using AI to "Augment" Creators, Not "Replace" Them
Tripo breaks down the technology behind its latest AI models and how they’re evolving from rapid asset generation into full-scale world-building tools for modern game development pipelines, tools the company says are meant to augment creators’ capabilities, not "replace" them.
Asset generation is one of the most actively evolving areas of AI’s push into the game industry. While early solutions often focused on accelerating isolated tasks like modeling or texturing, newer platforms are beginning to integrate more deeply into production pipelines, influencing how teams prototype, iterate, and scale content.
In this interview, we spoke with Tripo Founder and CEO Simon Song to discuss the technical underpinnings of its latest models, including Tripo 3.0, the P-series, and its experimental “world model” approach.
Tripo’s tools are already being used in real-time UGC pipelines, rapid prototyping, and large-scale asset variation, with integration paths into industry-standard DCC tools and engines. According to Song, the focus is on augmenting creators rather than replacing them, a distinction that sits at the center of much of the discourse and commentary around AI usage in game development.
Tripo AI has grown rapidly, reaching millions of users and tens of thousands of API developers. From your perspective, what has been the most surprising way developers are using the platform today?
Simon Song, Founder and CEO: One scenario that initially surprised us came from the creator ecosystems of UGC (User-Generated Content) gaming platforms. Developers aren't just using Tripo as a standalone tool to create 3D models; they are integrating it into real-time content creation workflows. For example, while exploring collaborations with creative game ecosystems like Eggy Party, we observed a fascinating phenomenon: creators were designing maps and interactive worlds in real time. AI 3D generation allowed them to generate props or scene elements in seconds and place them directly into the level editor for immediate testing.
In the past, these assets typically had to be crafted one by one by professional art teams. Now, creators can generate content on demand and use it immediately during the creative process. From our perspective, this means AI is no longer just an asset generation tool; it is becoming an integral part of the UGC world-building workflow.
With the release of Tripo 3.0 and the new model lineup announced recently, what was the main technological breakthrough that enabled this new generation of 3D asset creation?
Simon Song: Tripo 3.0 utilizes a brand-new SparseFlex representation developed by our team. This method precisely captures model details, open surfaces, and internal geometries while supporting more efficient training strategies. This reduces computational overhead and allows for higher-resolution training, making AI-generated 3D models viable for large-scale commercial applications. Our flagship model, Tripo H3.1, further solves long-standing bottlenecks in character forms, faces, and geometric text. In our benchmarks, Tripo H3.1 leads the industry in core metrics such as input alignment, structural precision, texture quality, and generation speed.
Our other series, the Tripo P-series, addresses the compromises of serialization that hamper current 3D mesh generation: lengthy serialized data severely restricts generation efficiency, while unidirectional causal bias stifles global spatial interaction.
Starting from first principles, we used a brand-new mindset and algorithmic framework to rethink how 3D should be expressed and generated. Tripo P1.0 reconstructs the underlying paradigm of spatial generation for the first time, moving away from local computations of 3D objects in favor of a Unified Native Probability Space. The model no longer 'predicts' the next point; instead, it performs a macro-level probability collapse of the entire spatial structure.
The model first precipitates the 'foundation' of the form within a noise space; subsequently, complex topological relationships evolve based on that foundation within a unified probability space. Tripo P1.0 breaks the barriers of dimensional construction, allowing high-dimensional information to align instantly within disordered noise space. This enables extremely complex 3D topologies to 'converge' into shape simultaneously, achieving a fundamental leap from local stitching to global emergence.
The end result is that Tripo P1.0 can generate professional-level 3D models, with clean topology, stable wireframes, and engine-ready output, in just two seconds. Furthermore, under this new approach, we have found significantly greater potential for model editability and precision scalability.
Many developers are curious about how AI-generated 3D assets fit into traditional pipelines. How does Tripo integrate with common game engines and DCC tools, and what steps are required to move assets from generation into production?
Simon Song: Assets generated by Tripo can be directly exported in industry-standard formats, such as FBX, OBJ, and GLB, and include complete geometric structures and texture information. This means assets can be easily imported into mainstream DCC tools like Blender and Maya for further editing, or brought directly into game engines like Unity and Unreal Engine for testing and integration. Our ecosystem plugins cover the most popular 3D creation tools and content engines. Additionally, many teams use our API to integrate generation capabilities into their own asset management systems or proprietary editors, enabling a more automated production pipeline.
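To make the API path concrete, here is a minimal Python sketch of a text-to-model round trip: submit a task, poll for completion, download the result. The endpoint paths, field names, and status values are illustrative placeholders, not Tripo's documented schema; consult the official Tripo API documentation for the real interface.

```python
import time

import requests

# Hypothetical endpoint and schema, for illustration only.
API_BASE = "https://api.example-tripo-host.com/v2"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def generate_asset(prompt: str, out_path: str) -> None:
    """Submit a text-to-3D task, wait for it, and download the GLB."""
    # 1. Submit a generation task from a text prompt.
    resp = requests.post(
        f"{API_BASE}/task",
        headers=HEADERS,
        json={"type": "text_to_model", "prompt": prompt},
    )
    resp.raise_for_status()
    task_id = resp.json()["task_id"]

    # 2. Poll until the task finishes (generation takes seconds).
    while True:
        status = requests.get(f"{API_BASE}/task/{task_id}", headers=HEADERS).json()
        if status["state"] == "success":
            break
        if status["state"] == "failed":
            raise RuntimeError(f"Generation failed: {status}")
        time.sleep(2)

    # 3. Download the model in a standard interchange format (GLB).
    model_url = status["output"]["model_url"]
    with open(out_path, "wb") as f:
        f.write(requests.get(model_url).content)

generate_asset("weathered wooden barrel, stylized", "barrel.glb")
```

A wrapper like this is what teams typically hide behind their own asset-management systems or proprietary editors.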
In a real-world development environment, the transition from asset generation to the formal production pipeline typically involves several steps:
- Generation & Initial Screening: The team generates multiple versions of an asset via text or image inputs and selects the result that best fits their requirements.
- DCC Optimization & Refinement: The asset is refined in a DCC tool—tasks include optimizing topology, controlling poly counts, and adjusting UVs or material details.
- Engine Integration & Testing: The asset is imported into the game engine for real-time testing and scene integration, where technical settings like colliders, LODs (Levels of Detail), or physical properties are added.
- Version Control: Finally, these assets are incorporated into the team’s version control and resource library systems as official production assets.
By generating high-precision meshes and textures, Tripo significantly reduces the time from initial concept to finished product. It offers a robust tool pipeline for converting high-poly models to low-poly versions, supporting both triangular and quad-face topology, editable textures, and the generation of rigged animations; through DCC plugins, these can be imported directly into engines like Unity. Because the generated assets are already close to production standards in structure and format, the rate of rework in subsequent stages drops, further enhancing overall team efficiency and allowing teams to focus more on creative design and the gameplay experience.
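As an illustration of the DCC hand-off, here is a small Blender (bpy) sketch that imports a generated GLB, applies a light cleanup pass, and exports an FBX for engine import. The file names and naming convention are hypothetical.

```python
import bpy

# Run inside Blender. Assumes "barrel.glb" (e.g. from the API sketch
# above) sits next to the .blend file ("//" is Blender's relative path).
bpy.ops.import_scene.gltf(filepath="//barrel.glb")

obj = bpy.context.selected_objects[0]
obj.name = "PROP_barrel"  # match the project's naming convention
bpy.context.view_layer.objects.active = obj

# Apply transforms so the asset lands in Unity/Unreal at identity
# scale and rotation, a common source of import headaches.
bpy.ops.object.transform_apply(location=True, rotation=True, scale=True)

# Export the selection as FBX for engine import.
bpy.ops.export_scene.fbx(filepath="//PROP_barrel.fbx", use_selection=True)
```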
Can you explain what a World Model is, and how temporal or multimodal generation could influence future game development workflows?
Simon Song: A "World Model" means that a game's interactive expression is no longer limited to meshes and PBR (Physically Based Rendering) materials. In essence, this concept allows AI not only to understand a static 3D object but to comprehend how a world changes over time. While traditional 3D generative models primarily focus on "generating an object or an asset," models like Tripo W1 are more concerned with how a scene evolves over time and how different elements interact with one another.
In practical application, this means developers can use multimodal inputs, such as video, text, or images, to allow the AI to infer the spatial structure of a scene, the relationships between objects, and potential dynamic changes. For example, a video contains more than just visual information; it implies movement trajectories, spatial layouts, and inter-object relationships. Through a world model, AI can understand the environment from this data and go on to generate 3D scenes or assets usable for game development.
However, it must be acknowledged that the entire industry's work on world models is still in the early research stages, so its impact on the gaming industry remains, for now, more vision than reality. For the game development workflow, this capability could bring about a major shift: developers would no longer be required to build every asset or scene from scratch. Instead, they could quickly generate the base structure of a world from real-world footage or reference videos, which the art and design teams would then refine and perfect.
From a long-term perspective, developers will move away from merely crafting individual assets and toward defining rules, styles, and gameplay logic, with AI assisting in the generation and expansion of the entire world. We believe this represents a critical stage in the evolution of AI: moving from "generating assets" to "understanding and generating worlds."
One of the biggest challenges for AI-generated assets is maintaining consistent topology, UV layouts, and optimization for real-time rendering. How does Tripo address these issues to ensure assets remain usable in real-time engines?
Simon Song: First, we optimize the UV occupancy of generated assets, typically achieving an occupancy rate of 60% or higher. This significantly reduces wasted texture space and maximizes texture resolution utilization.
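For readers who want to sanity-check a figure like that on their own assets, here is a naive Blender (bmesh) sketch that estimates UV occupancy by summing triangle areas in UV space. It assumes the mesh has an active UV layer and ignores overlapping islands, so treat it as a rough measure rather than an exact one.

```python
import bmesh
import bpy

def uv_occupancy(obj: bpy.types.Object) -> float:
    """Rough fraction of the 0-1 UV square covered by the mesh's UVs."""
    bm = bmesh.new()
    bm.from_mesh(obj.data)
    bmesh.ops.triangulate(bm, faces=bm.faces)  # areas are easy per-triangle
    uv_layer = bm.loops.layers.uv.active       # assumes an unwrapped mesh
    area = 0.0
    for face in bm.faces:
        a, b, c = (loop[uv_layer].uv for loop in face.loops)
        # 2D triangle area via the cross product.
        area += abs((b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y)) / 2.0
    bm.free()
    return area

print(f"UV occupancy: {uv_occupancy(bpy.context.active_object):.0%}")
```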
At the geometric level, we optimize the correspondence between high-poly and low-poly models. On that basis, we bake accurate normal maps and generate multi-level LOD (Level of Detail) models, allowing a single asset to switch between precision levels in the engine pipeline based on specific performance requirements.
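As a rough sketch of how such an LOD chain can be produced in a DCC tool, the following Blender script duplicates a mesh at decreasing poly-count ratios using the Decimate modifier. The ratios are arbitrary examples; real targets come from a project's performance budget. The _LOD0/_LOD1 suffixes follow the naming convention Unity recognizes to auto-build an LODGroup on import.

```python
import bpy

# Arbitrary example ratios for LOD0 (full detail) through LOD3.
LOD_RATIOS = [1.0, 0.5, 0.25, 0.1]

src = bpy.context.active_object  # the imported, generated asset
for level, ratio in enumerate(LOD_RATIOS):
    lod = src.copy()
    lod.data = src.data.copy()   # duplicate the mesh data, not just the object
    lod.name = f"{src.name}_LOD{level}"
    bpy.context.collection.objects.link(lod)
    if ratio < 1.0:
        mod = lod.modifiers.new(name="lod_decimate", type='DECIMATE')
        mod.ratio = ratio
        bpy.context.view_layer.objects.active = lod
        bpy.ops.object.modifier_apply(modifier=mod.name)  # bake the reduction
```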
Tripo already has integrations with companies like Stability AI, Scenario.gg, Riot, and EA. How are studios currently incorporating AI-generated assets into their workflows, and where are you seeing the biggest efficiency gains?
Simon Song: An increasing number of studios are incorporating AI-generated assets into their actual production pipelines. Typically, though, AI does not wholesale replace existing workflows; it serves as a powerful accelerant for content production and iteration. In practice, AI generation is most commonly used during the early concept exploration and asset prototyping stages. For example, when designing new props, environmental elements, or architectural components, teams can generate multiple versions with different styles or structures in a very short time to quickly validate visual directions and level layouts. This approach significantly shortens the time from initial concept to testable asset.
Another key application is the mass production of asset variants. In open-world games or UGC platforms, there is often a need for a massive volume of variations for props or environmental assets—such as different styles of buildings, decorations, or natural elements. AI helps teams rapidly generate these variants, which are then screened and refined through standard art and technical pipelines.
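As a sketch of what that variant mass production can look like in code, the loop below reuses the hypothetical generate_asset helper from the earlier API example; the base prompt and style list are purely illustrative.

```python
# Batch-generate style variants of one base prompt, then hand the
# results to the screening/refinement steps described above.
BASE_PROMPT = "medieval market stall"
STYLES = ["weathered wood", "painted blue canopy", "snow-covered",
          "fruit vendor", "abandoned and broken"]

for i, style in enumerate(STYLES):
    generate_asset(f"{BASE_PROMPT}, {style}", f"stall_variant_{i}.glb")
```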
From the perspective of efficiency gains, we currently see the most significant changes in three specific areas:
- Speed from Concept to Prototype: Teams can obtain usable 3D assets in minutes rather than spending hours or days modeling from scratch.
- Asset Variation and Content Scaling: AI enables teams to generate a vast amount of usable material in a short period.
- Creative Iteration Speed: When a design or level needs adjustment, developers can quickly generate new versions for testing without having to restart the entire asset creation process.
Therefore, we believe the most critical value of AI at this stage is helping teams reclaim time from repetitive production tasks. This allows creators to focus more on gameplay design, artistic direction, and world-building: the elements that truly define the gaming experience.
For developers experimenting with AI-assisted pipelines today, where do you see the biggest opportunities and limitations of AI-driven 3D generation?
Simon Song: I believe the greatest opportunity that AI-driven 3D generation offers developers lies in the drastic reduction of content production costs and the barrier to entry. In traditional game development workflows, 3D asset creation is often one of the most time-consuming and labor-intensive stages. Through AI, developers can generate prototype assets in seconds and rapidly explore different styles and design directions, thereby significantly accelerating iteration speeds.
This is not only valuable for large studios but holds even greater significance for independent developers and small teams, as it enables them to complete more complex world-building with limited resources. Another major opportunity is the expansion of content scale. As AI generation capabilities improve, developers can more easily create vast quantities of variant assets—such as environmental elements, props, or architectural components. This is particularly crucial for open-world games, UGC platforms, and products involving real-time content generation. Ultimately, AI helps teams shift their energy from repetitive production work toward creative design and the gameplay experience.
What are your thoughts on the use of AI for creative work like this, with regard to ethics, the potential for it to replace developers, and the general controversy around the technology?
Simon Song: We consistently view AI as a tool to augment the capabilities of creators, rather than a tool to replace them. The value of AI lies in helping teams reduce highly repetitive production cycles, allowing artists and developers to dedicate more time to pure creativity. From the perspective of industry evolution, technological progress typically reshapes job structures rather than simply eliminating roles. For example, as tools have evolved within the gaming industry, new roles have emerged—such as Tools Engineers and, more recently, AI Tools Artists.
I believe AI will bring about a similar transformation. In the future, we will see more roles centered around AI workflow design, data management, and creative control. Of course, we also attach great importance to issues such as data usage, copyright, and creator rights. Establishing a transparent and responsible technological ecosystem is vital; this requires the joint participation and standardization efforts of regulatory bodies, content creators, and industry organizations.
Overall, I do not believe AI will diminish the importance of human creativity. On the contrary, it will lower the barrier to entry, allowing more people to participate in the creation of 3D content and interactive worlds. The true value lies in how we combine technology with creativity, rather than allowing one to replace the other.
Looking ahead, what does the next evolution of AI-powered 3D creation look like, and how might tools like Tripo reshape the role of artists and technical artists over the next few years?
Simon Song: I believe that AI-driven 3D creation is moving toward "understanding and generating entire worlds." In the past few years, most tools primarily addressed asset-level challenges—such as generating a single character, prop, or scene component. However, with the emergence of world models like Tripo W1, AI has begun to understand more complex spatial structures, object relationships, and changes across the dimension of time.
AI can assist in rapidly generating base content, while Artists will be responsible for defining the visual direction, selecting and refining the generated results, and integrating them into a cohesive artistic system. For Technical Artists, this shift will be even more pronounced. Future TAs will need to go beyond understanding traditional rendering, asset optimization, and toolchains; they will need to participate in the design and management of AI workflows. This includes tasks such as integrating generative models into production pipelines, ensuring stylistic consistency of generated outputs, and ensuring that AI-generated assets meet the technical standards of real-time engines.
Therefore, I believe we will see a new creative paradigm emerge in the coming years: AI handles large-scale generation, Artists make the creative and aesthetic decisions, and Technical Artists integrate these capabilities into stable, efficient production pipelines. In this model, AI empowers creators to participate in content creation at a much higher level.