Yves Jacquier, Executive Director of Production Services Studios at Ubisoft Montreal, discussed their mission at Ubisoft La Forge, a unique R&D lab where researchers and game developers push the envelope and implement new technologies in production.
The Mission of Ubisoft La Forge
We created Ubisoft La Forge almost 5 years ago with the driving objective of bridging the gap between academic ideas and videogame innovations. This means building something that is of genuine interest to both academic and production people. Keeping this balance even is not always easy, but we developed expertise over time. The fact that I'm a former researcher who has evolved through different positions at Ubisoft certainly helped bring both worlds together.
We have made efforts to create a meaningful collaborative R&D space. The idea is simple: researchers from the academic world work on their research, not for Ubisoft. And by doing that with us, students and researchers have access to all our resources, exactly like our employees do: our data, our experts, our technologies, things they generally miss in their lab. Moreover, the nature of videogames makes it appealing to try things in virtual environments before applying them in the real world, and this is true in many domains: AI, but also biometrics, humanities, etc. So, we offer them interesting challenges, but also a safe and rich environment to push the envelope.
On the other hand, it is a safe space for Ubisoft employees to nurture themselves by trying things that would be too risky on an AAA production. Ideas collide, and with a little love they spark very interesting prototypes that can lead to academic publications or find concrete applications in a game or during its production.
Introducing Innovative Technologies into Pipeline
We've been working on Deep Learning for more than a decade because we had this intuition that it would have a huge impact on many aspects of production. From a technical standpoint, it is a way to accelerate the production of assets, for example animations, but it can also indirectly boost the production process by clarifying biometrics feedback or detecting toxic online behaviors. We're not an AI lab, but AI is a fundamental component of at least 80% of our prototypes. I gave a Game Developers Conference talk in 2019 that shows many areas where Machine Learning has a production impact.
For us, the way our work and research can benefit the game is three-pronged:
1) assist our creators (can we automate some tasks of lesser value to help our prodigious talents concentrate on the most amazing aspects of their crafts?),
2) create more believable worlds (how can we create more variety and consistency within our rich worlds?),
and 3) improve the player’s experience (can we create new interactions, new gameplays out of those prototypes?).
Our role is to offer our game creators new approaches to creation and let them decide where they will have the greatest impact. And this is part of our change management approach.
Ubisoft started learning to create open worlds with the first Assassin's Creed. Among the things we learned was that creating rich, appealing, and believable worlds cannot be done entirely by hand. Nobody wants to model every leaf of every tree in a forest, ensure that no two trees look alike, and then iterate to create a four-season forest. The same goes for huge cities. This is where procedural technologies step in: instead of producing each asset by hand, you define elemental assets such as leaves, high-level rules such as how many branches a tree has and how they are placed, and the kinds of variation you want, and you let the tool follow this procedure to create assets that eventually become a forest.
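The procedural idea above can be sketched in a few lines. This is a toy illustration only, not Ubisoft's actual tooling; the rule values and asset structure are invented for the example:

```python
import random

# Toy procedural generator: elemental assets plus high-level rules instead
# of hand-authored trees. All rule values are invented for the example.
def generate_tree(rng):
    """Follow simple rules to build one tree description."""
    branches = rng.randint(3, 7)                      # rule: branch count range
    return {
        "height": round(rng.uniform(4.0, 12.0), 2),   # rule: allowed heights
        "branches": [
            {"angle": rng.uniform(20.0, 70.0),        # rule: branch placement
             "leaves": rng.randint(50, 200)}          # elemental asset: leaves
            for _ in range(branches)
        ],
    }

def generate_forest(size, seed=0):
    """Forest size is not an issue: a seed makes the result reproducible."""
    rng = random.Random(seed)
    return [generate_tree(rng) for _ in range(size)]

forest = generate_forest(10_000)   # no two trees are authored by hand
```

Because only the rules and the seed are stored, regenerating or scaling the forest is cheap, and artists iterate on the rules rather than on individual trees.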
The benefit of procedural tools – whether they are AI-fueled or not – is that they are highly scalable (the size of the forest is not an issue) and they free up game creators to concentrate on the most important parts of the worlds, thus focusing on the highest-value elements of the game. Here are a couple of GDC talks on this topic: Procedural Generation of Cinematic Dialogues in 'Assassin's Creed Odyssey' and 'Assassin's Creed Syndicate: London' Wasn't Built in a Day.
Machine Learning is not a one-size-fits-all solution, and it certainly should not be considered a solution in search of a problem to solve. It's an additional tool to create, for example, better animations. And like any tool, it comes with strengths and weaknesses. The way it works is conceptually simple. Instead of programming rules with a lot of "if this happens then do that …", you feed the program a lot of data that serves as examples of the desired behavior, and it "learns" rules from this dataset. These concepts sound programmer-oriented but can also be applied to assets. By using a huge quantity of MOCAP data, you can teach the system which animation goes with which context – terrain, slope, etc. – up to generating it on the fly. And if you want to add another type of animation – let's say crouching – you don't need to double the amount of captured data. With relatively few crouching examples, the system will extrapolate which animation to generate by combining them with what it already learned from the walk dataset.
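As a toy illustration of learning from examples instead of writing rules, here is a minimal nearest-neighbor sketch. The "mocap" dataset, the contexts, and the hip-height values are all invented; real systems learn far richer models, but the principle of generalizing from data is the same:

```python
import math

# Toy "mocap dataset": each example maps a context (terrain slope, crouch
# amount) to a desired animation parameter (hip height). All values invented.
examples = [
    ((0.0, 0.0), 0.95),   # flat ground, standing walk
    ((0.3, 0.0), 0.90),   # uphill walk
    ((0.0, 1.0), 0.55),   # crouched walk
    ((0.3, 1.0), 0.50),   # crouched uphill walk
]

def predict_hip_height(context, k=2):
    """Instead of hand-written if/then rules, average the k examples
    whose context is closest to the one we were never shown."""
    ranked = sorted(examples, key=lambda ex: math.dist(ex[0], context))
    nearest = ranked[:k]
    return sum(height for _, height in nearest) / k

# A context never captured (slight slope, deep crouch) still gets a
# plausible answer by generalizing from the dataset.
pred = predict_hip_height((0.1, 0.8))
```

Adding a new behavior means adding a few examples, not rewriting a decision tree, which is the scalability argument made above.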
While traditional character control in video games means programming each action or controlling each animation, Machine Learning based techniques offer more scalability, more diversity, and a more natural feel, but at the cost of less control: when a new situation occurs, Machine Learning can do unexpected things. In some situations, like specific slopes, the result might look weird if there were too few examples. But it also means that the scalability is orders of magnitude higher than with traditional methods, because when you want to change something you don't need to update complex "if-then" decision trees or capture specific clips to cover every case; you simply add data with new examples.
This capacity to generalize is strategic, and we believe it is key to more immersive experiences, like offering higher-quality and more lifelike movements in any situation. Yet there are still a lot of challenges to solve, and we work hard to understand and improve those techniques to make them applicable. For animations, it means working closely with animators and gameplay programmers to develop new tools together with them, while understanding the level of control they need to keep. And this is still very much R&D, as we are applying it in production while publishing papers at the same time.
The main idea is simple. If you have tons of examples that lead to specific outcomes, you can predict, with a level of probability, that a new input leads to this or that outcome. And one of the areas for which we have tons – literally decades – of data is lines of code, the history of bugs, and what solved them.
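The example-to-prediction idea can be sketched with a toy frequency model. The commit history and tokens below are entirely invented, and the real system is far more sophisticated, but the shape of the problem is the same: past examples of buggy and clean changes become a probability for a new one:

```python
from collections import Counter

# Toy commit history: (tokens appearing in the diff, whether a bug was
# later filed against it). Entirely invented data for illustration.
history = [
    (["strcpy", "malloc", "free"], True),
    (["strcpy", "index"], True),
    (["logging", "comment"], False),
    (["refactor", "rename"], False),
]

def bug_probability(tokens):
    """Naive frequency model: P(bug | token), averaged over the diff's
    tokens. Unknown tokens contribute nothing; no history at all -> 0.5."""
    buggy = Counter(t for toks, bug in history if bug for t in toks)
    clean = Counter(t for toks, bug in history if not bug for t in toks)
    scores = []
    for t in tokens:
        b, c = buggy[t], clean[t]
        if b + c:
            scores.append(b / (b + c))
    return sum(scores) / len(scores) if scores else 0.5
```

Even this toy version shows why explainability matters: the per-token scores are exactly the evidence a programmer would want to see alongside the prediction.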
Three years ago, we introduced the Commit Assistant prototype, which led to the development of the Clever-Commit initiative.
We got the attention of many software industry leaders, and it eventually led to a partnership with Mozilla to push this innovation further.
From a research standpoint, the objective is to increase the reliability of predictions that a line of code contains a bug.
From a production standpoint, the objective is to make it an assistant in the programmers' pipeline that efficiently catches new code that might be problematic before it is tested. Not only is it more efficient at spotting problems, it also helps our testing teams concentrate on trickier bugs and issues by filtering out "self-contained bugs" early in the process.
It sounds simple, but there is a huge technical challenge in making it applicable. First in the data itself, because the same bug can be documented very differently, and the same goes for the actual lines of code. Second in making it reliable and explainable. As a programmer, you are used to peer reviews or bug reports clearly calling a bug "a bug". Now you need to cope with a tool that reports a 75% probability that your code contains a bug, with an accuracy of 80%. Not only do you need to develop new skills to deal with statistics, you also need to trust this prediction as a human. And to be trustworthy, such systems need to be explainable. This means the challenge of implementing such tools and making them really useful is probably bigger than the purely technical one. What started as a bug-catching prototype 3 years ago has evolved into a dedicated team that now serves almost all our productions around the world.
Another area where this applies is failure prediction. We have millions of daily log lines describing the state of all our services and their interconnections – some of which we manage (our services) and some we don't (the reliability of your ISP). So we work on such techniques to predict that a failure is likely to happen. Once again, it needs to be reliable and explainable, and this still represents a major challenge.
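As a toy sketch of the idea – not of Ubisoft's actual system – a failure predictor can start as simply as watching the recent error rate in a log stream. The window size and threshold below are invented:

```python
from collections import deque

# Toy log stream: each entry is 1 for an error line, 0 for a normal line.
# Window size and threshold are invented illustration values.
def failure_likely(log_flags, window=10, threshold=0.4):
    """Return the indices at which the recent error rate crosses the
    threshold, i.e., where a failure looks likely to be coming."""
    recent = deque(maxlen=window)      # sliding window over the stream
    alerts = []
    for i, flag in enumerate(log_flags):
        recent.append(flag)
        if len(recent) == window and sum(recent) / window >= threshold:
            alerts.append(i)
    return alerts
```

A real system must correlate many interdependent services, but even here the explanation is built in: the alert points at exactly the log lines that triggered it.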
Combining Machine Learning and 3D Scanning
3D scanning is extremely powerful when you need realistic geometry with physically plausible textures that handle all aspects of light. The resulting assets can be used as-is or serve as a baseline, but with caveats. What if you want to 3D scan clothing? What if you want to create a variety of realistic human faces? That's where procedural or Machine Learning techniques can help, by using the captured data to create more data. If you have scanned enough heads, Machine Learning can use them to generate new realistic heads based on your needs: ethnicity, age, gender, etc. Obviously, this can be applied to other assets: stones, objects, etc.
We’ve recently published a paper (see video below) describing how we use such techniques to generate a variety of realistic faces, especially in terms of skin texture. We were able to create tools that artists can use to generate and modify heads with a few parameters that aim to give complete control while generating believable heads. This opens up new possibilities to create a wider variety of high-quality characters efficiently.
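The "captured data creates more data" idea can be sketched as a morphable-model-style blend, where a new head is a weighted combination of scanned ones. The scan data and parameters below are invented placeholders for what would really be thousands of vertices and learned components:

```python
# Toy "scanned heads": each head reduced to three shape parameters
# (invented numbers standing in for thousands of vertices and textures).
scans = {
    "head_a": [1.00, 0.40, 0.70],
    "head_b": [0.80, 0.55, 0.60],
    "head_c": [1.10, 0.35, 0.80],
}

def blend_head(weights):
    """Generate a new head as a normalized weighted combination of the
    scanned ones, in the spirit of morphable-model techniques."""
    total = sum(weights.values())
    dims = len(next(iter(scans.values())))
    return [
        sum(w * scans[name][d] for name, w in weights.items()) / total
        for d in range(dims)
    ]

# A head that never existed, generated from the ones that were captured.
new_head = blend_head({"head_a": 0.5, "head_b": 0.3, "head_c": 0.2})
```

The few blend weights play the role of the "few parameters" artists are given: each one is meaningful, which is what keeps the generated heads controllable and believable.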
Creating Engaging Worlds
We've been working on animations: instead of playing animation clips depending on the context, we generate them on the fly. This allows smoother transitions, as in Assassin's Creed Valhalla, for example.
We've been working on autonomous agents, like vehicles. Instead of programming the agent's behavior, you let it discover the effect of its inputs (steering wheel, brakes, accelerator, etc.) on outcomes (driving in various conditions).
We've been working on interactive real-time fire and smoke simulation to procedurally create all sorts of fires.
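The "discover the effect of inputs on outcomes" approach can be illustrated with a minimal tabular Q-learning sketch on a toy one-dimensional driving task. The states, actions, and rewards are invented; a real driving agent would learn from far richer inputs and physics:

```python
import random

# Toy task: states 0..4 on a straight road, state 4 is the goal. The agent
# is never told what the pedals do; it must discover that action 1
# (accelerate) moves it toward the goal. All values are invented.
def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = (1, 0)                          # 1 = accelerate, 0 = brake
    q = {(s, a): 0.0 for s in range(5) for a in actions}
    for _ in range(episodes):
        s = 0
        while s != 4:
            if rng.random() < epsilon:        # explore: try a random input
                a = rng.choice(actions)
            else:                             # exploit what was learned so far
                a = max(actions, key=lambda b: q[(s, b)])
            s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
            reward = 1.0 if s2 == 4 else 0.0  # outcome observed, not programmed
            best_next = max(q[(s2, b)] for b in actions)
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
            s = s2
    return q

q = train()
# Greedy policy after training, one action per non-goal state.
policy = [max((1, 0), key=lambda b: q[(s, b)]) for s in range(4)]
```

Nothing in the code says "accelerating is good"; that knowledge emerges from trial, error, and reward, which is exactly the contrast with hand-programmed behavior drawn above.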
But really, all of these are examples and should not be seen as a one-size-fits-all solution. It's up to game creators to decide where they want to put focus in the virtual worlds and what the most convenient solutions are to do so. An enticing, believable, and fun-to-play world is not simply the sum of components that each need to be over the top. What makes a believable crowd? The animation of each NPC? The variety of models? The behavior of the crowd "as a whole"? It all depends on the context and what is most relevant to the player experience.
What Are the Benefits of Automated Processes in Numbers?
We do have such numbers, but they'd only be meaningful in our context. What is important to us is that Ubisoft takes the approach of reinvesting all the productivity that we gain from technologies. It means that you might still spend the same resources and effort overall, but you will be able to focus on what really matters.
The benefit has to be seen in terms of variety and scalability. The expected gain is directly related to what could or should be automated at scale. Don't bother with such techniques if you only need to create a few trees in a 2D game. Go for it if you need to create huge forests with a variety of trees.
Another very important benefit is that automating the most boring tasks is also a way to keep our creators motivated. They focus on things of higher impact, and we work hard to create those tools and techniques with them, not for them.
How Production Might Change in the Future
I see three different approaches emerging.
The first relies on automating more and more tasks to shrink the time between an idea and its delivery, which is especially important for live games. It changes the way we work and requires adapting our crafts. If an unforeseen event happens today in the real world you might want to create a related mission in the game, which can be challenging, expensive, or impossible if you need to record actors, create or adapt animations, program behaviours, etc.
The second can be seen as “existing tools on steroids”. It does not change the way we work and are organized, but we do it more efficiently. For example, Photoshop recently released advanced photo editing features that accelerate things such as changing skies or aging people. The same goes with specific features to manage all types of data, from textures to animations, from geometries to VFX, from dialogs to QC.
And finally, it will open up richer interactions. With those techniques, smaller teams who do not have access to a mocap studio or a pool of very talented artists can do what was only accessible to huge production teams a few years ago. This is true in terms of asset creation but also in the ways interactions are shaped. Will the game predict and adapt the difficulty based on your playstyle (and not only your level) in a meaningful and coherent way? Will it empower players to create rich content that fully contributes to the experience? I do not know exactly what the future holds, only that we all need to be prepared to be surprised. And not only incrementally.