LIGHTSPEED STUDIOS' Nan Ma told us about Wobbledoll, an advanced full-body ragdoll system introduced at GDC 2023, discussed the system's features, and explained why machine learning was built into it.
My name is Nan Ma. I’m a Senior Game Engine Developer at LIGHTSPEED STUDIOS. My tech background is physics simulation, and I have more than 15 years of experience in game physics R&D and content production.
Before joining LIGHTSPEED STUDIOS, I was the Technical Director at Future Immersive, which is an indie VR game studio. There, I led the team to create one of the earliest VR multiplayer naval combat games, Furious Seas, which won the GDC Best in Play Award in 2018. At the early stages of my career, I worked at Rockstar Games as a Game Physics Programmer, where I contributed to some of their most critically acclaimed game titles, such as GTA V, Max Payne 3, and Red Dead Redemption 2.
I've always been passionate about game tech R&D and commercializing it in game production. This is why I joined LIGHTSPEED STUDIOS’ research team, which is made up of some of the most talented researchers and engineers from around the world. Here, I can dedicate my time to cutting-edge tech R&D and help production teams deploy these technologies in their projects.
The Current State of Full-Body Physical Interaction Solutions
The current full-body simulated character solution that most commercial game engines offer is the ragdoll system. It is achieved by mapping the character skeleton to an articulated body and running a physical interaction simulation on it. The limitation of such a solution is that the articulated body simulation has no balancing capability, nor can it replicate target motions. Because of that, the application of the ragdoll system is very limited, which is why it is mostly used for death animations.
Unreal Engine offers a partial-body physics-based animation system, which uses a kinematic body or constraint to pin the body root to the animation's coordinates. It then runs the articulated body simulation on certain body parts to replicate the target motion. The advantage of such a solution is that it works around the balancing issue and is able to perform physical interaction to a certain degree. However, pinning the body root is a workaround, and it does come with limitations. For example, this solution cannot handle any full-body physical interaction that involves the root part.
There is a more sophisticated physical animation middleware, Euphoria, developed by NaturalMotion, which showed some promising results in its demos. Unfortunately, access to Euphoria is very limited, and only a few AAA game studios have used this middleware in their games.
For the majority of game developers, there is simply no comprehensive full-body physical animation solution available that can perform a wide variety of physical interactions. This is the technical challenge that the Wobbledoll system tries to solve.
In short, Wobbledoll is an advanced full-body ragdoll system that can perform given actions while balancing. Because of its excellent balance capability, the Wobbledoll character can regain balance after being physically impacted or pushed. On top of that, the Wobbledoll system works cohesively with the kinematic animation system.
Wobbledoll only gets activated when certain physical interactions happen to a character. It blends back to kinematic animation when the interaction is over and the motion has settled. In some extreme cases, where the character gets knocked over, the Wobbledoll system can perform custom motion while falling and blend to a kinematic get-up animation once the character settles on the ground. Given that Wobbledoll can handle all kinds of physical interactions and can seamlessly transition between kinematic animation and physical animation, it can be used as a comprehensive solution for humanoid character interactions.
The Wobbledoll system is heavily inspired by three ML-driven humanoid physical animation research papers: DeepMimic [Peng et al. 2018a], DReCon [Bergamin et al. 2019], and UniCon [Wang et al. 2020].
DReCon is a very cool algorithm that enables training a ragdoll to perform locomotion in simulated environments. Additionally, the ragdoll can traverse rough terrain and withstand impacts from projectiles. DReCon was developed on top of a game engine, and its update pipeline works cohesively with a game’s native pipeline. Wobbledoll’s update pipeline is heavily inspired by the one introduced by DReCon.
DeepMimic is likely the first paper to explore training a ragdoll agent to accurately imitate target motion. Additionally, the trained ragdoll can also carry out assigned tasks, such as walking to a specific destination or kicking a particular target. The DeepMimic simulation environment utilizes a more advanced motor controller and articulated body solver. We implemented the same controller and solver for the Wobbledoll system as well.
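As an illustration of how such imitation objectives work, DeepMimic scores the ragdoll by exponentiating tracking errors between the simulated and reference characters. The sketch below includes only the pose and velocity terms (the full paper also weights end-effector and center-of-mass terms), and the weights and error scales here are illustrative defaults, not guaranteed to match the paper exactly:

```python
import numpy as np

def imitation_reward(sim_pose, ref_pose, sim_vel, ref_vel,
                     w_pose=0.65, w_vel=0.1):
    """DeepMimic-style imitation reward (pose + velocity terms only).

    Each term is an exponentiated squared tracking error, so the reward
    approaches w_pose + w_vel for a perfect match and decays toward zero
    as the simulated character drifts from the reference motion.
    """
    pose_err = np.sum((sim_pose - ref_pose) ** 2)
    vel_err = np.sum((sim_vel - ref_vel) ** 2)
    r_pose = np.exp(-2.0 * pose_err)   # rewards matching joint rotations
    r_vel = np.exp(-0.1 * vel_err)     # rewards matching joint velocities
    return w_pose * r_pose + w_vel * r_vel
```

The exponential shaping keeps the reward dense: even large errors yield a small positive gradient signal instead of a flat zero, which helps early training.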
UniCon uses a dual-layer control scheme for ragdoll control. Its lower-layer controller is similar to DReCon and DeepMimic, taking animation motion as the target and driving the ragdoll to replicate it accordingly. Its upper-layer controller can support a wide range of control mechanisms, such as keyboard input and video input. Hence the name UniCon, which stands for Universal Controller. Another key contribution of UniCon is that it improves training efficiency, robustness, motion capacity, and generalization. Some of its ideas have been implemented in Wobbledoll to improve the character's balancing capabilities.
Implementing Machine Learning Into the System
Ragdoll balancing is one of the most complicated problems in the physical animation field. It involves motion planning, physics-based modeling, robotic dynamics, and control. In recent years, the research trend has been leaning more towards utilizing ML to train ragdolls to balance themselves, and this has achieved some very promising progress. Moreover, there are quite a few open-source ML projects freely available.
For instance, ML-Agents has been developed for the Unity engine. Developers can utilize these ML training frameworks and focus their time more on building and fine-tuning the training environment. Depending on which training framework you are using, the steps might vary here and there. The general steps are:
- Set up the training environment, which includes the simulation and control scheme, and define observations, actions, and reward functions.
- Then, set up the NN parameters, such as network size, policy update algorithm, trajectory batch size, learning rate, etc.
- Then it's down to the actual training process, which can be done on a local machine or on a server cluster as distributed training.
- Once the training is done, it will generate a policy module, which can be deployed in your game application for local inference.
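The first step above, setting up the environment, can be sketched as a minimal Gym-style class. Everything here is a placeholder for illustration, the dimensions, method names, and physics hooks are hypothetical, and the real environment would wrap the engine's actual simulation:

```python
import numpy as np

class RagdollImitationEnv:
    """Minimal sketch of a ragdoll imitation-training environment."""

    OBS_DIM = 120   # e.g. joint rotations, velocities, target-pose deltas
    ACT_DIM = 30    # e.g. per-joint PD target offsets

    def reset(self):
        # Reference-state initialization: start from a random frame of the
        # target animation so the agent experiences all phases of the motion.
        self.phase = np.random.uniform(0.0, 1.0)
        return self._observe()

    def step(self, action):
        assert action.shape == (self.ACT_DIM,)
        # 1. Convert the action to motor targets and advance the simulation.
        # 2. Score how closely the simulated pose tracks the animation.
        obs = self._observe()
        reward = self._imitation_reward()
        done = self._has_fallen()
        return obs, reward, done, {}

    def _observe(self):
        return np.zeros(self.OBS_DIM)   # placeholder state vector

    def _imitation_reward(self):
        return 1.0                      # placeholder tracking reward

    def _has_fallen(self):
        return False                    # e.g. head height below a threshold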
The Run-Time Module
The system consists of two parts, the training module and the run-time module. In the previous question, I talked about the training steps, so here I can elaborate more on how the run-time module works. Once the training session is complete, it will generate a policy module, which can be loaded into the game for local inference at run time. The run-time pipeline is straightforward. It starts with a kinematic animation update. Then the animated pose and current simulated pose can be passed to the trained neural network as input, which will output a control signal for ragdoll manipulation. Such a signal can then be sent to physics update for control and simulation. Optionally, some further motion blending or pose adjustment can be done during the post physics update before sending the character to rendering. Engineering-wise, the only additional implementation is to perform the local inference and apply the action to the target pose. Kinematic animation and physics simulation/control are usually provided by the commercial game engines.
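The per-frame run-time pipeline described above can be sketched as a single update function. Here `animate`, `policy`, `simulate`, and `blend` are hypothetical stand-ins for the engine's animation system, the trained network's local inference, the physics step, and the post-physics blend:

```python
import numpy as np

def wobbledoll_frame_update(policy, animate, simulate, blend, char):
    """One frame of the run-time pipeline: animation -> inference ->
    physics control/simulation -> optional post-physics blending."""
    target_pose = animate(char)                          # kinematic animation update
    obs = np.concatenate([target_pose, char.sim_pose])   # NN input: target + simulated pose
    action = policy(obs)                                 # local neural-network inference
    char.sim_pose = simulate(char, target_pose, action)  # physics control + simulation
    return blend(char, target_pose)                      # post-physics pose adjustment
```

The ordering mirrors the description: the network always sees both the animated target and the current simulated state, so its output can steer the ragdoll back toward the animation.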
On the training side, as mentioned earlier, there are ready-to-use ML training frameworks that support most of the mainstream ML algorithms. The development team can utilize one of them and integrate it with their training environment. The training and tuning part, however, might be the most challenging and time-consuming part of the process, as each session can easily take a few days to complete, depending on how many environment instances can run in parallel. One useful trick is to use tuning values provided by research papers, such as DReCon or DeepMimic, as a starting point and go from there. If hardware bandwidth allows, multiple training sessions can be kicked off with different tuning values to speed up the process.
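Kicking off multiple sessions with different tuning values is essentially a small hyperparameter sweep. The sketch below builds one launch command per combination; `train.py` and its flags are hypothetical placeholders, not a real framework's CLI:

```python
import itertools

def sweep_commands(learning_rates, batch_sizes):
    """Build one training command per hyperparameter combination.

    Each returned command could be launched with subprocess.Popen so the
    sessions run in parallel, hardware permitting.
    """
    cmds = []
    for lr, bs in itertools.product(learning_rates, batch_sizes):
        cmds.append([
            "python", "train.py",
            f"--lr={lr}",               # learning rate for this session
            f"--batch-size={bs}",       # trajectory batch size
            f"--run-id=lr{lr}_bs{bs}",  # unique ID so results don't collide
        ])
    return cmds
```

Seeding the sweep around published values (rather than sampling blindly) keeps the number of multi-day sessions manageable.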
We are planning to improve the Wobbledoll system in two key aspects. One is to improve the motion quality by allowing the agent to control motor strength and stiffness. This is similar to how the human muscle system works, as our muscle tensions can change depending on the activity.
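The idea of agent-controlled motor strength can be illustrated with a PD motor controller whose gains are part of the action rather than fixed constants. All names and values below are illustrative, not Wobbledoll's actual controller:

```python
import numpy as np

def pd_torque(q, qdot, q_target, kp, kd, torque_limit):
    """PD motor torque for one joint.

    If the policy outputs kp/kd per joint alongside the pose target,
    motor stiffness becomes part of the action, loosely emulating how
    muscle tension varies with the activity being performed.
    """
    tau = kp * (q_target - q) - kd * qdot   # spring toward target, damp velocity
    return np.clip(tau, -torque_limit, torque_limit)
```

With fixed gains, the same controller is either too stiff for relaxed motions or too weak for forceful ones; exposing the gains to the policy lets one network cover both regimes.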
The other direction we are looking into is how to improve generalizability. Recent papers have proposed a new approach using Generative Adversarial Networks to improve generalizability. We are currently experimenting with these ideas to see whether they work well with the Wobbledoll system.