Jehee Lee gave an overview of his research on an innovative approach to physics-based simulation of biped characters that uses deep reinforcement learning.
I am a professor at Seoul National University and a computer graphics researcher with 25+ years of experience. I have been exploring new ways of understanding, representing, and simulating human/animal movements.
Physics-Based Simulation of Biped Characters
The physics-based simulation of biped characters has been a notorious open problem in robotics and computer graphics since the mid-1980s. In the 1990s, most biped controllers were based on a simplified dynamics model, such as an inverted pendulum, for which a balancing strategy can be derived as a closed-form equation. From 2007 onward, controllers using a full-body dynamics formulation emerged and drove rapid advances in the field. Notably, optimal control theory and stochastic optimization methods, such as CMA-ES, have been the major tools for maintaining the balance of simulated bipeds. Along the way, researchers have built progressively more detailed human body models. In 1990, the inverted pendulum model had fewer than 5 degrees of freedom (DOF). In 2007, the dynamics model was a 2D stick figure driven by motors at the joints, with dozens of DOFs. In 2009-2010, full 3D models with a hundred DOFs emerged. In 2012-2014, controllers for muscle-actuated biomechanical models emerged. Such a controller sends an excitation signal to each individual muscle at every time instant to stimulate it. The contraction of the muscle pulls the attached bones to make them move. In our work, we used 326 muscles to move the body, which covers all the major muscles of the human body except for the small muscles in the feet and hands.
Complexity of Biped Character Movement Control
The number of DOFs in the dynamics system has increased very rapidly since 2007. Previous approaches to controller design suffered from “the curse of dimensionality”, which means that the required computational resources (time and memory) increase exponentially as the number of DOFs increases. We employed deep reinforcement learning (DRL) to address the complexity of the musculoskeletal model and the scalability of biped control. Deep networks can effectively represent and store high-dimensional control policies (the functions that map states to actions) and explore unseen states and actions.
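To give a feel for the exponential growth mentioned above, here is a minimal sketch. It assumes, purely for illustration, that a controller tabulates the state space on a grid with 10 bins per DOF; the function name `grid_cells` and the bin count are hypothetical and not taken from our work.

```python
def grid_cells(bins_per_dof: int, dof: int) -> int:
    """Number of cells in a uniform grid over a state space with `dof` dimensions."""
    return bins_per_dof ** dof


# DOF counts roughly matching the eras described in the text.
for dof in (5, 30, 100):
    cells = grid_cells(10, dof)
    # len(str(cells)) - 1 is the exponent of a power of ten.
    print(f"{dof:>3} DOF -> 10^{len(str(cells)) - 1} grid cells")
```

A table over 5 DOFs is easy to store, while one over a hundred DOFs is far beyond any memory, which is why a compact function approximator such as a deep network is attractive.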
The key improvement is the way we handle muscle actuation in the full body. We built a hierarchical deep network. The upper layer learns to simulate the joints of the articulated figure at a low frame rate (30 Hz), while the lower layer learns to stimulate the muscles at a higher frame rate (1500 Hz). Muscle contraction dynamics demand higher accuracy than the skeletal simulation, and our hierarchical structure addresses this discrepancy in requirements.
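The two-rate structure can be sketched as a nested loop. This is only a toy illustration, not our network: the 30 Hz and 1500 Hz rates come from the text, while `upper_policy`, `lower_policy`, `simulate_step`, and the scalar "state" are hypothetical stand-ins for the learned layers and the physics engine.

```python
POLICY_HZ = 30    # upper layer rate (from the text)
MUSCLE_HZ = 1500  # lower layer rate (from the text)
SUBSTEPS = MUSCLE_HZ // POLICY_HZ  # 50 muscle updates per joint-level update


def upper_policy(state):
    """Hypothetical stand-in for the upper layer: returns joint-level targets."""
    return [-0.5 * s for s in state]


def lower_policy(target, state):
    """Hypothetical stand-in for the lower layer: activations clamped to [0, 1]."""
    return [min(1.0, max(0.0, t - s)) for t, s in zip(target, state)]


def simulate_step(state, activations, dt):
    """Toy dynamics (assumption): each activation nudges its coordinate."""
    return [s + a * dt for s, a in zip(state, activations)]


def run(seconds, state):
    """Run the two-rate loop and count how often each layer is queried."""
    upper_calls = lower_calls = 0
    dt = 1.0 / MUSCLE_HZ
    for _ in range(int(seconds * POLICY_HZ)):
        target = upper_policy(state)  # evaluated at 30 Hz
        upper_calls += 1
        for _ in range(SUBSTEPS):     # evaluated at 1500 Hz
            activations = lower_policy(target, state)
            lower_calls += 1
            state = simulate_step(state, activations, dt)
    return upper_calls, lower_calls, state
```

Calling `run(1, [0.1, -0.2])` queries the upper stand-in 30 times and the lower one 1500 times, which mirrors how the expensive joint-level decision is made rarely while the muscles are updated at the finer time step.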
What the Approach Allowed to Achieve
It is exciting to see our algorithm work for a wide spectrum of human movements. We don’t know how wide the spectrum is yet, and we are trying to understand its limitations. We haven’t reached any boundary yet because of the limited computational resources available to us. The algorithm shows new, improved results whenever we put in more resources, mostly CPU cores. The good thing is that reinforcement learning is computationally demanding only in the learning phase. Once the control policy is learned, simulation and control at run time are fast. Musculoskeletal simulation will soon be available in real-time interactive applications, such as games.
Simulating Contraction of Muscles
We use a Hill-type muscle model, which is a de facto standard in biomechanics. Our algorithm is very flexible, so any muscle contraction dynamics model can be incorporated. The use of a highly accurate muscle model allows us to generate high-fidelity human motions under varied conditions, including pathologic gaits, prostheses, and so on.
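For readers unfamiliar with Hill-type models, here is a minimal sketch of the general form: muscle force is the maximum isometric force scaled by an active term (activation times force-length and force-velocity factors) plus a passive elastic term. The curve shapes and constants below are simplified illustrative assumptions, not the curves used in our simulation, and all function names are hypothetical.

```python
import math


def active_force_length(l_norm, width=0.45):
    """Bell-shaped factor peaking at the optimal fiber length (l_norm == 1)."""
    return math.exp(-((l_norm - 1.0) / width) ** 2)


def force_velocity(v_norm):
    """Crude linear factor: 0 at maximum shortening speed (v_norm == -1),
    1 when isometric (v_norm == 0), capped at 1.5 while lengthening."""
    return max(0.0, min(1.5, 1.0 + v_norm))


def passive_force_length(l_norm, k=4.0, strain_at_fmax=0.6):
    """Exponential passive stiffness, active only beyond the optimal length."""
    if l_norm <= 1.0:
        return 0.0
    return (math.exp(k * (l_norm - 1.0) / strain_at_fmax) - 1.0) / (math.exp(k) - 1.0)


def muscle_force(activation, l_norm, v_norm, f_max):
    """Hill-type force: f_max * (a * f_L * f_V + f_P)."""
    active = activation * active_force_length(l_norm) * force_velocity(v_norm)
    return f_max * (active + passive_force_length(l_norm))
```

With these assumed curves, a fully activated muscle held at its optimal length produces exactly `f_max`, and an inactive muscle at that length produces no force. Swapping in a different contraction model only means replacing the three curve functions.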
Utilizing Deep Reinforcement Learning (DRL)
We share the same fundamental idea with DeepMind’s locomotion study, which is based on a stick-and-motor model. Surprisingly, the standard DRL algorithm works well with a stick-and-motor model, but it doesn’t do well with muscle-actuated models. You may check out the state of the art prior to the publication of our work below. The NeurIPS 2018 AI for Prosthetics Challenge was held last year. Its model has only 20+ muscles, but even the winner of the competition didn’t walk well.
This example shows the difficulty of learning muscle-actuated locomotion. Our hierarchical network model is a breakthrough that makes it possible to apply DRL to a biomechanical human model with many muscles.
Results of Research
The performance, stability, controllability, and visual quality of physics-based characters have improved significantly. I often cannot distinguish between motion capture data and physics simulation in side-by-side comparisons. Physically simulated characters will soon be available for real-time interactive applications as well as VFX projects.