Game studios Myrkur Games, Hazelight Studios, and V1 Interactive talked about their approach to realistic animation with the help of mocap technology and Vicon.
About The Darken
The Darken is a single player fantasy action-adventure game focused on highly immersive storytelling and engaging third-person gameplay.
Our goal for the project was to let creative freedom take charge and do whatever we want, as long as it feels new and exciting. So when we say “fantasy game” we don’t really mean you should expect elves, magic wands, or gods from ancient mythology – instead, we’re making something that we hope will feel like a breath of fresh air to fantasy fans, while establishing a unique identity for the world of The Darken.
In the game, you take on the role of Ryn, an adept fighter and protector of the heir to a powerful kingdom during a period of political turmoil. Ryn is in many ways defined by this role, having been raised from birth to wield her unique ability to unmake the weaves that compose the world. She is known and feared as one of the most terrifying weapons in the crown’s arsenal and is wielded indiscriminately against its enemies. The game’s story begins when Ryn is tasked with assassinating a foreign prince, but all is not what it seems. The assassination sets in motion a series of events that force her to question everything she thinks she knows about the world, and about herself.
Just a few years ago, making a game like The Darken with such a small team would have seemed foolish, if not just completely impossible. But with recent leaps in technological advancement across almost every field of game development, as well as the increased availability of immensely powerful tools and game engines that now handle so much of the work, the landscape today is totally different. And we’re very excited about it!
Animation & Mocap
Animations make up a large part of the production for The Darken, so they are a huge topic for us. While there are some pretty good pre-made animations available on asset stores, they aren’t a one-size-fits-all kind of thing. We needed to make our own animations, and doing them all by hand seemed impossible. We started out by buying a single inertial capture suit. After some testing, we found out pretty quickly that inertial suits weren’t the way to go for this project. We needed better data, and we needed to support the tracking of multiple actors and props in a single scene, all at once.
We reached out on a Facebook group called “Motion Capture Society”, where we got some solid advice and recommendations and decided to look for an optical motion capture solution instead. We got in touch with Vicon, who came highly recommended by the group. They offered to fly us out to their headquarters in Oxford, where they demoed a replica of the setup we had in mind – giving us a very accurate look at what we could expect from our captures. And honestly, the results we saw were staggering. The data was remarkable, and their software Shogun takes care of so much of the work, including great occlusion handling and even real-time streaming from the capture right into the game engine. We realized then and there we needed to work with Vicon, and we’ve been working with them ever since!
The setup we built is a 100 sqm dedicated studio with a 60 sqm capture volume and decent ceiling height. We currently use 12 Vero 2.2 cameras, one monitoring camera, and a single powerhouse computer to run everything. It may seem that you’d need way more than 12 cameras for that space, but honestly, the hardware in those cameras paired with Vicon’s great software gives us phenomenal, clean data all around, even with multiple actors. We also sound-proofed the studio and treated the walls with acoustic foam to improve the quality of our sound captures. For the flooring, we bought some basic, affordable gym mattresses, which have worked very well at dampening the sound of footsteps while providing a soft landing for our more action-heavy mocap.
We built the studio literally next door to our office, which has turned out to be very convenient. Having our own studio so close by means it’s very easy for us to mocap whenever we need to. Even our devs will jump in to record some animations, as it’s a very quick and easy process to hop into a suit and start recording by yourself.
Cutscenes and Dialogue
For our cutscenes and dialogue, we record the body, face, and voice of our actors all at once. To capture the body we use the Vicon system, while for the face we use a facial-capture helmet equipped with a camera from Faceware, plus some simple LED strips to provide consistent lighting on the actor’s face. As for sound, we mount wireless lavalier mics on our actors’ helmets, which has worked really well and gives us isolated captures from each actor. Cutscenes and dialogue are probably our most straightforward captures, as the data is already really good and does not require much processing or cleanup. Most of the work is fine-tuning the data as necessary, adding finger movements (although Vicon just released a beta of Shogun with finger tracking support!), and making sure our props behave well in the final assembly.
The Darken has quite a lot of third-person combat, so there’s a lot of ground to cover for all of the different animations required for the main character, all of the side characters, and even the different enemy types you’ll encounter. These animations are going to be played hundreds or even thousands of times during a single playthrough of the game, so they need to be very satisfying to watch. We bring on combat choreographers and stunt actors to perform these movements to get the best results from the captures themselves, but we still work quite a lot on every animation to get it to match the intended feel of the game (which isn’t as grounded as reality). We record each motion as an individual animation (such as an attack, sidestep, or parry) using Shogun, process it in Shogun Post, and then refine and adjust it to match our visual target and technical requirements (such as speed, displacement, etc.). We iterate on this step a lot until the animation feels just right and really satisfying to watch inside the game itself.
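To illustrate the kind of adjustment described above, here is a minimal, hypothetical sketch of retiming a clip’s root motion to hit gameplay targets for speed and displacement. The keyframe format and function name are stand-ins for illustration only, not the team’s actual Shogun Post or engine-side tooling.

```python
# Hypothetical sketch: uniformly rescale a captured clip's duration and
# root displacement so it matches gameplay requirements.
# Keyframes are (time_seconds, root_position) pairs along one axis.

def rescale_clip(keys, target_duration, target_displacement):
    """Scale keyframe times and root positions so the clip ends at the
    requested duration and total displacement."""
    t0, p0 = keys[0]
    t1, p1 = keys[-1]
    time_scale = target_duration / (t1 - t0)
    disp_scale = target_displacement / (p1 - p0)
    return [((t - t0) * time_scale, (p - p0) * disp_scale) for t, p in keys]

# A 1-second, 2-meter capture retimed to a snappier 0.5 s / 1.5 m version:
fast = rescale_clip([(0.0, 0.0), (0.5, 0.8), (1.0, 2.0)], 0.5, 1.5)
```

In practice the scaling would rarely be uniform like this (attack wind-ups and recoveries are usually retimed separately), but the principle of remapping capture data onto the game’s technical targets is the same.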
Locomotion is something we’re still working on getting to feel great. Initially, we had the same approach as with our action captures, recording individual movements and transitions. However, we constantly ran into overlapping feet, hovering characters, and just downright strange and unexplained behavior. Locomotion is hard! We didn’t quite achieve the result we’d been hoping for, so we did some digging and learned all about something called motion matching, which completely blew our minds. For anyone interested, there’s a great talk by Kristjan Zadziuk on the topic.
In essence, instead of recording individual animations, you record a whole session of an actor moving in all the different ways you imagine your character should move. Then you take that entire clip and use it as a database of animations to query. When the character is moving inside the engine, it checks “what is my character doing right now?” and queries the database for a matching motion (hence the name) in the recording. We tried it out last week, and with a single day of testing, we already had a very believable, playable character inside the engine. It felt like magic! We’ll touch on this again in the fall, once we wrap up our upcoming story shoots.
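The core query described above can be sketched in a few lines. This is a simplified assumption of how a motion-matching lookup works, not the studio’s actual runtime: each frame of the capture is reduced to a feature vector (here just root velocity), and the engine picks the frame whose features best match the character’s current state and desired trajectory.

```python
# Minimal motion-matching sketch (illustrative; the feature set is a
# simplified assumption - real systems also use foot positions, future
# trajectory points, etc.).
import math

def feature_distance(a, b):
    """Euclidean distance between two pose feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(database, query):
    """Return the index of the capture frame whose features best match
    the query (current pose + desired movement)."""
    return min(range(len(database)),
               key=lambda i: feature_distance(database[i], query))

# Toy database of per-frame (vel_x, vel_y) features:
# standing, walking forward, running forward, strafing.
capture = [(0.0, 0.0), (0.0, 1.4), (0.0, 3.5), (1.0, 1.0)]
frame = best_match(capture, (0.1, 1.3))  # player pushes the stick forward
```

A real implementation would jump playback to the matched frame (with blending) and re-query every few frames, which is what makes the result feel continuous rather than clip-based.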
Animation of Photorealistic Characters
Photorealistic characters are extremely tricky to bring to life within a game engine and there’s a wide variety of topics here that I could address, but I’ll try to keep it as direct as possible.
Step 1: Creating the character models and rig
To create our digital doubles, we start by photoscanning our actors in our in-house photoscanning rig. We’ll typically scan a wide range of expressions with our actors to try and cover the whole range that we want to be able to recreate inside the game. Once we’ve processed all the different scans into optimized blend shapes, we hook them up to our custom facial rigging system (a FACS-based approach).
For hero level characters, we’ll do a combination of joints and multiple blend shapes, but for our side characters, we can often get away with much simpler, joint-based rigs.
Step 2: Animating the rig
Once the character rig is set up, we’ll video record an extensive ROM (range of motion) with the actors using facial capture helmets. We have them do various phonemes, expressions, and lines of dialogue in different emotional states, which we then feed into Faceware, where we analyze and track their facial movements. Once we’ve tracked everything, we then pose the rig to match the different facial expressions from the footage, which essentially teaches Faceware how to match and move the rig according to the analyzed footage. After we’ve done this once, any additional recorded facial footage can be processed automatically in batches with little to no added manual labor.
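The learn-once, batch-forever idea above can be sketched as a pose-based retargeting step. This is a hypothetical stand-in for illustration, not Faceware’s actual algorithm: each trained example pairs the tracked facial features with the rig pose the animator authored for them, and new frames are solved as a distance-weighted blend of those trained poses.

```python
# Illustrative sketch of pose-based facial retargeting (a simplified
# assumption of the workflow described above, not Faceware's internals).

def solve_frame(trained, features, eps=1e-6):
    """trained: list of (feature_vector, rig_pose) pairs built during the
    ROM session, where each rig_pose is a list of rig control values.
    features: tracked features for a new frame of footage.
    Returns a rig pose blended from the trained examples."""
    weights = []
    for feat, pose in trained:
        d2 = sum((a - b) ** 2 for a, b in zip(feat, features))
        if d2 < eps:               # near-exact match: use the authored pose
            return list(pose)
        weights.append((1.0 / d2, pose))  # closer examples weigh more
    total = sum(w for w, _ in weights)
    n_controls = len(trained[0][1])
    return [sum(w * pose[i] for w, pose in weights) / total
            for i in range(n_controls)]
```

Because the mapping is defined once per actor/rig pair, every additional take can be pushed through it unattended, which is what makes the batch processing mentioned above cheap.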
Capturing Facial Animations: Challenges
Getting facial animations right was definitely one of our biggest development challenges. The human face is something we’re all very familiar with, and even the slightest expression reads clearly, since the human eye has evolved to pick up the fine details in faces. This, of course, leaves very little margin for error. The most important lesson, as always, is to pay close attention to the reference footage and iterate on the results as much as possible. One of the primary reasons we opted to build our own photoscanning rig was to remove the reliance on third-party scan services. We needed to be able to do multiple scans and rescans if necessary to get things just right. Plus, with a large cast and lots of background actors, the only way to keep up character fidelity throughout the game is to photoscan as much as possible.
One of the challenges we faced was getting normal maps to work with our rigs inside the engine to add more detail to different expressions. We pack the normals from a few scans into optimized normal textures and then mask out the specific part of the face we’re targeting with a given joint movement or blendshape change. Through that, we can seamlessly blend in extra normal-map detail as necessary, such as wrinkles during an eye squint or a smile. What helped us here was setting up a live link with Unreal Engine 4, so that we could easily preview the final result with all of the correct texture maps and shaders on top.
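The per-pixel math behind that wrinkle-map blend can be sketched as follows. This is a simplified, hypothetical version written as plain Python for clarity (the real thing lives in a UE4 material): each expression contributes a wrinkle normal map, gated by a painted region mask and the live rig weight for that expression.

```python
# Simplified wrinkle-map blend for one pixel (illustrative assumption,
# not the studio's actual shader). Normals are tangent-space (x, y, z).

def blend_normal(base, wrinkles, eps=1e-9):
    """base: the neutral tangent-space normal for this pixel.
    wrinkles: list of (wrinkle_normal, mask_value, rig_weight) tuples,
    where mask_value localizes the effect (e.g. crow's feet region) and
    rig_weight is the live strength of the expression (0..1).
    Returns the renormalized blended normal."""
    x, y, z = base
    for (wx, wy, wz), mask, weight in wrinkles:
        a = mask * weight          # how strongly this expression applies here
        x += wx * a
        y += wy * a
        z += (wz - 1.0) * a        # add deviation from the flat normal (0,0,1)
    length = (x * x + y * y + z * z) ** 0.5
    return (x / (length + eps), y / (length + eps), z / (length + eps))
```

With all rig weights at zero the base normal passes through unchanged, which is exactly the behavior you want so the neutral face isn’t affected by packed wrinkle data.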
There isn’t really a decent handbook out there on rigging and animating photoreal characters, especially since the technology and tools are constantly evolving, and most larger studios either outsource to specialized studios or use their own custom solutions, which won’t necessarily work for others. There is a lot of trial and error in the process, such as getting the blend shapes to work together nicely and in an optimized enough way to run smoothly with the rest of the game.
In Closing
As a small team working on such an ambitious product, we constantly have to innovate and find new ways to achieve what is usually done by much larger teams. For us, it’s been very important for everyone on the team to be adaptable and willing to find new solutions to big problems. With things like motion capture, photogrammetry, advancements in game engines, and lots of new powerful tools within reach, smaller teams are now able to do much bigger things than ever before. I think that, as a result, we’re going to see more high-quality games from smaller teams in the coming years. Hopefully, that means we’re going to see new types of games, genres, and products that dare to step outside the comfort zone of established IPs and game sequels. And that’s something we’re really excited about.
Hazelight Studios
Emil: We always try to replicate the digital world as best as possible for the actors when recording scenes – it helps with the immersion. We were also able to preview the scenes with the game characters and the level geometry in MotionBuilder, which helped us immediately visualize how well the captures would fit and feel in the digital world. During the capture sessions, we always made sure to have as few – but relevant – people as possible, and we always felt comfortable making quick and sometimes drastic decisions. A big advantage was that the capture, selects, tracking, and retargeting were all handled in-house by me, so we could have a very quick turnaround time and flow of information within the animation department. Having good, easily accessible reference media also helped the animation team achieve the vision we had during recording.
Utilizing Motion Capture Systems
Emil: Since the beginning of Brothers we have been using the OptiTrack system, which has served us well. After A Way Out, though, I got the opportunity to build our new motion capture studio and searched for the best fit for the requirements we had in mind. During A Way Out we actually never did performance capture, which was one of the things we now wanted to support. When setting up a studio for performance capture, it’s very important to look at the bigger picture and how everything ties together within the pipeline for body, face, and audio capture, with timecode to prevent all forms of eyeballing and manual syncing of data. This especially applies to a smaller studio like ours, where the mindset is to not throw people at problems. A key feature for achieving this was the versatile scripting API provided by Vicon, which allows us to do exactly what is needed.
A motion capture system, for me, is nowadays not so much about the hardware as about the software and workflows around it, and with the recent release of Shogun you can really see that Vicon has been working hard on improving that.
One of the bigger differences is that we all feel much more confident in the captures and can focus more on the acting rather than stumbling on technical stuff.
[80lv: Did you ever consider using inertial sensors? Do they actually give you any advantages?]
Emil: While I do appreciate the advantages of an inertial-based system where I can take the space constraint of the stage out of the equation for certain moves, I believe that at this time optical is still advantageous for cinematic recordings. I think it would be optimal to have both, utilizing what is best for the situation. I can see why many studios that don’t have their own motion capture studio invest in an inertial suit where they quickly can capture animations for themselves.
With our requirements for performance capture and the more cinematic recordings in mind, an optical system was an easy choice.
What Will Come in the Future?
Josef: I think everything has to and will get easier in the future. The dream is to be able to record a performance and send it straight into the game. Just imagine the complex and emotional scenes you want to use real actors for, and the ability to record them anytime, anywhere, in any environment.
Emil: I think there are a lot of things that can and will be improved upon in the capture process, and that’s something we have seen increasing over the last couple of years with, for example, machine learning and improved finger solving. I think there’s still room to automate many procedures in the data cleanup process, which I’m sure will be closer to fully automated in the future.
Even as we get increased quality and accuracy in the captures moving forward, I doubt we will see a decrease in the need for animators to stylize and exaggerate the captures, as I’m sure we will keep on being creative in the style of games we all want to create.
V1 Interactive
The answers below were given by Marcus Lehto, V1 Interactive’s President and Creative Director. A 20+ year veteran of AAA games and co-creator of the Halo universe, Marcus founded V1 Interactive – a studio filled with passionate and talented developers dedicated to making great, high-quality games. When not making games, Marcus can be found hiking through the mountains or motorcycling off into the horizon.
Private Division and V1 Interactive announced Disintegration, an upcoming sci-fi first-person shooter that will be fully unveiled next month at gamescom 2019. Disintegration is the debut title from V1 Interactive, the independent development studio co-founded in 2015 by Marcus Lehto, former creative director at Bungie and co-creator of Halo.
“The opportunity to create not only a new game but this entire studio has been exhilarating,” said Marcus Lehto, President and Game Director at V1 Interactive. “It is great to be able to share what this amazing team has been working on, and we can’t wait to introduce this new game that our team has built to the world next month.”
V1 Interactive is a studio I created with the main purpose of pulling together a small but healthy mix of seasoned and new game devs to build games in a collaborative, hands-on atmosphere where everyone plays a critical role, has a voice and takes ownership over what we’re creating together.
I spent over two decades building major franchises like Myth and co-creating the Halo universe from scratch, along with helping build Bungie from a small studio into the giant it is today. But, as the studio grew, the people most responsible for the vision of projects became less involved in the creation, which is in direct opposition to how I prefer to operate.
So, I left on good terms with my old friends at Bungie and spent nearly two years exploring game ideas and deciding what I wanted to do next. I even entertained several high-level studio leadership offers at other studios before finally deciding to take on the creation of new game concepts by starting V1.
Fortunately, I found a few dedicated students willing to take a leap with me to build a prototype. We pitched it around and eventually entered into a great relationship with Private Division as our publisher. While the studio grew rapidly after that, one of our biggest challenges was keeping a check on all our amazing creative ideas so our smaller studio could manage the big concepts and make them a reality.
Building an AAA game in a Small Team
The vast amount of experience we have in making high-quality AAA games has made some things much easier. We’re in an era now where game developers have matured and fully understand the very complex production and constant evolution of making games. As a small shop, we have less middle management of tasks, so people stay productive and work stays fluid.
Efficiency with anything we build is key, so finding the right tech to get the job done right was very important to us. We decided to license an engine, Unreal 4, rather than build one ourselves, and we use some of the best equipment out there — like our Vicon mocap setup — to give us great results with minimal effort.
Approach to Animation
I’m a big believer in taking several passes on every creative idea to find the nugget of what works best. This is true for nearly everything creative. So, iteration with animation is key to finding the best possible results.
With only two animators and one tech artist, we’re always searching for ways to achieve large volumes of high-quality animation that suit the needs of our project. We still hand key many aspects of core game animation, but utilizing a mocap system like Vicon’s helps us get 60-70% of the way there for many complex motions while saving us time.
Where mocap shines in our pipeline is for pre-visualization of our cinematics. The high fidelity mocap we get from the Vicon Vero cameras allows us to stand up performances quickly for iteration. Our cinematic lead can then easily shoot around these performances using UE4’s sequencer tools. Because the setup process to capture is so easy, we can experiment with acting choices without pressure and have video reference to boot, using the Vicon Vue’s video overlay. Being able to have that flexibility for changes and experimentation is key to us.
Help from VICON
Because we’re a small studio with a limited budget, we shopped around for several options and were very excited when we found that Vicon had a newer, affordable solution aimed specifically at studios like us. The other solutions we had been looking into were similar in price but didn’t offer the rich features we get with such an industry standard as Vicon. It was an easy choice for us.
Vicon has been a good partner to us. The satisfaction we had with the purchase negotiation, personalized attention, installation, and training was unprecedented. We were initially concerned about whether we would be able to afford such a quality system, but it didn’t take long for us to start using it regularly and realize how much it positively impacted our ability to efficiently create great content.
Now we throw on the mocap suits regularly just to mock up a shot or try out a new gameplay mechanic that requires some cool new moves. It’s so easy and effortless for us to use the system that I can’t imagine us not having it.
What Will Come in the Future?
I think the future of animation for games and even film is going to change dramatically over the next decade. Improved motion capture of actors and the means to attain high-quality, clean, usable footage, along with great tools to process it, are inevitable. Where I think we’ll see the biggest changes is in how our software will be able to interpret that mocap data to create procedural, AI-driven results that are still actor- and artist-driven, but far more responsive and believable to a player’s input or a director’s guidance.
Interview conducted by Kirill Tokarev