NVIDIA's Richard Kerris, Vice President of Developer Relations and Head of Media and Entertainment, talked about how AI will influence the industry and how artists can benefit from the new tech.
Richard Kerris, Vice President of Developer Relations, Head of Media and Entertainment
Intro
We saw 2023 as a major turning point when generative AI started to really produce beautiful imagery quickly and efficiently. The challenges people have expressed, such as how you control it and how you craft and fine-tune it, are exactly what the companies on our platforms have been focused on. I think what we're going to see in the next year is a lot of focus on generative AI in particular as a tool to be used in production.
So one of the challenges right now is that if you wanted to use generative AI in a feature film or other content you were creating, it's kind of hard to wrangle. You go in, you prompt it, you can fine-tune it a little bit, but you really can't get your hands in there to say, "I want to change this part or that part." But we're seeing a lot of companies address exactly that. They're now allowing you to control various aspects of the imagery or the videos being produced. I think 2024 is going to see this happen more and more, so that it's not just a black box where you wonder what you'll get and end up with a pretty picture. It's going to be more like a tool you would use in production, and it's going to be a cost-saving tool.
One of the things I always hear about AI from people who don't fully understand what it can do is: is it going to take jobs, is it going to replace things, and so on? No, not at all. I think it's going to open up a world of opportunities. It's going to help democratize content creation, bringing its complexities within reach of parts of the world that have never had access to it, whether that's because they can work via the cloud and learn there, or because they can use easy-to-understand prompting. They can prompt, modify, and start to visualize and tell their story. So, like so many others here, I'm very excited that we're just at the tip of the iceberg of what all this capability is going to mean for content creators and coders.
In general, the other aspect is to look at what this means, from cost savings to writing software products. No, you won't write Photoshop by prompting a chat AI anytime soon, but what you will start to be able to do is smaller, task-related things: I need a tool that allows me to do XYZ, and you'll be able to prompt it, create it, and start using it on your platform.
I think these two areas are the most exciting for the developer community out there and for the content creators who use those development tools in their workflows. It's going to simplify many complexities, like rotoscoping. We're seeing startups like Mars that specialize in de-aging, if you're familiar with some of the de-aging work that has been done.
It's been happening for many years. Lucasfilm, where I used to work, was doing de-aging back when I was there, but it was done manually: rotoscopers were there to remove wrinkles and handle other touch-ups. Over time, studios have been able to train models on an actor's existing footage and say, "I want this particular actor to be 20 years younger," as we saw in The Irishman.
Now, third-party companies are making similar tools commercially available using AI. These allow you to go in and specify, "I want this actor to look subtly younger or older, or whatever you prefer." A company in our Inception program called Mars is involved in this space. We're also seeing other companies, like Runway, offering generative AI tools accessible via the cloud. This means you can use them from your desktop, laptop, or even your mobile device and have access to powerful GPUs in the cloud to create content. These are just a few of the areas, and I could go on for hours and hours.
How much time are we going to save by using these new technologies?
I think a lot of these tools will start in the background, meaning they'll free up more attention for the foreground, the hero shots being worked on. In a typical scene, you might have a car driving down a road, other cars parked on the side of the road, trees behind them, and buildings. Traditionally, you would have people go in and model all of that, or create plates for it.
Using AI tools, you'll be able to prompt that and say, "I need a bunch of parked cars, and I need this." And the computer will be able to go create it using the different software products out there on our platforms, allowing more attention and focus to go to the hero shots. This is where AI takes over the tedious, task-based work, whether it's background imagery and content creation or even rotoscoping and other things that typically mean somebody sitting at a screen for a long time, working frame by frame.
If you can automate all of that using AI, you then tip the scale over to where the real focus is going to be on the main characters and the main creatures, etc., that you're going to have in your particular project. So, I think that's one where we'll see time savings, but we'll also see a higher degree of attention on those hero shots that are being worked on.
What are the blockers? Are they legal, or are there other technical issues that still keep the industry from embracing it?
Yeah, I think you touched on a couple of things there. The content the models are trained on has to be rights-cleared, with the artists' rights respected, like what we're doing with Getty and other companies that train their own models. They compensate the artists whose content was used to train those models and indemnify the customers who are going to be using them. We're seeing that happen with partners like Getty, Adobe, and others, which removes the concerns that were initially raised when we saw models trained on the open web and other sources without any of that consideration. That caused concern for people, not unlike when synthesizers and sampling hit the music industry in the late '70s and everybody was asking, "Wait, where did that drum beat come from? Where did that guitar riff come from?" Eventually, it worked out so that if somebody uses sampled content, they pay the artist for it. We're going to see similar things happen in the AI space.
As for some of the other aspects you mentioned, I wouldn't label them blockers. I believe one of the challenges will be to exert more control over AI. An example I shared during a talk at Digital Hollywood a few weeks ago highlights this. I said that we will see widespread acceptance of AI when the familiar tools within the creative community are supported by AI. For instance, I might capture footage in 35 millimeters but employ generative fill to transform it into a 70-millimeter film using AI. Such transformations are inevitable. Alternatively, I might adjust the lens or composition in post-production using AI. I believe that as tools like these become more refined and find their way into the hands of toolmakers and those integrating them into workflows, we will see widespread adoption. This evolution is reminiscent of synthesizers and other innovations embraced in the music industry, albeit after a learning phase. I predict that in two to three years, AI will be as commonplace in content creation as a mouse, brush, tablet, or any other tool in the process.
We recently bought a washing machine and a dryer from LG. They have some sort of AI mode: it weighs or senses the load and figures out how to wash it. So I think about AI there. It's everywhere now, in your fridge and your phone.
We have the same LG washer and dryer, plus the refrigerator that goes with them, and they all connect, and the same goes for the TV. So when the wash is done, a TV notification pops up and says your clothes are done. But to your point, one of the benefits is that if you put a big heavy blanket in the washing machine, it figures out the balance: its AI-trained models recognize that the load is going to be imbalanced unless the machine compensates for it. That's a great consumer use case of where AI comes into play.

Another one I like when people ask me about it in that same vein: I don't know if you have one of those robot vacuum cleaners, but we certainly do, and we love them because we have multiple dogs. They go vacuum the house, but one thing people always bring up is that for the first few weeks of owning one, it just bangs around while it tries to map out the entire house. If you look at where we're going with digital twins of buildings and facilities, as we're seeing at the high end with factories, that's going to come down into the consumer space. You'll have a digital twin of your house, and when you buy your robot vacuum cleaner, you'll simply download the digital twin of the house to the vacuum and, by voice, prompt it on what to do, what not to do, and when to do it.
That's using AI, that's using digital twins, and that's something you'll benefit from immediately, right? Because it will simply never bang into things. That example helps me when I talk to people who ask how AI is going to impact them. I can start there and then take it all the way back up to the high end of what we were talking about: de-aging a character in a film, or putting actors in challenging positions on screen without actually doing so, using digital doubles and the like. So there's a wide spectrum of use cases. In much the same way, the shift of content from analog to digital is now just pervasive in everything. You have great high-end cameras in your mobile device. That hasn't replaced the professional photographer; it's enhanced the craft and given more people the power of using AI to defocus, rack focus, remove an object, or do other things in post-processing when they take a picture on their mobile device.
So, you know, because we're still at the early stages and people are still experiencing their aha moments, part of the journey is understanding what this real power and potential will do for them, in everyday life all the way through to the work they do daily, whether you're in content creation, architecture, machinery, what have you. You're going to see AI as part of these workflows, and you're going to welcome it because it'll help you do your job better.
In this new future, where does NVIDIA stand? Where do you see yourself as a company?
We are now a platform, API, and services-based company. Our focus is on making technology accessible to our developers and ISVs, rather than creating end-user products with these technologies. While we may develop some reference examples to showcase the possibilities, our primary customer is the developer. Whether it's a large developer like Adobe, Autodesk, or others, or a new startup company in our Inception program, our role is to be the platform for these developers to realize their dreams. The beauty of our approach is that as we transition all our infrastructure to the cloud, anyone, anywhere can access the power of NVIDIA on any device at any time. This marks a significant shift from the constraints imposed by cost and location that hindered many developers for years. Now, developers worldwide will have unfettered access, and I can't imagine the fascinating creations that will emerge from this newfound accessibility.
We're used to selling into our own backyard of the technology industries, whether it's here or in various locations around the world, but it's usually been limited to these established hubs. That's all going to go away, and it's going to be much more democratized. People in India, Africa, and all parts of the world are going to have access to this, and what will they create? I can't wait to see. It's going to be amazing. We will have removed the barriers.
There are still a lot of non-believers in the community, especially people coming from traditional art, because they're feeling very vulnerable. But speaking of the current professional tools out there, like what Adobe and other companies are doing, what would you recommend having a look at?
The best recommendation I can give people, just like when everything went digital, is to try it, play with it, and have as much fun as you want. You're not going to run out of film; you're not going to run out of pixels. That's what I do with every kind of technology I get my hands on, and it's what I recommend to both my team and everyone else: just start using it. If you want to read the manual, read the manual. If you want to just start poking around, do it. You're not going to harm anything; you're going to learn something, and you might get enlightened about how it could help you with a particular project.

That's how I've been with photography. I've been a photographer all my life, and once it went digital, you had so much more opportunity to play and challenge yourself, whereas with analog you'd think, I'm going to run out of film, so I have to be very careful about this shot because I only have a limited palette, a limited roll. All of that changed when we went digital. With AI, it will go even further: you can say, I want to see this shot if it were shot this way, or maybe in this style, or maybe in that style. More opportunity to play and more opportunity to create give you more opportunity to tell your story.
One last question. What's your camera of choice?
I use a Leica M11 and a Leica SL2, depending on the project. I do a lot of music video and music photography, and for those kinds of shots, I like the automation the SL2 provides. But if I'm doing standard portraits, I just love the manual controls of the M11.