
Voyager: GPT-4-Powered Lifelong Learning Agent for Minecraft

"It continuously improves itself by writing, refining, committing, and retrieving code from a skill library."

Researchers from NVIDIA presented Voyager, the first LLM-powered embodied lifelong learning agent that plays Minecraft purely through in-context learning, with no model fine-tuning. It "continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention."

Voyager consists of three key components:

  1. an automatic curriculum that maximizes exploration,
  2. an ever-growing skill library of executable code for storing and retrieving complex behaviors,
  3. a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement.
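The third component can be pictured as a retry loop. The sketch below is a minimal illustration, not Voyager's actual code: `refine_until_verified`, `Attempt`, and the `generate`, `execute`, and `verify` callables are hypothetical names. In a real agent, `generate` would presumably wrap an LLM query and `execute` would run the program in the game.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    """Outcome of running one candidate program (hypothetical structure)."""
    feedback: str          # environment feedback, e.g. in-game log messages
    error: Optional[str]   # execution error text, or None if the code ran


def refine_until_verified(
    task: str,
    generate: Callable[[str, str], str],     # (task, critique) -> program text
    execute: Callable[[str], Attempt],       # program text -> Attempt
    verify: Callable[[str, Attempt], bool],  # (task, attempt) -> is the task done?
    max_rounds: int = 4,
) -> Optional[str]:
    """Return the first program that runs cleanly and passes verification."""
    critique = ""  # empty on the first round: nothing to correct yet
    for _ in range(max_rounds):
        program = generate(task, critique)
        attempt = execute(program)
        if attempt.error is None and verify(task, attempt):
            return program  # success: the caller can commit this to a skill library
        # Feed both signals back into the next generation round.
        critique = f"feedback: {attempt.feedback}\nerror: {attempt.error}"
    return None  # give up after max_rounds; the caller may pick another task
```

The loop stops as soon as a program both executes without error and passes the verification check, which is why a program that merely runs is not treated as a learned skill.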

"Voyager interacts with GPT-4 via blackbox queries, which bypasses the need for model parameter fine-tuning. ... Empirically, Voyager shows strong in-context lifelong learning capability and exhibits exceptional proficiency in playing Minecraft. It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks key tech tree milestones up to 15.3x faster than prior SOTA. Voyager is able to utilize the learned skill library in a new Minecraft world to solve novel tasks from scratch, while other techniques struggle to generalize."

AI scientist Jim Fan explained how the AI works:

"First, Voyager attempts to write a program to achieve a particular goal, using a popular Javascript Minecraft API (Mineflayer). The program is likely incorrect at the first try. The game environment feedback and javascript execution error (if any) help GPT-4 refine the program.

Second, Voyager incrementally builds a skill library by storing the successful programs in a vector DB. Each program can be retrieved by the embedding of its docstring. Complex skills are synthesized by composing simpler skills, which compounds Voyager’s capabilities over time.

Third, an automatic curriculum proposes suitable exploration tasks based on the agent’s current skill level & world state, e.g. learn to harvest sand & cactus before iron if it finds itself in a desert rather than a forest.

Putting these all together, here’s the full data flow design that drives lifelong learning in a vast 3D voxel world without any human intervention."
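The skill-library step described above (store successful programs, retrieve them by the embedding of their docstring) can be sketched in a few lines. This is a simplified stand-in under stated assumptions: `SkillLibrary`, `embed`, and `cosine` are hypothetical names, the bag-of-words `embed` replaces a learned embedding model, and a plain list replaces the vector DB. The stored program strings are placeholders, not real Mineflayer code.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: a real system uses a learned model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SkillLibrary:
    """In-memory stand-in for a vector DB keyed by docstring embeddings."""

    def __init__(self) -> None:
        self._skills: List[Tuple[str, Counter, str]] = []

    def add(self, name: str, docstring: str, program: str) -> None:
        # Only the docstring is embedded; the program text is stored as-is.
        self._skills.append((name, embed(docstring), program))

    def retrieve(self, query: str, k: int = 1) -> List[str]:
        """Return the names of the k skills whose docstrings best match the query."""
        q = embed(query)
        ranked = sorted(self._skills, key=lambda s: cosine(q, s[1]), reverse=True)
        return [name for name, _, _ in ranked[:k]]


library = SkillLibrary()
library.add("mineWood", "Mine wood logs from a nearby tree",
            "async function mineWood(bot) { /* placeholder */ }")
library.add("craftPickaxe", "Craft a wooden pickaxe from planks and sticks",
            "async function craftPickaxe(bot) { /* placeholder */ }")
library.add("smeltIron", "Smelt iron ore in a furnace",
            "async function smeltIron(bot) { /* placeholder */ }")

best_match = library.retrieve("craft a pickaxe")
```

Retrieved skills would then be handed back to the model as context for the next task, which is how simpler skills can be composed into more complex ones.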

The project is available as open source.
