Nvidia shows off Project GR00T, a multimodal AI to power humanoids of the future


Nvidia is raising the bar in robotics with the introduction of Project GR00T, a multimodal foundation model designed to power the humanoid robots of the future.

Demonstrated today during the GTC conference at the San Jose McEnery Convention Center, Project GR00T taps a general-purpose foundation model that enables humanoid robots to take text, speech, videos or even live demonstrations as input and process them to take general actions. It has been developed with the help of Nvidia’s Isaac Robotics Platform tools, including a new Isaac Lab for reinforcement learning.

“Building foundation models for general humanoid robots is one of the most exciting problems to solve in AI today,” Nvidia CEO Jensen Huang said in a statement. “The enabling technologies are coming together for leading roboticists around the world to take giant leaps toward artificial general robotics.”

To help enterprises run GR00T, the company has announced a dedicated Jetson Thor computing platform for humanoids. It has also shared notable advancements for building AI-powered industrial manipulation arms as well as robots capable of navigating unstructured environments.

What to expect from Nvidia Project GR00T?

While the name looks similar to Marvel’s Groot, it actually stands for Generalist Robot 00 Technology. According to Nvidia, it has been designed to understand natural language text, speech, video and live demonstrations to emulate human movements — coordination, dexterity and other skills — and produce general actions to navigate, adapt and interact with the real world. 

This will not only enhance the capabilities of humanoid robots but also make them easier to develop and deploy. Essentially, with text and demonstrations as inputs, the robots can be programmed by any person with the relevant access.

In his GTC keynote, Huang demonstrated multiple GR00T-powered humanoid robots, including those from Agility Robotics, Apptronik, Fourier Intelligence and Unitree Robotics, completing a variety of tasks. Deepu Talla, who briefed journalists on GR00T, noted that the project leverages the latest and greatest work in generative AI and transformers, but he did not share much on the full range of its capabilities.

Notably, OpenAI, one of the most prominent names in the generative AI space, is also working on embodied AI and has backed two startups in the domain: 1X Technologies and Figure. Just recently, Figure released a video that showed one of its robots handling routine chores, such as picking up garbage, with the help of a large vision-language model (VLM) trained by the Sam Altman-led research lab. Nvidia has confirmed that both startups are also working with it.

Image: Project GR00T, a general-purpose multimodal foundation model for humanoids, acts as the mind of robots, making them capable of learning skills to solve a variety of helpful tasks.

When contacted by VentureBeat, Talla said the company cannot share additional details about the internal architecture but will have more to share on the capabilities side in the future. He also noted that only select humanoid developers, including those mentioned above, have early access to the model at present, but the company plans to expand its availability to more humanoids and other embodiments soon.

To make sure humanoid robots can run complex multimodal models like GR00T, Nvidia has also launched the Jetson Thor computing platform for humanoids. Based on the company’s Thor SoC, the computer includes a high-performance CPU cluster and a next-generation GPU based on the Nvidia Blackwell architecture, with a transformer engine delivering 800 teraflops of 8-bit floating point AI performance.

Talla said in the briefing that the system’s GPU performance is eight times that of its predecessor, Jetson Orin, while CPU performance is 2.6 times better.

To bring Project GR00T to life, Nvidia tapped its own Isaac Robotics Platform, which gives developers a powerful, end-to-end platform for the development, simulation and deployment of AI-powered robots. 

Specifically, the company said it leveraged its all-new Isaac Lab, based on Isaac Sim, to train and test the model through parallel simulations in a GPU-accelerated virtual environment. It also used the OSMO compute orchestration service to concurrently manage the training and simulation workloads on Nvidia DGX and Nvidia OVX.

In addition to these capabilities, the Isaac Robotics Platform is getting two use-case-specific offerings: Isaac Manipulator and Isaac Perceptor.

Isaac Manipulator, as Talla explained, offers GPU-accelerated libraries and dedicated foundation models to help robotic arm manufacturers improve their products with state-of-the-art motion and dexterity. It includes models targeted at detecting objects, estimating their 6D pose, tracking them and even making dense predictions to grasp them.
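For readers unfamiliar with the term, a 6D pose combines an object's 3D position with its 3D orientation, six degrees of freedom in total. The snippet below is a minimal Python sketch of that idea and nothing more: the `Pose6D` class, its field names and the roll-pitch-yaw convention are assumptions made for illustration, not part of Isaac Manipulator or any Nvidia API.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6D:
    """Hypothetical container: 3 translation + 3 rotation degrees of freedom."""
    x: float
    y: float
    z: float
    roll: float   # rotation about the x axis, in radians
    pitch: float  # rotation about the y axis, in radians
    yaw: float    # rotation about the z axis, in radians

    def rotation_matrix(self):
        """Return the 3x3 matrix R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
            [-sp, cp * sr, cp * cr],
        ]

    def transform(self, point):
        """Map a point from the object's own frame into the observer's frame."""
        r = self.rotation_matrix()
        t = (self.x, self.y, self.z)
        return tuple(
            sum(r[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
        )


# Assumed example: an object 0.5 m ahead and 0.2 m up, turned 90 degrees about z.
pose = Pose6D(x=0.5, y=0.0, z=0.2, roll=0.0, pitch=0.0, yaw=math.pi / 2)
# A grasp point 0.1 m along the object's own x axis, expressed in the observer's frame.
grasp_point = pose.transform((0.1, 0.0, 0.0))
```

With these made-up numbers, the grasp point lands at roughly (0.5, 0.1, 0.2): the object's rotation swings its local x axis onto the observer's y axis, which is why a robotic arm needs orientation as well as position before it can plan a grasp.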

Image: Autonomous mobile robots (AMRs) and a manipulator working together to enable AI-based automation in a warehouse, powered by NVIDIA Isaac. This automation supports various tasks related to manufacturing and logistics applications.

Isaac Perceptor, on the other hand, guides robots through unstructured environments with multi-camera, 360-degree vision capabilities, delivered via AI-based accelerated algorithms for 3D perception and surround vision. Nvidia is offering the technology through its Nova Orin DevKit and is already working with multiple partners, including ArcBest, BYD and KION Group, to help them advance their autonomous mobile robot functions in manufacturing and fulfillment.

“Using the Isaac Perceptor platform in our Vaux Smart Autonomy AMR forklifts and reach trucks enables better perception, semantic-aware navigation and 3D mapping for obstacle detection in material handling processes across warehouses, distribution centers and manufacturing facilities,” Michael Newcity, chief innovation officer at ArcBest and president of ArcBest Technologies, said in a statement.

The new Isaac platform capabilities are expected to be available in the second quarter of this year, while Project GR00T remains in early access. Nvidia is accepting applications to give more humanoid developers access to the technology, but the timeline of broader public release remains unclear at this stage.


