Google DeepMind's robotics team has introduced three AI systems that draw on large language models (LLMs) and other foundation models to advance the development of versatile robots for everyday use. The systems, AutoRT, SARA-RT, and RT-Trajectory, are aimed at improving real-world robot data collection, speed, and generalization, respectively.

In a statement, the Google DeepMind team said, "We're announcing a suite of advances in robotics research that bring us a step closer to this future. AutoRT, SARA-RT, and RT-Trajectory build on our historic Robotics Transformers work to help robots make decisions faster, and better understand and navigate their environments."

AutoRT harnesses large foundation models, which the team considers crucial for creating robots that can understand practical human goals. By gathering more experiential training data, and more diverse data, AutoRT aims to scale robotic learning so that robots are better prepared for real-world scenarios. It combines a large foundation model, such as a Large Language Model (LLM) or a Visual Language Model (VLM), with a robot control model (RT-1 or RT-2) to form a system that deploys robots to collect training data in novel environments.
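The division of labor described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the function names (`describe_scene`, `propose_tasks`, `run_policy`), the `Episode` record, and the canned stub outputs are all hypothetical and are not taken from DeepMind's code or from any published AutoRT interface. The sketch shows a scene description feeding a task proposer, whose tasks are then handed to a control policy that logs one episode per task.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Episode:
    """One recorded trial: the task attempted and the steps that were logged."""
    task: str
    steps: List[str]


def collect_episodes(
    describe_scene: Callable[[], str],          # hypothetical stand-in for a VLM
    propose_tasks: Callable[[str], List[str]],  # hypothetical stand-in for an LLM
    run_policy: Callable[[str], List[str]],     # hypothetical stand-in for RT-1 or RT-2
    max_tasks: int = 3,
) -> List[Episode]:
    """Describe the scene, ask for candidate tasks, then run and log each one."""
    scene = describe_scene()
    tasks = propose_tasks(scene)[:max_tasks]
    return [Episode(task=task, steps=run_policy(task)) for task in tasks]


# Canned stubs so the sketch runs without any model or robot attached.
def fake_vlm() -> str:
    return "a desk with a sponge and a cup"


def fake_llm(scene: str) -> List[str]:
    return [f"pick up the {obj}" for obj in ("sponge", "cup") if obj in scene]


def fake_policy(task: str) -> List[str]:
    return [f"move towards target for: {task}", "close gripper"]


episodes = collect_episodes(fake_vlm, fake_llm, fake_policy)
for episode in episodes:
    print(episode.task, len(episode.steps))
```

Passing the three components in as plain callables keeps the loop independent of any particular model, which is the only design point the sketch is meant to convey.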

The Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT) system converts Robotics Transformer (RT) models into more efficient versions. According to the DeepMind team, the best SARA-RT-2 models were 10.6% more accurate and 14% faster than RT-2 models. The team believes this is the first scalable attention mechanism to deliver computational improvements without sacrificing quality.

Additionally, the RT-Trajectory model automatically adds visual outlines that describe robot motions in training videos. By overlaying a 2D trajectory sketch on each video in the training dataset, RT-Trajectory gives the model low-level visual hints as it learns robot control policies. In testing on 41 previously unseen tasks, an arm controlled by RT-Trajectory more than doubled the performance of existing state-of-the-art RT models, achieving a task success rate of 63% compared to 29% for RT-2.
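To make the idea of overlaying a 2D trajectory sketch concrete, here is a minimal Python sketch. It is not RT-Trajectory's actual rendering code, which the announcement does not describe; the frame representation (a grid of RGB tuples), the function names, and the use of Bresenham's line algorithm to rasterize each segment are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]  # indexed as frame[row][col]
Point = Tuple[int, int]    # (x, y), i.e. (col, row)


def line_pixels(p0: Point, p1: Point) -> List[Point]:
    """Bresenham's line algorithm: integer pixels from p0 to p1, inclusive."""
    x0, y0 = p0
    x1, y1 = p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return points


def overlay_trajectory(
    frame: Frame, waypoints: List[Point], color: Pixel = (255, 0, 0)
) -> Frame:
    """Return a copy of the frame with the waypoint polyline drawn on top."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]  # leave the input frame untouched
    for start, end in zip(waypoints, waypoints[1:]):
        for x, y in line_pixels(start, end):
            if 0 <= x < width and 0 <= y < height:  # skip off-frame pixels
                out[y][x] = color
    return out


# A 5x5 black frame with an L-shaped path: right along the top, then down.
frame = [[(0, 0, 0)] * 5 for _ in range(5)]
annotated = overlay_trajectory(frame, [(0, 0), (4, 0), (4, 4)])
marked = sum(pixel == (255, 0, 0) for row in annotated for pixel in row)
print(marked)
```

In this toy setup the same overlay would be applied to every frame of a training video, so the policy sees the intended path drawn directly in its image input.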

The team emphasized RT-Trajectory's adaptability: it can create trajectories by watching human demonstrations or by accepting hand-drawn sketches, and it can be readily adapted to different robot platforms.

For AutoRT, real-world evaluations conducted over seven months showed that the system could safely coordinate up to 20 robots simultaneously in a variety of office buildings, gathering a diverse dataset of 77,000 robotic trials across 6,650 unique tasks.

Taken together, the three systems are presented as a step toward robots that can operate reliably in complex, everyday environments.

(With input from IANS)