A team of researchers led by the University of California San Diego has achieved a significant breakthrough in training four-legged robots to enhance their 3D vision. This advancement has enabled a robot to navigate challenging terrains effortlessly, including stairs, rocky surfaces, and paths with gaps, while effectively overcoming obstacles.
The researchers will be showcasing their work at the upcoming 2023 Conference on Computer Vision and Pattern Recognition (CVPR) in Vancouver, Canada, scheduled from June 18 to 22.
According to Xiaolong Wang, the senior author of the study and a professor of electrical and computer engineering at the UC San Diego Jacobs School of Engineering, this new model equips the robot with a better understanding of its surroundings in 3D, enabling it to operate in more complex real-world environments.
The robot is equipped with a forward-facing depth camera positioned on its head, which is angled to provide a comprehensive view of both the scene ahead and the terrain beneath it.
To enhance the robot’s 3D perception, the researchers developed a model that translates 2D images captured by the camera into 3D space. This is accomplished by analyzing a short video sequence comprising the current frame and a few previous frames. The model extracts relevant 3D information from each 2D frame, including details about the robot’s leg movements such as joint angles, joint velocities, and distance from the ground. By comparing this information between the previous frames and the current frame, the model estimates the 3D transformation between the past and the present.
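The first step described above, lifting a 2D depth image into 3D space, can be sketched with the standard pinhole-camera unprojection. This is a minimal illustration, not the authors' model: the function name and the camera intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical placeholders.

```python
import numpy as np

def unproject_depth(depth, fx, fy, cx, cy):
    """Lift a 2D depth image into a 3D point cloud in the camera frame.

    Each pixel (u, v) with depth d maps to the 3D point
    (x, y, z) = ((u - cx) * d / fx, (v - cy) * d / fy, d).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)  # shape (h, w, 3)

# Toy example: a flat 4x4 depth image 2 m away, made-up intrinsics.
depth = np.full((4, 4), 2.0)
points = unproject_depth(depth, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(points.shape)  # (4, 4, 3)
```

Repeating this over a few consecutive frames yields the per-frame 3D information the model then aligns to estimate the transformation between past and present.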
By leveraging these techniques, the researchers have significantly improved the robot’s ability to perceive and understand its environment in 3D, allowing it to autonomously traverse challenging landscapes while effectively navigating obstacles along the way.
At the core of the advance is a model that synthesizes the previous frames from the current frame captured by the forward-facing depth camera on the robot’s head. By comparing these synthesized frames with the frames the camera actually recorded, the model refines its understanding of the 3D scene and arrives at an accurate representation.
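The compare-and-refine idea can be illustrated with a simple self-supervised objective: transform 3D points by an estimated camera motion, render or sample a synthesized frame, and penalize its per-pixel difference from the recorded frame. This is a hedged sketch of that training signal only; the function names and the use of a plain L1 error are illustrative assumptions, not the paper's loss.

```python
import numpy as np

def apply_se3(points, R, t):
    """Transform camera-frame 3D points by a rigid motion (R, t)."""
    return points @ R.T + t

def synthesis_loss(synth_frame, actual_frame):
    """Mean absolute per-pixel error between a synthesized frame and
    the frame the camera actually recorded; minimizing this error is
    what refines the 3D representation."""
    return float(np.abs(synth_frame - actual_frame).mean())

# Sanity check: the identity motion leaves points unchanged, so a frame
# "synthesized" from it matches the original and gives zero loss.
pts = np.random.rand(10, 3)
same = apply_se3(pts, np.eye(3), np.zeros(3))
print(synthesis_loss(same, pts))  # 0.0
```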
This 3D representation plays a crucial role in controlling the robot’s movements. By leveraging the synthesized visual information, the robot can retain a short-term memory of its surroundings and past leg movements. This memory informs its subsequent actions and enables the robot to navigate challenging terrains while recalling what it has observed.
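A short-term memory of this kind is often implemented as a rolling buffer of recent observations that the control policy conditions on. The sketch below shows one minimal way to do that; the class, the buffer horizon, and the zero-padding scheme are assumptions for illustration, not the authors' architecture.

```python
from collections import deque
import numpy as np

class ShortTermMemory:
    """Rolling buffer of the last few observation features.

    The concatenated history is the fixed-size vector a locomotion
    policy could condition on, letting it recall recent frames and
    leg movements.
    """

    def __init__(self, horizon, feat_dim):
        self.horizon = horizon
        self.feat_dim = feat_dim
        self.buf = deque(maxlen=horizon)  # old entries drop off automatically

    def observe(self, feat):
        self.buf.append(np.asarray(feat, dtype=np.float32))

    def state(self):
        # Zero-pad until the buffer fills so the policy input has a
        # fixed size from the very first step.
        pad = [np.zeros(self.feat_dim, np.float32)] * (self.horizon - len(self.buf))
        return np.concatenate(pad + list(self.buf))

mem = ShortTermMemory(horizon=3, feat_dim=2)
mem.observe([1.0, 2.0])
print(mem.state())  # [0. 0. 0. 0. 1. 2.]
```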
According to Wang, the study’s senior author, this approach gives the robot a more comprehensive understanding of its 3D environment, leading to improved performance.
The researchers’ work builds upon their prior efforts, where they combined computer vision with proprioception (the sense of movement, direction, speed, location, and touch) to enable a four-legged robot to navigate uneven terrain and avoid obstacles. The latest advancement lies in enhancing the robot’s 3D perception, which, when combined with proprioception, allows the robot to conquer more challenging landscapes.
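In practice, combining vision with proprioception often comes down to fusing a visual embedding with joint readings into a single policy observation. The sketch below shows the simplest such fusion, plain concatenation; the function name and the dimensions (a 4-dimensional visual feature, 12 joints for a quadruped with 3 joints per leg) are illustrative assumptions.

```python
import numpy as np

def fuse_observation(visual_feat, joint_angles, joint_velocities):
    """Build one policy observation: a 3D-aware visual embedding
    concatenated with proprioceptive readings (joint angles and
    joint velocities)."""
    return np.concatenate([visual_feat, joint_angles, joint_velocities])

# Hypothetical sizes: 4-dim visual feature, 12 joint angles, 12 velocities.
obs = fuse_observation(np.ones(4), np.zeros(12), np.zeros(12))
print(obs.shape)  # (28,)
```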
The researchers note that the model cannot yet guide the robot toward a specific goal or destination. For now, the robot follows a straight path and, when it encounters an obstacle, sidesteps onto another straight path. Future work aims to integrate planning techniques and complete the navigation pipeline.
The paper’s co-authors include Ruihan Yang from UC San Diego and Ge Yang from the Massachusetts Institute of Technology.