Shapefin

Robbyant Open-Sources LingBot-Depth Spatial Perception Model, Partners with Orbbec for Enhanced Robotic Vision

Robbyant, an embodied AI company within Ant Group, has open-sourced LingBot-Depth, a high-precision spatial perception model designed to improve robots’ depth sensing and 3D environmental understanding. Alongside the release, the company announced a strategic partnership with Orbbec to integrate the model into Orbbec’s next-generation depth cameras.

The LingBot-Depth model addresses a long-standing challenge in depth sensing systems: accurately capturing data from transparent or highly reflective surfaces such as glass, mirrors, and polished metal. These surfaces often lead to missing depth information or significant noise, which can cause robots to misidentify objects or miscalculate distances, posing operational and safety risks. Robbyant tackled this issue through its Masked Depth Modeling (MDM) approach.

MDM enables LingBot-Depth to analyze RGB image features, including texture, object contours, and scene context, when depth inputs are incomplete or corrupted. This analysis allows the model to infer and reconstruct missing regions, resulting in denser, more accurate, and sharper 3D depth maps. The model’s performance improvements are achieved without requiring any changes to current sensor form factors, highlighting its compatibility with existing hardware.
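The core idea behind masked depth modeling can be sketched as follows: randomly hide patches of a depth map during training (mimicking the dropouts caused by glass or mirrors) and supervise the model's reconstruction only on the hidden regions. The function names, patch size, and masking strategy below are illustrative assumptions, not Robbyant's published implementation:

```python
import numpy as np

def mask_depth_patches(depth, patch=16, mask_ratio=0.5, rng=None):
    """Randomly zero out square patches of a depth map, simulating the
    missing readings that transparent or reflective surfaces produce.

    Returns the corrupted map and a boolean mask of the hidden pixels.
    (Illustrative sketch; LingBot-Depth's actual masking scheme is not
    described in detail in the announcement.)
    """
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = depth.shape
    gh, gw = h // patch, w // patch
    n_mask = int(gh * gw * mask_ratio)
    idx = rng.choice(gh * gw, size=n_mask, replace=False)
    grid = np.zeros((gh, gw), dtype=bool)
    grid.flat[idx] = True
    # Expand the patch-level mask to pixel resolution.
    mask = np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)
    corrupted = depth.copy()
    corrupted[mask] = 0.0  # missing depth, as from glass or polished metal
    return corrupted, mask

def mdm_loss(pred, target, mask):
    """L1 reconstruction loss computed only on the masked (hidden) pixels,
    so the model is rewarded for inferring depth it could not observe."""
    return np.abs(pred[mask] - target[mask]).mean()
```

During training, the model would receive the corrupted depth alongside the RGB image and be optimized to minimize `mdm_loss` on the hidden regions, which is what forces it to exploit texture, contours, and scene context.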

In benchmark evaluations, LingBot-Depth demonstrated superior performance compared to models like PromptDA and PriorDA on NYUv2 and ETH3D datasets. It achieved a relative error (REL) reduction of over 70% in indoor scenes and reduced the Root Mean Square Error (RMSE) by approximately 47% on the challenging sparse Structure-from-Motion (SfM) task.
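The two metrics quoted above are standard in depth-estimation benchmarks: REL is the mean absolute relative error and RMSE the root mean square error, both computed over pixels with valid ground truth. A minimal sketch (the helper name and validity threshold are illustrative):

```python
import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """Compute REL and RMSE between predicted and ground-truth depth.

    REL  = mean(|pred - gt| / gt)        -- mean absolute relative error
    RMSE = sqrt(mean((pred - gt) ** 2))  -- root mean square error

    Only pixels with positive ground-truth depth are evaluated, since
    zero entries typically mark invalid or missing measurements.
    """
    valid = gt > eps
    p, g = pred[valid], gt[valid]
    rel = np.mean(np.abs(p - g) / g)
    rmse = np.sqrt(np.mean((p - g) ** 2))
    return rel, rmse
```

For example, a prediction that is uniformly 10% too deep yields REL = 0.10, while RMSE scales with the absolute depth values, which is why the two numbers can move differently across indoor scenes and sparse SfM inputs.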

The development of LingBot-Depth was closely supported by Orbbec, a leading provider of robotic and AI vision technology, which contributed hardware resources and technical expertise. The model underwent co-optimization on Orbbec’s platforms, with its performance validated by Orbbec’s Depth Vision Laboratory. Specifically, LingBot-Depth leverages chip-level raw depth data from Orbbec’s Gemini 330 stereo 3D cameras to intelligently reconstruct missing information, enhancing robots’ perception robustness and task success rates in complex optical environments. The Gemini 330, featuring Orbbec’s proprietary MX6800 depth engine chip, combines active and passive imaging for reliable 3D data across diverse lighting conditions.

To train LingBot-Depth, Robbyant compiled a dataset of approximately 10 million raw samples and curated 2 million high-quality RGB-depth pairs, optimized for extreme and ambiguous conditions. Robbyant plans to open-source this dataset in the near future to encourage further innovation in spatial perception.

Zhu Xing, Chief Executive Officer of Robbyant, commented on the initiative, stating, “Reliable 3D vision is critical to the advancement of embodied AI. By open-sourcing LingBot-Depth and collaborating with hardware pioneers like Orbbec, we aim to lower the barrier to advanced spatial perception and accelerate the adoption of embodied intelligence across homes, factories, warehouses, and beyond.”

Len Zhong, Head of Product Management at Orbbec, added, “Robbyant’s work in spatial intelligence models and algorithms complements Orbbec’s expertise in 3D vision chips and robotic vision systems. In this collaboration, the chip-level depth data provided by the Gemini 330 delivers a stable, high-fidelity, and physically grounded data foundation for the LingBot-Depth model. It’s a great example that demonstrates close coupling between a robot’s sensing hardware and its perception intelligence.”

Robbyant intends to broaden its reach by sharing its spatial perception capabilities with a wider ecosystem of hardware partners, aiming to facilitate the real-world deployment of intelligent robots in dynamic and complex settings.