LingBot-World - An open-source interactive world model from Ant Lingbo Technology

LingBot-World is an open-source interactive world model from Ant Lingbo Technology. Through a scalable data engine, the model learns physical laws and causal relationships from large-scale game environments, achieving accurate action-driven generation. It supports nearly 10 minutes of continuous, stable generation at 16 FPS with end-to-end latency under 1 second, and exhibits zero-shot scene generalization. By addressing the scarcity and high cost of real-world training data, it can be widely applied to robot training, autonomous driving simulation, and game development, allowing intelligent agents to learn safely and efficiently through trial and error in virtual environments.
Main functions of LingBot-World

  • High-fidelity interactive generation: Supports action-driven, fine-grained generation that responds accurately to user instructions and renders physically realistic dynamic scenes.
  • Long-term consistency: Sustains continuous, stable generation for nearly 10 minutes, maintaining object permanence and scene-structure integrity to avoid "long-horizon drift."
  • Real-time closed-loop control: Achieves 16 FPS generation throughput with end-to-end latency under 1 second, supporting real-time keyboard-and-mouse control of characters and viewpoints.
  • World event triggers: Text commands can dynamically adjust environmental conditions such as weather and style while keeping geometric relationships consistent.
  • Zero-shot generalization: A single input image yields an interactive video stream, with no scene-specific training required.
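The real-time closed-loop control described above can be pictured as a simple generate-and-pace loop. The sketch below is illustrative only: `StubWorldModel`, `Action`, and `run_session` are hypothetical names, and the "frame" is just a counter standing in for an image tensor. The pacing logic, however, shows how a client would hold output at the stated 16 FPS budget.

```python
import time
from dataclasses import dataclass

TARGET_FPS = 16
FRAME_BUDGET = 1.0 / TARGET_FPS  # ~62.5 ms per frame

@dataclass
class Action:
    """A single control input (hypothetical schema): held keys plus mouse deltas."""
    keys: frozenset
    mouse_dx: float = 0.0
    mouse_dy: float = 0.0

class StubWorldModel:
    """Placeholder for the real model: maps (previous frame, action) -> next frame.
    Here a 'frame' is just an integer counter so the loop is runnable."""
    def step(self, frame, action):
        return frame + 1

def run_session(model, n_steps, read_action):
    """Closed-loop control: read input, generate the next frame, pace to 16 FPS."""
    frame, produced = 0, []
    for _ in range(n_steps):
        start = time.monotonic()
        frame = model.step(frame, read_action())
        produced.append(frame)
        # Sleep off any leftover budget so the stream stays at ~16 FPS.
        leftover = FRAME_BUDGET - (time.monotonic() - start)
        if leftover > 0:
            time.sleep(leftover)
    return produced

frames = run_session(StubWorldModel(), 8, lambda: Action(keys=frozenset({"W"})))
print(frames)  # eight consecutive frames
```

In a real deployment the model's `step` call must itself finish inside the ~62.5 ms budget; the sleep only smooths out steps that finish early.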

LingBot-World’s technical principles

  • Scalable data engine: Combines online-video cleaning with an Unreal Engine synthesis pipeline to extract UI-free images directly from the rendering layer while simultaneously recording control inputs and camera poses, providing precisely aligned training signals for learning how actions change the environment.
  • Multi-stage training strategy: Phased optimization and parallelized acceleration strengthen the model's contextual memory, enabling nearly 10 minutes of continuous, stable generation while preserving object permanence and scene-structure integrity.
  • Causal distillation: Compresses physical laws and causal logic into the model so it deeply understands the causal link between actions and outcomes while retaining 16 FPS real-time inference.
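The "precisely aligned training signals" from the data engine amount to per-tick records pairing a rendered frame with the control input and camera pose captured at the same instant. The field names below are an assumption for illustration, not the project's actual schema; the point is that frame, action, and pose share one tick index.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class CameraPose:
    position: tuple  # (x, y, z) in world units
    rotation: tuple  # (pitch, yaw, roll) in degrees

@dataclass
class TrainingRecord:
    """One aligned sample: the rendered frame plus the control signal
    and camera pose captured at the same simulation tick."""
    frame_path: str
    tick: int
    action: dict  # e.g. {"keys": ["W"], "mouse_dx": 0.0, "mouse_dy": 0.0}
    camera: CameraPose

records = [
    TrainingRecord(
        frame_path="frames/000001.png",
        tick=1,
        action={"keys": ["W"], "mouse_dx": 0.0, "mouse_dy": 0.0},
        camera=CameraPose(position=(0.0, 1.6, 0.0), rotation=(0.0, 90.0, 0.0)),
    ),
]

# Serialize to JSON Lines so a training job can stream (frame, action) pairs.
lines = [json.dumps(asdict(r)) for r in records]
print(lines[0])
```

Storing one record per tick in an append-only JSON Lines file keeps the capture pipeline simple and lets the trainer shuffle or window samples without loading everything at once.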

LingBot-World project address

Application scenarios of LingBot-World

  • Embodied intelligence training: Provides robots with a low-cost, high-fidelity virtual proving ground that supports trial-and-error learning of complex long-horizon tasks, addressing the high cost and risk of real-world data collection.
  • Autonomous driving simulation: Dynamic variation in lighting, weather, and other conditions improves model generalization while reducing the cost and safety risks of on-road testing.
  • Game development: Serves as a playable real-time simulator that lets developers quickly generate interactive content, including dynamic world events and stylized rendering.
  • VR/AR simulation: Provides a low-latency, high-fidelity immersive environment for virtual training, digital twins, and human-computer interaction research.