
D-Robotics

The USX51's AI computing flight controller is built around D-Robotics' RDK X5 Module, which provides 10 TOPS of INT8 BPU AI compute and a CPU rated at 21.6 DMIPS/MHz of general-purpose compute. For robot manufacturers and ecosystem developers, D-Robotics offers an intelligent-algorithm package based on TogetheROS.Bot, designed to make it more efficient to integrate and deploy intelligent robot algorithms on the D-Robotics RDK robot operating system.

Algorithms

The following is a summary table of some algorithm examples:

| Classification | Specific Algorithm |
| --- | --- |
| Object detection | FCOS, YOLO, MobileNet_SSD, EfficientNet_Det, YOLO-World, DOSOD |
| Image classification | MobileNetV2 |
| Image segmentation | mobilenet_unet, Ultralytics YOLOv8-Seg, EdgeSAM (segment anything), MobileSAM (segment anything) |
| Human body recognition | Human detection and tracking, hand landmark detection, gesture recognition, face age detection, 106-point face landmark detection, human instance tracking, human detection and tracking (Ultralytics YOLO Pose), hand landmark and gesture recognition (MediaPipe) |
| Spatial perception | Monocular elevation network detection, monocular 3D indoor detection, visual-inertial odometry, stereo depth algorithm, stereo OCC algorithm |
| Intelligent voice | Intelligent voice, SenseVoice |
| Generative large models | llama.cpp, InternVL3, SmolVLM2, Qwen2.5 |
| Map navigation | SLAM-Toolbox mapping, Navigation2 |
| Interactive control | Pose detection, human body tracking, gesture control, voice control, voice tracking |
| Smart terminal | RTSP video smart box, visual and voice box |
| Other algorithms | CLIP text-image feature retrieval, Mono-PwcNet optical flow estimation |

YOLO

This YOLO object detection example takes images as input, performs inference on the BPU, and publishes an algorithm message containing object categories and detection bounding boxes. It currently supports YOLOv2, YOLOv3, Ultralytics YOLOv5, YOLOv5x, Ultralytics YOLOv8, and YOLOv10.

| Model | Platform | Input dimensions | Inference frame rate (fps) |
| --- | --- | --- | --- |
| yolov2 | X5 | 1x608x608x3 | 38.33 |
| yolov3 | X5 | 1x416x416x3 | 31.28 |
| yolov5 | X5 | 1x512x512x3 | 10.37 |
| yolov8n | X5 | 1x3x640x640 | 140.46 |
| yolov10n | X5 | 1x3x640x640 | 36.47 |

Code repository: https://github.com/D-Robotics/hobot_dnn
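
For reference, below is a minimal sketch of a downstream ROS 2 node that consumes these detection messages. It assumes the ai_msgs message definitions shipped with TogetheROS.Bot and a topic name of hobot_dnn_detection; both should be verified against your installed version and launch configuration.

```python
# Sketch of a ROS 2 subscriber for YOLO detection results (assumptions:
# the ai_msgs package from TogetheROS.Bot is installed, and the detection
# node publishes on "hobot_dnn_detection" -- adjust to your launch setup).
import rclpy
from rclpy.node import Node
from ai_msgs.msg import PerceptionTargets


class DetectionListener(Node):
    def __init__(self):
        super().__init__('detection_listener')
        self.subscription = self.create_subscription(
            PerceptionTargets, 'hobot_dnn_detection', self.on_targets, 10)

    def on_targets(self, msg):
        # Each target carries a class label and one or more bounding boxes.
        for target in msg.targets:
            for roi in target.rois:
                rect = roi.rect
                self.get_logger().info(
                    f'{target.type}: x={rect.x_offset} y={rect.y_offset} '
                    f'w={rect.width} h={rect.height}')


def main():
    rclpy.init()
    rclpy.spin(DetectionListener())


if __name__ == '__main__':
    main()
```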

The models are trained on the COCO dataset and support 80 object categories, including people, animals, fruits, and vehicles.

You can also train on custom datasets with the Ultralytics package: https://docs.ultralytics.com/zh/modes/train
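
As a brief sketch, training a model with Ultralytics and exporting it to ONNX (the input format the conversion toolchain consumes) looks like the following; my_dataset.yaml is a placeholder for your own dataset description file.

```python
# Minimal Ultralytics training-and-export sketch. "my_dataset.yaml" is a
# hypothetical placeholder for a custom dataset config (class names, paths).
from ultralytics import YOLO

model = YOLO('yolov8n.pt')           # start from pretrained weights
model.train(data='my_dataset.yaml',  # custom dataset description file
            epochs=100, imgsz=640)
model.export(format='onnx')          # export for the quantization toolchain
```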

Custom model quantization

Currently, most models trained on GPUs are floating-point models, meaning their parameters are stored as float values. Processors built on the D-Robotics BPU architecture use INT8 computational precision (the precision most common among processors in the industry) and can only run fixed-point, quantized models.
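
To illustrate what fixed-point quantization means, the sketch below maps an FP32 tensor onto INT8 with a single symmetric scale. This shows the general idea only; the actual scheme the D-Robotics toolchain applies (per-channel scales, calibration strategy, and so on) is defined in its documentation.

```python
# Illustrative symmetric INT8 quantization of one FP32 tensor (concept only;
# not the D-Robotics toolchain's exact scheme).
import numpy as np

def quantize_int8(x: np.ndarray):
    scale = np.abs(x).max() / 127.0  # map the observed range onto [-127, 127]
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale  # approximate FP32 reconstruction

weights = np.random.randn(64).astype(np.float32)
q, scale = quantize_int8(weights)
error = np.abs(weights - dequantize(q, scale)).max()
print(f'scale={scale:.6f}, max quantization error={error:.6f}')
```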

The D-Robotics algorithm toolchain primarily uses Post-Training Quantization (PTQ). It requires only a batch of calibration data to calibrate the trained floating-point model, directly converting the trained FP32 network into a fixed-point network. No retraining of the original floating-point model is needed; the quantization process is completed by adjusting just a few hyperparameters. The whole flow is simple and fast, and it is widely used in both edge and cloud scenarios.
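
Conceptually, the calibration step amounts to running the calibration batch through the FP32 model, recording each layer's activation range, and deriving an INT8 scale from it. The sketch below illustrates that idea only; the real toolchain performs calibration internally during model conversion, and run_model and the per-layer structure here are hypothetical.

```python
# Conceptual PTQ calibration sketch (hypothetical interface, idea only).
import numpy as np

def calibrate(run_model, calibration_batches):
    """run_model(batch) -> dict mapping layer name -> FP32 activation array."""
    max_abs = {}
    for batch in calibration_batches:
        for name, act in run_model(batch).items():
            peak = float(np.abs(act).max())
            max_abs[name] = max(max_abs.get(name, 0.0), peak)
    # One symmetric INT8 scale per layer, derived purely from calibration data.
    return {name: peak / 127.0 for name, peak in max_abs.items()}
```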

For more information, please refer to the D-Robotics Algorithm Toolchain Development Guide: https://developer.d-robotics.cc/rdk_doc/Advanced_development/toolchain_development/overview

Getting help

Before starting drone AI algorithm development, it is recommended to read the official D-Robotics community documentation in detail: https://developer.d-robotics.cc/rdk_doc/RDK

If you encounter any problems while using the AI algorithm toolchain, you can search for or create relevant question threads on the D-Robotics official forum: https://forum.d-robotics.cc/.