Champion-level Drone Racing using Deep Reinforcement Learning (Nature, 2023)
May 04, 2024

Drone racing is an exciting, fast-paced sport in which pilots control small drones using a live video feed from the drone's onboard camera. The goal is to navigate a predefined course, passing through all the gates in the correct sequence faster than anyone else. Competition is a test of skill, strategy, and quick thinking, as pilots must maneuver their drones through a challenging race track at high speed.

In this work we present Swift, the first autonomous system that achieves champion-level performance in drone racing. Swift won multiple races against champions from three different drone racing leagues and achieved the fastest race time on record, relying only on an onboard computer, a single camera, and an inertial sensor.

Swift uses visual-inertial odometry (VIO) to estimate its own position, velocity, and orientation. While such an approach produces accurate estimates when operating in near-hover conditions, these estimates degrade substantially at the speeds encountered during race runs. To improve the quality of the estimates in such agile regimes, Swift complements the VIO pipeline with a learning-based detection system that segments racing gates in images captured by the onboard camera. Given the position of a gate in the world frame, Swift uses the Perspective-n-Point (PnP) algorithm to triangulate its own location. A Kalman filter then combines the VIO pipeline's estimates with the pose estimates obtained from the detected gates. The uncertainty of the PnP estimate decreases quadratically with the distance to a detected gate; as a result, the variance of the Kalman filter's pose estimate also decreases when approaching a gate.
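The fusion step described above can be sketched in one dimension. This is an illustrative toy, not the paper's actual filter: the VIO position estimate is given a fixed variance, and the PnP gate measurement is assumed (per the quadratic relationship mentioned above) to have variance proportional to the squared distance to the gate, so the fused variance shrinks as the drone approaches.

```python
# Toy 1-D sketch of the VIO + PnP fusion (illustrative; variances and the
# PNP_COEFF model are assumptions, not values from the paper).

def kalman_update(mu_prior, var_prior, z, var_z):
    # standard scalar Kalman measurement update
    k = var_prior / (var_prior + var_z)   # Kalman gain
    return mu_prior + k * (z - mu_prior), (1.0 - k) * var_prior

VAR_VIO = 0.25       # m^2: variance of the (drifting) VIO position estimate
PNP_COEFF = 0.01     # assumed model: var_pnp = PNP_COEFF * distance**2

fused_vars = []
for distance in (10.0, 5.0, 1.0):         # drone approaching a gate
    var_pnp = PNP_COEFF * distance ** 2
    mu, var = kalman_update(mu_prior=9.8, var_prior=VAR_VIO,
                            z=10.0, var_z=var_pnp)
    fused_vars.append(var)

# the fused variance shrinks monotonically as the gate gets closer
print(fused_vars[0] > fused_vars[1] > fused_vars[2])
```

The Kalman gain automatically weights whichever source is currently more certain, which is why the PnP measurements dominate near gates while VIO carries the estimate between them.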
Racing a drone at high speed through a complex track requires fast and precise control actions. This is especially challenging when using a camera for state estimation, which produces noisy estimates of the drone's state. To overcome this challenge, Swift uses a learning-based control policy that directly maps noisy state estimates to control commands, enabling fast and accurate steering of the drone.

The control policy is represented by a neural network and is trained by deep reinforcement learning in simulation. During training, the control policy explores the environment using 100 agents in parallel and finds increasingly faster paths through the track layout. The entire training process takes less than an hour on a standard desktop workstation.

To prepare the control policy for the noisy and imperfect state estimates available in the real system, Swift uses flight data collected in the real world to identify residual models for perception and dynamics. These residual models capture effects that are difficult to model in simulation, such as degradation of the vision-based estimator or turbulent aerodynamic effects. Swift fine-tunes the control policy in this augmented simulation for real-world deployment, resulting in a highly competitive policy that can cope with the uncertainty of the real system.

We compared Swift's performance against some of the best human pilots in the world. Swift raced against Alex Vanover, the 2019 Drone Racing League world champion; Thomas Bitmatta, the two-time MultiGP International Open World Cup champion; and Marvin Schaepper, the three-time Swiss national champion. Comparing the trajectories flown by the humans and by Swift, we observed that the autonomous system is more consistent in turns and is capable of taking tighter turns, which gives it a decisive advantage in a race.
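The parallel-agent training idea can be illustrated with a deliberately tiny example. This is not the paper's actual setup (which trains a neural-network policy with a modern policy-gradient method in a full drone simulator); here a hypothetical one-parameter Gaussian policy is improved with a baseline-subtracted REINFORCE gradient estimated from 100 simulated agents per step.

```python
import numpy as np

# Toy sketch of parallel-agent policy-gradient training (illustrative only;
# GATE, SIGMA, LR and the 1-D "pass near the gate" reward are assumptions).

rng = np.random.default_rng(0)

GATE = 2.0       # hypothetical optimal action (e.g. lateral offset at a gate)
N_AGENTS = 100   # agents simulated in parallel, as in the text
SIGMA = 0.5      # fixed exploration noise of the Gaussian policy
LR = 0.05        # learning rate

theta = 0.0      # policy mean: the only learned parameter in this toy

for step in range(300):
    # each agent samples an action from the current policy...
    actions = rng.normal(theta, SIGMA, size=N_AGENTS)
    # ...and is rewarded for passing close to the gate
    rewards = -(actions - GATE) ** 2
    # baseline-subtracted REINFORCE estimate of d E[reward] / d theta
    advantages = rewards - rewards.mean()
    grad = np.mean(advantages * (actions - theta) / SIGMA ** 2)
    theta += LR * grad

print(theta)  # converges near GATE
```

Averaging the gradient over many parallel agents is what keeps the update low-variance enough to make steady progress, the same reason the real system simulates 100 agents at once.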
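The residual-model idea can likewise be sketched minimally. This is an assumption-laden illustration, not the paper's actual residual models: a nominal simulator ignores drag, synthetic "real flight logs" include quadratic drag, and a linear residual fitted to the simulator's one-step error augments the simulator.

```python
import numpy as np

# Minimal sketch of residual dynamics identification (illustrative; the
# drag-only 1-D dynamics and the linear residual features are assumptions).

rng = np.random.default_rng(1)
DT = 0.01

def nominal_step(v, thrust):
    # idealized simulator: acceleration = thrust, no aerodynamic drag
    return v + DT * thrust

# synthetic "real-world" flight logs: true dynamics include quadratic drag
DRAG = 0.3
v = rng.uniform(0.0, 20.0, size=500)
thrust = rng.uniform(-5.0, 5.0, size=500)
v_next_real = v + DT * (thrust - DRAG * v * np.abs(v))

# residual = what the simulator gets wrong on the logged transitions
residual = v_next_real - nominal_step(v, thrust)

# fit residual ~ w * v|v| + b by least squares
X = np.stack([v * np.abs(v), np.ones_like(v)], axis=1)
w, b = np.linalg.lstsq(X, residual, rcond=None)[0]

def augmented_step(v, thrust):
    # simulator + learned residual, used here in place of fine-tuning data
    return nominal_step(v, thrust) + w * v * np.abs(v) + b

err_nominal = np.abs(nominal_step(v, thrust) - v_next_real).mean()
err_augmented = np.abs(augmented_step(v, thrust) - v_next_real).mean()
print(err_augmented < err_nominal)  # the residual model reduces one-step error
```

Fine-tuning the policy against such an augmented simulator exposes it to the systematic errors it will meet in the real world without requiring dangerous real-world exploration.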
Swift won several races against each of the human champions and achieved the fastest race time on record. This is the first time an autonomous mobile robot has achieved world champion-level performance in a real-world competitive sport.