Design and development of deep learning-based algorithms for object detection and tracking in images acquired by drone

This work was carried out during a curricular internship at MBDA Italia S.p.A., Europe's leading designer and manufacturer of missiles and missile systems. Its aim is to study Deep Learning solutions for detecting and tracking objects in images acquired by drones. Particular attention was paid to the trade-off between model accuracy and inference speed, so that the models could be run on embedded boards and their benefits tested in real time. The dataset chosen for this work is VisDrone 2019, which remains an open challenge because of the difficulties posed by its strong class imbalance and by the size of the objects. The first part of the project presents the results obtained for Multi Object Detection with YOLOv5 (You Only Look Once) and compares them with the state-of-the-art results for the VisDrone 2019 Challenge. The second part addresses Multi Object Tracking in images taken by drones. For this purpose, two algorithms were chained in a cascade: the first uses YOLOv5 for object detection, while the second (StrongSORT) takes the output of YOLOv5 as input and builds a track for each detected object by assigning it an ID. This approach was tested first on a single-person tracking task filmed from a drone, and then on video sequences from the VisDrone Tracking dataset, achieving good results. The last part of the project covers the deployment of the Object Detection algorithms on the NVIDIA Jetson TX2 board, in order to compare execution times and study the behaviour of the networks in real time.
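The cascade described above, in which a detector produces boxes and a tracker turns them into ID-labelled tracks, can be illustrated with a minimal sketch. It is not the StrongSORT algorithm used in this work: it is a deliberately simplified stand-in that associates detections with existing tracks by greedy IoU matching only and omits the motion and appearance models a full tracker relies on. The class name `GreedyIouTracker`, its parameters, and the hard-coded boxes (standing in for per-frame detector output) are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class GreedyIouTracker:
    """Hypothetical toy tracker: IoU-only association, no motion or appearance model."""
    iou_threshold: float = 0.3   # minimum overlap to continue a track (assumed value)
    max_missed: int = 5          # frames a track may go unmatched before removal (assumed value)
    tracks: dict[int, Box] = field(default_factory=dict)
    missed: dict[int, int] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def update(self, detections: list[Box]) -> list[tuple[int, Box]]:
        # Score every (track, detection) pair and visit them from best to worst.
        pairs = sorted(
            ((iou(box, det), tid, j)
             for tid, box in self.tracks.items()
             for j, det in enumerate(detections)),
            reverse=True,
        )
        assigned: dict[int, int] = {}  # detection index -> track ID
        used_tracks: set[int] = set()
        for score, tid, j in pairs:
            if score < self.iou_threshold:
                break
            if tid in used_tracks or j in assigned:
                continue
            assigned[j] = tid
            used_tracks.add(tid)

        # Age the tracks that found no detection and drop the stale ones.
        for tid in list(self.tracks):
            if tid not in used_tracks:
                self.missed[tid] += 1
                if self.missed[tid] > self.max_missed:
                    del self.tracks[tid]
                    del self.missed[tid]

        # Matched detections keep their ID; unmatched ones start a new track.
        results = []
        for j, det in enumerate(detections):
            tid = assigned.get(j)
            if tid is None:
                tid = next(self._ids)
            self.tracks[tid] = det
            self.missed[tid] = 0
            results.append((tid, det))
        return results


# Two consecutive frames with the same two objects listed in a different order.
tracker = GreedyIouTracker()
frame1 = [(10, 10, 50, 50), (100, 100, 140, 140)]
frame2 = [(102, 101, 142, 141), (12, 11, 52, 51)]
ids1 = [tid for tid, _ in tracker.update(frame1)]
ids2 = [tid for tid, _ in tracker.update(frame2)]
print(ids1, ids2)
```

In this toy example each object keeps its ID across the two frames even though the detections arrive in a different order. With real drone footage, small objects and camera motion make overlap-only matching fragile, which is one reason a tracker with motion and appearance cues is preferable.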
