A COMPUTATIONALLY EFFICIENT REAL-TIME VISION-BASED HUMAN TRACKING FRAMEWORK FOR OVERHEAD UAVS

 

Abdul Moati Diab

Mechatronics Engineering, MSc Thesis, 2026

 

Thesis Jury

Prof. Mustafa Ünel (Thesis Advisor)

Assoc. Prof. Kemalettin Erbatur

Assoc. Prof. Ali Fuat Ergenç

 

Date & Time: May 18th, 2026 – 10:00 AM

Place: FASS 2023

 

Keywords: real-time group tracking, computationally efficient tracking, motion-based tracking, optical flow, overhead UAVs

 

 

Abstract

 

Tracking pedestrian groups from overhead UAV video is challenging because pedestrians appear as compact, low-texture regions with weak semantic structure, making appearance-based detection and tracking-by-detection pipelines unreliable under clutter, vehicle interaction, and stop-and-go motion. This thesis proposes a computationally efficient, real-time, motion-based group tracking framework that avoids explicit object detection and builds robust tracking from sparse optical flow and layered temporal reasoning. Shi-Tomasi feature points are initialized inside a region of interest and tracked using pyramidal Lucas-Kanade optical flow. Median-based statistics estimate group velocity. Momentum-based motion gating rejects kinematically inconsistent points, membership-based temporal association with sticky trust preserves support during halts, and coherence filtering removes spatially inconsistent points. These components are integrated with stop-invariant ROI estimation and localized reseeding. The framework is evaluated on 38 real-world overhead sequences from the Vehicle-Crowd Interaction dataset. The full system achieves a mean position error of 0.4228 m, RMSE of 0.4545 m, 95th-percentile error of 0.6759 m, worst-case error of 1.6685 m, and mean IoU of 0.8016. Relative to the 7.62 m average ground-truth ROI diagonal, the mean position error corresponds to 5.55% of the diagonal. Ablations show that momentum gating, membership modeling, and coherence filtering address complementary failure modes. A YOLO+ByteTrack baseline ran at speed comparable to the proposed framework (34.41 vs. 32.83 FPS), but its person tracks were available in only 187 of 1680 frames. CPU RAFT reached 0.45 FPS, while CPU KLT reached 33.45 FPS, a 74.33x speed-up. Together, these results show a lightweight, continuous tracker suited to online UAV use, unlike the discontinuous YOLO+ByteTrack baseline and CPU-heavy RAFT.
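As an illustration of the median-based group velocity estimate and the motion gating described above, the following is a minimal pure-Python sketch, not the thesis implementation: the function name, the Euclidean gating rule, and the tolerance value are hypothetical stand-ins for the actual momentum-based gate.

```python
import math
from statistics import median

def gate_points(displacements, tol=1.5):
    """Estimate group velocity as the per-axis median of point
    displacements, then reject points whose displacement deviates
    from that median by more than `tol` pixels (hypothetical gate).

    displacements: list of (dx, dy) per tracked point for one frame step.
    Returns ((vx, vy), kept_indices).
    """
    # Median is robust to the outlier displacements that motivate gating.
    vx = median(dx for dx, _ in displacements)
    vy = median(dy for _, dy in displacements)
    # Keep only points kinematically consistent with the group motion.
    kept = [i for i, (dx, dy) in enumerate(displacements)
            if math.hypot(dx - vx, dy - vy) <= tol]
    return (vx, vy), kept
```

For example, with three coherent points moving roughly one pixel right and one stray point dragged by a passing vehicle, the stray point falls outside the gate while the median velocity stays near the group motion.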