Dynamic Early-Exit Convolutional Neural Networks for Edge Vision: Do only what and when you need!
Computer vision has seen unparalleled growth in recent years, mainly due to advances in deep convolutional neural networks (CNNs). Deploying such models on edge devices such as tiny autonomous unmanned aerial and terrestrial vehicles remains challenging, however, because of constraints on power and energy, computational resources and memory footprint, and application performance. Safety-critical missions such as search and rescue operations and emergency management heighten these challenges: they impose their own performance constraints on top of the visual constraints that inference faces from occlusions, changing environmental parameters, and dynamic operational and situational context. Current practice is to train the CNN so that it generalizes across all these constraints and then to optimize the model for an embedded processing device, a custom accelerator, or an embedded multi-processor platform. Methods such as pruning (structured and unstructured) and quantization have achieved various levels of success in compressing and optimizing deep CNNs for embedded devices, but they remain subject to the computational boundaries imposed by the host platforms. In this talk I will present our recent work on dynamic deep convolutional neural networks, which leverage the benefits of adaptive inference to target low-power embedded computer vision applications. In particular, dynamic deep convolutional neural networks (dynamic DNNs) offer significant resource savings over static CNNs in edge computer vision by adapting their computation to the complexity of each input. Unlike static models, which process every input with a fixed depth and width, dynamic DNNs selectively activate layers, channels, or early exits, reducing unnecessary computation and memory access.
This conditional execution leads to lower energy consumption, faster inference, and improved efficiency, making dynamic architectures well suited to real-time, resource-constrained edge applications.
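To make the idea of early exits concrete, the following is a minimal sketch in plain Python, not the architecture presented in the talk. It assumes a confidence-threshold exit rule (stop at the first exit whose top softmax probability reaches a threshold), which is one common way to implement conditional execution. The names `early_exit_infer`, `make_toy_stage`, the `threshold` value, and the toy two-class "stages" are all hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage type: maps features to (new features, exit logits).
Stage = Callable[[List[float]], Tuple[List[float], List[float]]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_infer(
    x: Sequence[float], stages: Sequence[Stage], threshold: float = 0.9
) -> Tuple[int, float, int]:
    """Run stages in order and stop at the first sufficiently confident exit.

    Returns (predicted class, confidence, number of stages executed).
    The final stage always produces an answer, whatever its confidence.
    """
    if not stages:
        raise ValueError("stages must be non-empty")
    features = list(x)
    for depth, stage in enumerate(stages, start=1):
        features, logits = stage(features)
        probs = softmax(logits)
        confidence = max(probs)
        # Assumed exit rule: leave as soon as the exit head is confident enough.
        if confidence >= threshold or depth == len(stages):
            return probs.index(confidence), confidence, depth
    raise AssertionError("unreachable")


def make_toy_stage(gain: float) -> Stage:
    """Toy stand-in for a CNN block plus exit head (illustration only).

    Each stage scales the features, and the scaled features double as
    two-class logits, so deeper stages yield sharper predictions.
    """
    def stage(features: List[float]) -> Tuple[List[float], List[float]]:
        scaled = [gain * f for f in features]
        return scaled, scaled
    return stage


stages = [make_toy_stage(2.0) for _ in range(3)]
easy = early_exit_infer([2.0, -2.0], stages)   # clearly separated input
hard = early_exit_infer([0.2, -0.2], stages)   # ambiguous input
print("easy input used", easy[2], "stage(s); hard input used", hard[2])
```

In this toy setup the clearly separated input leaves at the first exit, while the ambiguous input runs all three stages. That is the source of the savings described above: computation is spent only on inputs that need it. In a real dynamic DNN the stages would be convolutional blocks with trained exit classifiers, and the threshold would be tuned against the accuracy and energy budget of the target platform.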