Conventional image sensors rely on frame-based encoding at a fixed frequency, which typically limits their temporal resolution to a few tens of milliseconds. Aiming to overcome this limitation, the project "Neuromorphic smart sensing for low-latency motion segmentation and analysis", made possible by the PPP allowance from Holland High Tech, started in October 2022. The project partners are TU Delft and Prophesee. Developing smart hardware able to process the high throughput of neuromorphic image sensors is an open research challenge addressed within this project.

This project is co-funded by Prophesee, the global market leader in neuromorphic event-based vision sensing and processing for applications in industrial machine vision, mobile, AR/VR, IoT, surveillance and automotive. Headquartered in Paris, France, Prophesee currently has ~120 employees in hardware and software R&D, applications, sales and support functions. The company operates a second R&D facility in Grenoble, France, and has sales and support subsidiaries in Japan, China and California.

About the project

In a radical departure from frame-based processing, neuromorphic vision sensors provide an event-based encoding of temporal contrast changes at the pixel level, which allows them to capture microsecond-range dynamics. However, exploiting the fine-grained temporal dynamics captured by neuromorphic vision sensors is difficult: offloading computation to the cloud forfeits the latency advantage of such sensors, conventional CPU/GPU architectures are ill-matched to spike-based event-driven processing, and current neuromorphic architectures are not optimized for low-latency vision applications.
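To make the contrast with frame-based sensing concrete, here is a minimal Python sketch of what an event-based encoding looks like. The event fields and values are hypothetical illustrations, not Prophesee's actual data format: each pixel independently reports a contrast change with a timestamp and a polarity, so a sparse scene produces only a handful of events rather than full frames.

```python
from dataclasses import dataclass

# Hypothetical event record: each pixel independently reports a contrast
# change with a microsecond timestamp and a polarity (brighter/darker).
@dataclass
class Event:
    x: int          # pixel column
    y: int          # pixel row
    t_us: int       # timestamp in microseconds
    polarity: int   # +1 = contrast increase, -1 = contrast decrease

# A short, hypothetical event stream: only three pixels changed, so only
# three events are emitted. A frame-based sensor would instead transmit
# every pixel of every frame, e.g. one full frame per ~33,000 us at 30 fps.
events = [
    Event(x=12, y=7, t_us=105, polarity=+1),
    Event(x=12, y=8, t_us=118, polarity=+1),
    Event(x=40, y=3, t_us=902, polarity=-1),
]

# Event-driven processing touches only the pixels that changed, at the
# moment they changed, instead of scanning whole frames at a fixed rate.
for ev in events:
    print(f"t={ev.t_us:>4} us: pixel ({ev.x}, {ev.y}) changed ({ev.polarity:+d})")
```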

A new concept and milestone

Within a year, the project has already delivered a new concept that allows for event-driven updates of deep convolutional neural network (CNN) models, a state-of-the-art machine-learning workload for computer vision, directly inside the memory that stores the network parameters and states. By exploiting sparsity in visual information while minimizing memory accesses, the proposed scheme reduces the power and latency requirements of deep CNNs at the edge. A first silicon prototype in 40nm CMOS has been sent to fabrication, with the test phase foreseen to start by May 2024. The prototype aims to enable complex computer vision tasks at the edge within latency and power budgets of 100 µs–1 ms and 1–10 mW, respectively. Future directions include extending the proposed scheme to advanced recurrent neural networks (RNNs) for complex computer vision tasks over long timescales, and moving computation near the sensor to further cut the system-level power budget.
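To convey the intuition behind event-driven CNN updates, the sketch below shows one common way such a scheme can work in principle; it is a simplified software analogy, not the project's in-memory hardware design, and all sizes and names are hypothetical. When an event changes a single input pixel, only the output activations whose receptive field contains that pixel need to be incrementally corrected, so the work scales with event sparsity rather than with frame size.

```python
import numpy as np

# Hypothetical sizes for illustration.
H, W, K = 8, 8, 3                      # input height/width, kernel size
rng = np.random.default_rng(0)
kernel = rng.standard_normal((K, K))
frame = rng.standard_normal((H, W))    # current input state held in memory

def full_conv(x):
    """Dense 'valid' cross-correlation: recomputes every output."""
    out = np.zeros((H - K + 1, W - K + 1))
    for oy in range(out.shape[0]):
        for ox in range(out.shape[1]):
            out[oy, ox] = np.sum(x[oy:oy + K, ox:ox + K] * kernel)
    return out

out = full_conv(frame)                 # output state kept alongside the weights

def event_update(y, x, delta):
    """Event-driven update: an event changes one input pixel by `delta`;
    only outputs whose receptive field contains (y, x) are touched."""
    frame[y, x] += delta
    for i in range(K):
        for j in range(K):
            oy, ox = y - i, x - j
            if 0 <= oy < H - K + 1 and 0 <= ox < W - K + 1:
                out[oy, ox] += delta * kernel[i, j]

# A single event updates at most K*K = 9 outputs instead of all 36.
event_update(4, 5, +0.8)
assert np.allclose(out, full_conv(frame))   # incremental == dense recompute
```

Because the stored output state is patched in place, each event triggers at most K×K multiply-accumulates and memory touches; this is the sparsity-driven saving that the project's in-memory scheme targets in hardware.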

This successful first milestone has already led to a further expansion of the project, with the submission of a second HTSM-TKI project co-funded by Prophesee, which will focus on investigating event-driven processing schemes for emerging machine-learning algorithms such as graph neural networks (GNNs) in near-sensor edge applications.