What has been done?
The real-time performance of a Driver Monitoring System has been maximized on the Qualcomm Snapdragon Automotive Platform. The Driver Monitoring System was developed in-house for demonstration purposes, originally on a personal computer. The goal of this project was to achieve real-time performance on the embedded heterogeneous platform by taking advantage of its processing cores, in particular the DSP.
How is it done?
The project was organized in multiple phases following the ASPICE development model: analysis (of the algorithm and of the embedded platform), implementation, testing and validation.
The driver monitoring application consists of multiple phases with dependencies between them:
- Capturing images (from the camera / file),
- Image pre-processing (grayscale & crop),
- Face detection (resize, Viola-Jones framework),
- Eyes and iris detection (custom algorithm utilizing concentric weight matrices), and
- Drowsiness score calculation (which triggers alerts and/or information to the driver).
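As an illustration of the pre-processing phase (grayscale & crop), here is a minimal sketch in C. It is not the project's code: the `roi_t` type, the function names, the interleaved RGB input layout and the integer luma weights (77, 150, 29, an approximation of the common BT.601 coefficients) are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical crop rectangle, in source-image pixel coordinates. */
typedef struct {
    size_t x, y, width, height;
} roi_t;

/* Integer luma approximation (assumed weights): (77*R + 150*G + 29*B) / 256. */
static inline uint8_t rgb_to_gray(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((77u * r + 150u * g + 29u * b) >> 8);
}

/*
 * Crops `roi` out of an interleaved 8-bit RGB image and writes the result
 * as a tightly packed grayscale image of roi.width * roi.height bytes.
 * Returns 0 on success, -1 if the rectangle does not fit inside the source.
 */
int crop_to_gray(const uint8_t *rgb, size_t src_width, size_t src_height,
                 roi_t roi, uint8_t *gray)
{
    if (roi.x + roi.width > src_width || roi.y + roi.height > src_height)
        return -1;

    for (size_t row = 0; row < roi.height; ++row) {
        const uint8_t *src = rgb + 3 * ((roi.y + row) * src_width + roi.x);
        uint8_t *dst = gray + row * roi.width;
        for (size_t col = 0; col < roi.width; ++col)
            dst[col] = rgb_to_gray(src[3 * col], src[3 * col + 1], src[3 * col + 2]);
    }
    return 0;
}
```

Doing the crop and the color conversion in a single pass means the later phases only touch the smaller single-channel image, which is one plausible reason to place this step before face detection.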
On the other side, there is the Qualcomm Snapdragon Automotive Platform, a heterogeneous platform with a Qualcomm® quad-core Kryo™ CPU, a Hexagon™ 680 DSP and an Adreno™ 530 GPU. The Hexagon™ 680 DSP is a hardware multi-threaded VLIW processor with variable instruction length, and it provides Hexagon Vector Extensions (HVX).
During deployment on the embedded platform, multiple aspects were considered. Real-time performance was achieved by balancing the following aspects:
- Mapping of the algorithm phases to the processing cores, based on the nature of the algorithm operations and the types of processing cores,
- Memory optimizations for improved data sharing between processing cores,
- Optimization of the most time-consuming algorithm phase on the embedded platform, namely DSP optimization of face detection.
The mapping of the Driver Monitoring application phases to the processing cores is shown in the image above.
Memory was optimized to avoid copying data between the different address spaces of the processing units. Since the Android operating system runs on the platform, the ION heap was additionally used for memory exchanged between the CPU, DSP and GPU.
Face detection was the most time-consuming phase and is executed on the DSP, where additional DSP optimization techniques were applied. Some of them are listed below; the first two gave the biggest performance improvements:
- Branch elimination – the Hexagon DSP has a multiplexing instruction that chooses one of two values depending on a given predicate. The branch condition is used as the predicate, and the correct value is selected with the multiplexing instruction instead of a jump.
- Usage of DSP assembly functions and DSP-optimized libraries (e.g. FastCV)
- Usage of inline functions – elimination of the auxiliary instructions that prepare the stack for a function call and clean up afterwards
- Loop unrolling – using the optimization features of the Hexagon LLVM compiler.
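The branch elimination and inlining techniques listed above can be sketched in portable C. This is only an illustration under stated assumptions, not the project's code. The array layout and function names are hypothetical. The weak-classifier form (compare a feature value with a threshold, then add one of two votes) is the usual decision-stump shape in Viola-Jones cascades. The mask-based `select_i32` helper stands in for the hardware multiplexing instruction, which on Hexagon would normally be emitted by the compiler or written in assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy reference: one conditional jump per weak classifier. */
int32_t stage_sum_branchy(const int32_t *feature, const int32_t *threshold,
                          const int32_t *left, const int32_t *right, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (feature[i] < threshold[i])
            sum += left[i];
        else
            sum += right[i];
    }
    return sum;
}

/*
 * Jump-free select: the predicate becomes an all-ones or all-zeros mask
 * that picks one of the two values. Declared inline so no call overhead
 * (stack setup and clean-up) is paid inside the hot loop.
 * Assumes two's-complement int32_t, as on all common targets.
 */
static inline int32_t select_i32(int pred, int32_t if_true, int32_t if_false)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(pred != 0);
    return (int32_t)(((uint32_t)if_true & mask) | ((uint32_t)if_false & ~mask));
}

/* Same result as stage_sum_branchy, with the if/else replaced by a select. */
int32_t stage_sum_branchless(const int32_t *feature, const int32_t *threshold,
                             const int32_t *left, const int32_t *right, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += select_i32(feature[i] < threshold[i], left[i], right[i]);
    return sum;
}
```

Keeping the branchy version next to the optimized one gives a simple reference for checking that the transformation does not change the result.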
Performance and accuracy were monitored continuously during the project, using Hexagon SDK tools, analyzers and simulators to inspect memory resources and execution history. The purpose was to confirm that accuracy stays the same as in the original Driver Monitoring System and to check and track performance improvements. Testing and verification were performed in a laboratory environment with pre-recorded video streams and live camera streams.
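One way such a check could look on pre-recorded streams is sketched below in C. Everything here is hypothetical and serves only to illustrate the idea: the `rect_t` result type, the function names and the assumption that per-frame detections and timings have already been collected into arrays. The source does not describe the actual tooling beyond the Hexagon SDK analyzers.

```c
#include <stddef.h>

/* Hypothetical per-frame detection result (e.g. a face bounding box). */
typedef struct {
    int x, y, width, height;
} rect_t;

static int rect_equal(rect_t a, rect_t b)
{
    return a.x == b.x && a.y == b.y && a.width == b.width && a.height == b.height;
}

/*
 * Counts frames where the optimized build disagrees with the reference
 * (original PC) build. Zero means the accuracy is unchanged on this stream.
 */
size_t count_mismatches(const rect_t *reference, const rect_t *optimized,
                        size_t frames)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < frames; ++i)
        if (!rect_equal(reference[i], optimized[i]))
            ++mismatches;
    return mismatches;
}

/* Returns the slowest frame time in milliseconds, or 0.0 for an empty run. */
double worst_frame_ms(const double *frame_ms, size_t frames)
{
    double worst = 0.0;
    for (size_t i = 0; i < frames; ++i)
        if (frame_ms[i] > worst)
            worst = frame_ms[i];
    return worst;
}
```

Comparing the worst frame time, rather than the average, against the frame budget is a design choice of this sketch: a real-time pipeline has to keep up on every frame, not only on average.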