Project Details
Neural Approximate Accelerator Architecture Optimization for DNN Inference on Lightweight FPGAs (NA^3Os)
Subject Area
Computer Architecture, Embedded and Massively Parallel Systems
Term
since 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 524986327
Deep learning has changed how, and how well, complex technical problems are solved. Many of its advances, most of them from the last decade, have already had a profound impact on systems used in everyday life. The reason is that technical systems are often so complex that it is infeasible to build models accurate enough to serve as a basis for classical optimization techniques. These are the scenarios where Deep Neural Networks (DNNs) shine. The drawback, however, is the high computational demand of processing DNNs. Beyond compute, the requirements for memory and energy, among others, are often very high as well. This proposal presents an approach to successfully deploy DNNs in systems with very limited resources, particularly FPGAs, thus enabling efficient TinyML implementations. Our investigations emphasize a unique combination of compression techniques, such as pruning and quantization, with emerging approximate computing principles. Particularly for FPGAs, we want to investigate the opportunities offered by approximate arithmetic units. Moreover, we want to exploit FPGA-specific resources such as DSPs and BRAMs to provide highly resource- and energy-efficient hardware implementations of DNNs. To the best of our knowledge, this proposal presents the first important steps in optimizing the deployment of DNNs on approximate and reconfigurable hardware. This involves investigating innovative mapping and design space exploration techniques. Combining micro-architectural peculiarities with the approximate computing paradigm promises a controllable trade-off between the quality of DNN results and the computational resources needed. The final goal is a co-search methodology that jointly covers the neural network architecture, its optimization, and the synthesis of approximated DNN accelerators on FPGAs. Further research includes the analysis of DNN robustness and energy trade-offs.
In summary, we propose the first steps toward successfully deploying DNNs on highly resource-constrained FPGA systems while exploiting approximate computing principles.
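As a rough illustration of how pruning, quantization, and approximate arithmetic can interact, the following minimal Python sketch prunes a weight vector by magnitude, quantizes weights and inputs to signed integers, and compares an exact dot product with one built on a truncation-based approximate multiplier. All function names and parameters here are hypothetical, and the truncation multiplier is only a simple stand-in chosen for illustration; it is not the project's accelerator design or its approximate arithmetic units.

```python
def prune(weights, sparsity):
    """Zero the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize(values, bits=8):
    """Symmetric uniform quantization; returns (integers, scale)."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, scale


def approx_mul(a, b, dropped_bits):
    """Illustrative approximate multiply: clear the low bits of each
    operand's magnitude, then multiply. dropped_bits=0 is exact."""
    sign = -1 if (a < 0) != (b < 0) else 1
    ta = (abs(a) >> dropped_bits) << dropped_bits
    tb = (abs(b) >> dropped_bits) << dropped_bits
    return sign * ta * tb


def dot(qw, qx, dropped_bits=0):
    """Integer dot product using the (possibly approximate) multiplier."""
    return sum(approx_mul(w, x, dropped_bits) for w, x in zip(qw, qx))


if __name__ == "__main__":
    # Made-up example values, not measured data.
    weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
    inputs = [0.5, 0.3, -0.8, 0.1, 0.6, -0.2]

    qw, w_scale = quantize(prune(weights, sparsity=0.5))
    qx, x_scale = quantize(inputs)

    exact = dot(qw, qx, dropped_bits=0) * w_scale * x_scale
    for dropped in (0, 1, 2, 3):
        approx = dot(qw, qx, dropped_bits=dropped) * w_scale * x_scale
        print(f"dropped_bits={dropped}: result={approx:.4f}, "
              f"abs error={abs(approx - exact):.4f}")
```

Sweeping `sparsity`, `bits`, and `dropped_bits` in such a toy model gives a feel for the quality-versus-resources trade-off described above; on an actual FPGA the cost side would be measured in terms such as LUT, DSP, and BRAM usage and energy rather than in dropped bits.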
DFG Programme
Research Grants