Project Details

System-Physician-on-a-Chip (SPOC): Chip Health-Monitoring Infrastructure IP and Run-Time Adaptation

Subject Area Computer Architecture, Embedded and Massively Parallel Systems; Electronic Semiconductors, Components and Circuits, Integrated Systems, Sensor Technology, Theoretical Electrical Engineering
Term from 2015 to 2021
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 269744693
 
Final Report Year 2021

Final Report Abstract

Design-time solutions and guard-bands for resilience are no longer sufficient for nanoscale integrated circuits (ICs). Because of process variations, each chip is born with a unique personality (“nature”), and because of its operating conditions, environment, and workload, it grows uniquely (“nurture”). This project is motivated by the need to guarantee that each system behaves acceptably despite differences in nature and nurture (“resilience”). Resilience has been defined as the persistence of a performance level that can justifiably be trusted in the presence of change. Hence, static solutions based on pre-determined adaptation strategies cannot provide adequate resilience as systems evolve over time. While today’s ICs incorporate a large number of sensors (thermal, voltage, delay, etc.) for runtime monitoring, breakthroughs are needed to extract useful information from sensor data, perform real-time analysis, and make decisions about online adaptation. Appropriate reasoning methods are also needed to deal with inconsistent or contradictory sensor data caused by stress-, process-, and workload-induced spatiotemporal variations. It is important to predict the system state so that countermeasures can be taken before a failure occurs. The research focuses on data-driven techniques for guiding dynamic adaptation policies. This level of dynamic decision-making and prediction-based control is a significant step toward resilient systems. The intellectual merit lies in the advancement of data analytics solutions for reasoning about on-chip behavior, the integration of prediction-based adaptation, and the updating of adaptation strategies based on the success, or lack thereof, of past adaptation decisions. The project leads to a health-monitoring infrastructure IP that lets a system respond to changes in behavior occurring at different time scales. A hybrid hardware/software implementation is considered; for real-time decisions (e.g., the response to voltage droop), the IP is designed purely in hardware.
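
As an illustration only, the two ideas of tolerating inconsistent sensor data and predicting system state ahead of a failure can be sketched in a few lines of Python. This is not the project's method: the median fusion, the linear trend fit, and the names `fuse` and `predict_steps_to_limit` are hypothetical choices made for this sketch, and the temperature values are invented.

```python
from statistics import median


def fuse(readings):
    """Fuse redundant sensor readings with a median (assumed policy).

    A median ignores a minority of contradictory readings, which is one
    simple way to cope with inconsistent sensor data.
    """
    return median(readings)


def predict_steps_to_limit(history, limit):
    """Estimate how many sampling steps remain before `limit` is reached.

    Fits a least-squares line through the fused samples in `history`
    (indexed 0..n-1) and extrapolates from the latest sample at the
    fitted slope. Returns None if there are fewer than two samples or
    the trend is not rising; returns 0.0 if the limit is already reached.
    """
    n = len(history)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(history))
    slope = sxy / sxx
    if slope <= 0:
        return None
    latest = history[-1]
    if latest >= limit:
        return 0.0
    return (limit - latest) / slope


# Hypothetical example: three thermal sensors, one of which misreports.
fused_now = fuse([70.0, 71.0, 120.0])
steps_left = predict_steps_to_limit([70.0, 72.0, 74.0, 76.0], limit=80.0)
```

In such a scheme, a software layer could run the slower trend prediction and schedule countermeasures, while fast reactions such as the response to voltage droop would stay in hardware, as the abstract describes.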
