Thesis Defense of Yamilka Toca Díaz

On Tuesday, May 13, at 10:00 in the Department Seminar room of the Ada Byron building, Yamilka Toca Díaz will defend her thesis, "On Microarchitectural Mechanisms to Tolerate Permanent Faults in CNN Accelerators Supplied at Low Voltage", supervised in our Department by Dr. Rubén Gran Tejero and Dr. Alejandro Valero Bresó.


Thesis abstract:

Aggressively reducing the supply voltage (Vdd) below the minimum safe operating voltage (Vmin) has emerged as a promising strategy for achieving substantial power savings in digital CMOS circuits. However, such voltage underscaling introduces significant reliability challenges, primarily due to the increased probability of permanent transistor faults caused by manufacturing process variations in advanced technology nodes. These faults can severely impact any computing device, including modern accelerators for deep learning applications. This dissertation investigates the impact of permanent faults on the accuracy of Convolutional Neural Network (CNN) inference accelerators whose on-chip activation memories operate at Vdd levels significantly below Vmin. Through a comprehensive characterization of fault patterns under these conditions, this work identifies several microarchitectural opportunities to mitigate the adverse effects of such faults on CNN accuracy while achieving notable energy savings.

In particular, this thesis introduces two novel, low-cost microarchitectural techniques, Flip-and-Patch and Shift-and-Safe, designed to enhance the fault resilience of CNN accelerators operating at faulty and ultra-faulty supply voltages, respectively. Flip-and-Patch applies a flip-based data representation to activations with a low number of faults and exploits a small fault-free backup storage for activations with a high number of faults. To maintain CNN accuracy under more aggressive operating conditions, Shift-and-Safe builds on Flip-and-Patch, incorporating a shift-based data representation and leveraging idle memory regions of the accelerator. Both techniques are transparent to the programmer and require no application-specific profiling, making them practical for real-world CNN accelerator implementations.
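As an illustrative aside (not part of the dissertation itself), the intuition behind a flip-based representation can be sketched in software: given a known per-word map of stuck-at cells, each word is stored either directly or bitwise-inverted, whichever leaves fewer corrupted bits on readout. The function names, word size, and fault-map encoding below are hypothetical simplifications; the actual hardware mechanism may differ.

```python
# Illustrative sketch only: simulates stuck-at faults in an 8-bit memory
# word and a flip-based storage decision, assuming a known per-word fault
# map (stuck_at_0 / stuck_at_1 bit masks).
WORD_MASK = 0xFF  # 8-bit words for simplicity

def read_with_faults(stored: int, stuck_at_0: int, stuck_at_1: int) -> int:
    """Return the word as read out, with stuck-at cells overriding the data."""
    return ((stored & ~stuck_at_0) | stuck_at_1) & WORD_MASK

def flip_and_store(value: int, stuck_at_0: int, stuck_at_1: int):
    """Store `value` directly or bitwise-inverted, whichever yields fewer
    bit errors on readout. Returns (recovered_value, flip_flag); in hardware
    the flip flag would be one extra metadata bit per word."""
    direct = read_with_faults(value, stuck_at_0, stuck_at_1)
    # When stored inverted, the readout is inverted back to recover the value.
    recovered_inv = ~read_with_faults(~value & WORD_MASK,
                                      stuck_at_0, stuck_at_1) & WORD_MASK
    bit_errors = lambda r: bin(r ^ value).count("1")
    if bit_errors(direct) <= bit_errors(recovered_inv):
        return direct, False
    return recovered_inv, True

# One cell stuck at 0 (bit 7) and one stuck at 1 (bit 0): the inverted word
# happens to match both faulty cells, so the value is recovered exactly.
recovered, flipped = flip_and_store(0b10110110,
                                    stuck_at_0=0b10000000,
                                    stuck_at_1=0b00000001)
print(f"{recovered:08b} flipped={flipped}")  # prints "10110110 flipped=True"
```

When flipping cannot reduce the error count enough (many faulty cells per word), the "Patch" side of the technique described above redirects the activation to a small fault-free backup storage instead.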
The key idea behind both techniques is that CNN accuracy exhibits a certain resilience to deviations in stored activation values during inference, making it unnecessary to restore exact values. Instead, the focus is on minimizing the deviation from the intended value.

Experimental results show that Flip-and-Patch, supplying activation memories at a faulty voltage of 0.54 V, sustains the original accuracy of CNNs with a minimal impact on system performance (less than 0.05% for every CNN application), achieving average energy savings of 6.2% and 54.0% in on-chip activation memories compared to conventional memories operating at Vmin (0.6 V) and nominal (0.9 V) voltages, respectively. In addition, Shift-and-Safe preserves the original CNN accuracy even when nearly a quarter of all activations are faulty at an ultra-faulty Vdd of 0.52 V. In this case, the average system performance degradation is only 1.6%, and the average energy savings are 4.9% and 11.1% compared to Flip-and-Patch and a conventional memory supplied at Vmin, respectively. The energy savings grow to 58.9% compared to supplying memories at the nominal 0.9 V.

Overall, the experimental results demonstrate that the proposed techniques enable significant energy savings while preserving the accuracy of CNN applications and the system performance of CNN accelerators, contributing to the design of energy-efficient and reliable neural network hardware accelerators in next-generation artificial intelligence systems.