Presentation

Maintaining Sanity: Algorithm-based Comprehensive Fault Tolerance for CNNs
Description
As neural networks are increasingly deployed in safety-critical applications, it becomes imperative that they perform consistently and dependably in the presence of hardware faults. Several schemes have been proposed to protect neural networks, but they suffer from high overheads or insufficient fault coverage. This paper presents Maintaining Sanity, a comprehensive and efficient protection technique for CNNs. Maintaining Sanity extends the state-of-the-art algorithm-based fault tolerance for CNNs, using Hamming codes and checkpointing to correct over 99.6% of critical faults with about 72% runtime overhead and minimal memory overhead compared to traditional triple modular redundancy (TMR) techniques.
Event Type
Research Manuscript
Time
Tuesday, June 25, 11:15am - 11:30am PDT
Location
3012, 3rd Floor
Topics
Embedded Systems
Keywords
Time-Critical and Fault-Tolerant System Design