Notes from the Wired

One-Shot Online Testing of Deep Neural Networks Based on Distribution Shift Detection

May 7, 2026 | 632 words | 3min read

Paper Title: One-Shot Online Testing of Deep Neural Networks Based on Distribution Shift Detection

Link to Paper: https://arxiv.org/abs/2305.09348

Date: May 16, 2023

Paper Type: Neuromorphics, Circuits, Fault Detection

Short Abstract: The paper proposes a “one-shot” online testing method for deep neural networks (NNs) implemented on memristor-based computation-in-memory (CiM) hardware. The goal is to detect hardware faults or analog variations using only a single test input (one forward pass). Instead of checking individual weights or using many test patterns, the method detects changes in the output distribution of the network.

1. Introduction

Deep learning accelerators built with memristive devices (e.g. Resistive Random-Access Memory (ReRAM), Phase Change Memory (PCM), Spin-Transfer Torque MRAM (STT-MRAM)) are:

  • energy-efficient, since the weighted-sum computation happens directly in memory
  • fast, avoiding data movement between the processing elements and the memory

But they suffer from:

  • hardware faults (e.g., stuck-at faults in the memristive cells)
  • analog variations and device non-idealities

These faults can degrade inference accuracy. Therefore, testing such hardware systems is crucial to ensure their reliability and correct functionality, especially in safety-critical applications.

Traditional approaches:

  • functional testing with many test patterns (slow, high test-application overhead)
  • checking individual weights (requires access to the array internals)

This paper proposes a new one-shot testing method that treats the NN as a black box, is fast, needs only a single test vector, and consistently achieves 100% test coverage.

2. Preliminaries

Memristive devices can be arranged into crossbar arrays, with a memristive device at each cross point. The weighted-sum computation required for the inference stage of the NN can therefore be carried out directly in memory, by leveraging Ohm’s Law (V = IR) and Kirchhoff’s Current Law, in constant O(1) time and without any data movement between the processing element and the memory.

Finally, an Analog-to-Digital Converter (ADC) circuit digitizes the sensed currents. After that, the remaining computations, e.g., bias addition, batch normalization, and non-linear activations, are performed in the digital domain.

Figure: Mapping an NN layer onto a memristor-based crossbar array.
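To make the analog dataflow concrete, here is a minimal numpy sketch of an idealized crossbar: weights are stored as conductances, inputs are applied as row voltages, and the column currents realize the weighted sums in one step before a toy ADC digitizes them. The sizes, value ranges, and the uniform quantizer are illustrative assumptions, not the paper’s implementation.

```python
import numpy as np

def crossbar_mvm(G, v):
    """Idealized crossbar: Ohm's law gives I = G * V at each cross point;
    Kirchhoff's current law sums the currents along each column."""
    return G.T @ v  # column currents = weighted sums, computed in O(1) analog time

def adc(i, n_bits=8, i_max=1.0):
    """Toy uniform ADC: clip and quantize the sensed column currents."""
    levels = 2 ** n_bits - 1
    q = np.round(np.clip(i, 0.0, i_max) / i_max * levels)
    return q / levels * i_max

rng = np.random.default_rng(0)
G = rng.uniform(0.0, 0.1, size=(4, 3))  # conductances encoding a 4x3 weight block
v = rng.uniform(0.0, 0.5, size=4)       # input voltages (one per row)
print(adc(crossbar_mvm(G, v)))          # digitized weighted sums (one per column)
```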

3. Methodology

3.1 Motivation

Observation:

Faults in memristive weights → changes in activations → changes in final output distribution
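To make this chain concrete, here is a toy numpy sketch (the fault model, fault rate, and layer size are illustrative assumptions) showing how stuck-at-zero faults in a weight matrix already shift the output statistics of a single layer; in a deep network such shifts propagate to the final output distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(0.0, 0.1, size=(256, 256))   # healthy layer weights
x = rng.normal(0.0, 1.0, size=256)          # an arbitrary input

W_faulty = W.copy()
mask = rng.random(W.shape) < 0.02           # toy model: 2% stuck-at-zero cells
W_faulty[mask] = 0.0

y_ok, y_bad = W @ x, W_faulty @ x
print(f"healthy: mean={y_ok.mean():+.4f}, std={y_ok.std():.4f}")
print(f"faulty : mean={y_bad.mean():+.4f}, std={y_bad.std():.4f}")
```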

So instead of testing weights directly, they:

  1. Feed a single specially optimized input
  2. Observe the NN output distribution
  3. Compare it to an expected “normal” distribution

3.2 Deviation Detection Method

Force the NN output to behave like a unit Gaussian distribution \( \mathcal{N}(0, 1) \).

This is achieved with a specially optimized input vector (see Section 3.3). Because the expected output distribution is then fully known, any deviation from it is easy to detect.

They compute the divergence between:

  • the observed output distribution \( P \) on the (possibly faulty) hardware
  • the expected unit Gaussian \( Q = \mathcal{N}(0, 1) \)

using the Kullback–Leibler divergence:

\[ D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)} \]

When both distributions are Gaussian, this has the closed form \( D_{\mathrm{KL}} = \frac{1}{2}\left(\sigma^2 + \mu^2 - 1 - \ln \sigma^2\right) \).
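A minimal sketch of this detection step, assuming the output under the test vector is approximately Gaussian so the closed form above applies; the model interface and the threshold value are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def kl_to_unit_gaussian(y):
    """Closed-form KL( N(mu, var) || N(0, 1) ) from the empirical
    mean and variance of the network's output values."""
    mu, var = y.mean(), y.var()
    return 0.5 * (var + mu**2 - 1.0 - np.log(var))

def one_shot_test(model, x_test, threshold=0.05):
    """One forward pass: flag the hardware as faulty if the output
    distribution has drifted away from the expected unit Gaussian."""
    y = np.asarray(model(x_test)).ravel()   # single inference on the device
    return kl_to_unit_gaussian(y) > threshold

# Toy usage: a fault-free 'device' whose output already looks ~N(0, 1).
healthy = lambda _: np.random.default_rng(2).normal(0.0, 1.0, size=1000)
print(one_shot_test(healthy, x_test=None))  # False -> no fault detected
```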

3.3 One-shot test vector generation

They train a single input image-like tensor using gradient descent so that:

  • the network’s output under this one input follows a unit Gaussian distribution

Key properties:

  • only a single test vector and a single forward pass are needed
  • the NN is treated as a black box, with no weight readout or internal access

The “test vector” is an input image-like tensor (shape like [H, W, C]), initialized randomly (Gaussian noise or even real stock images) and then optimized using gradient descent. The goal of this optimization is not accuracy, but forcing the output into a standard Gaussian-like structure: the network is not trained, the input is trained.

Process:

  1. Start with random input image \( \bar{x} \)
  2. Forward pass through the network → get output \( \hat{y} \)
  3. Compute loss:
    • how far the output is from the desired statistical target
    • the KL divergence is used as the loss
  4. Backpropagate gradients to the input
  5. Update the input image:
    • \( \bar{x} \leftarrow \bar{x} - \alpha \nabla_{\bar{x}} L \)
  6. Repeat until convergence
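A compact PyTorch sketch of this loop, under stated assumptions: the model is frozen, the statistical target is a unit Gaussian over the flattened outputs, the loss is the closed-form KL from Section 3.2, and all hyperparameters are illustrative.

```python
import torch

def generate_test_vector(model, shape=(1, 3, 32, 32), steps=500, lr=0.1):
    """Optimize a single image-like input so that the frozen model's
    output distribution approaches N(0, 1)."""
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)                  # the network is NOT trained

    x = torch.randn(shape, requires_grad=True)   # the input IS trained
    opt = torch.optim.SGD([x], lr=lr)            # implements x <- x - lr * grad

    for _ in range(steps):
        y = model(x).flatten()
        mu, var = y.mean(), y.var()
        # Closed-form KL( N(mu, var) || N(0, 1) ) as the training loss
        loss = 0.5 * (var + mu**2 - 1.0 - torch.log(var))
        opt.zero_grad()
        loss.backward()                          # gradients flow back to the input
        opt.step()
    return x.detach()
```

In practice one would generate this vector once offline on a fault-free reference model, store it together with the expected output statistics, and then reuse both for every online test.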

4. Results

They evaluate the method on several large NN models.

Main findings:

  • a single optimized test vector, i.e., one forward pass, is sufficient for online testing
  • the method consistently achieves 100% test coverage while treating the NN as a black box
