Notes from the Wired

Surrogate Gradient Learning in Spiking Neural Networks

October 10, 2025 | 886 words | 5 min read

Paper Title: Surrogate Gradient Learning in Spiking Neural Networks

Link to Paper: https://arxiv.org/abs/1901.09948

Date: 3 May 2019

Paper Type: Neuromorphic, Architecture, RNN, SNN, Training

Short Abstract: This paper provides an overview of the problems encountered when training Spiking Neural Networks (SNNs) and the different approaches proposed to solve them.

1. Introduction

Our brains are highly efficient, so taking inspiration from them is a natural approach. Recurrent Neural Networks (RNNs) have become very powerful at solving noisy time-series prediction problems, and they share structural similarities with biological neural circuits. Building on these similarities, spiking networks of leaky integrate-and-fire (LIF) neurons have been developed and are widely used today.

RNNs are difficult to train because of noise and the challenge of capturing long-range temporal and spatial dependencies. The problem is even worse in SNNs: network depth is of utmost importance for solving complex tasks, so an effective training procedure for multi-layer SNNs is crucial.

2. Understanding SNNs as RNNs

When the authors use the term RNN, they mean it in the broadest sense: not only networks with explicit connections that carry information from past to future time steps, but also networks with temporal dynamics within a single neuron, i.e., a neuron that can “remember” information.

A LIF neuron can be described as:

$$ U_i[n+1] = \beta U_i[n] + I_i[n] - S_i[n] $$

The state of neuron i is determined by the synaptic current I_i, which represents the input, and the membrane voltage U_i, which represents the state; β is the leak (decay) factor, and S_i[n] is the output spike, emitted when the membrane crosses the firing threshold and subtracted to reset the membrane afterwards. Since the neuron “remembers” previous states through U_i[n], it can be considered a type of RNN.
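To make this recurrence concrete, here is a minimal NumPy sketch of the discrete-time LIF update above (my own illustration, not code from the paper). The decay factor, input current, and simulation length are arbitrary illustrative values, and the threshold is normalized to 1 so the reset term is simply S[n]:

```python
import numpy as np

def simulate_lif(I, beta=0.9, threshold=1.0):
    """Simulate a single discrete-time LIF neuron.

    I: input current per time step n.
    Returns the membrane potential U[n] and the spike train S[n].
    """
    T = len(I)
    U = np.zeros(T + 1)   # membrane potential, U[0] = 0
    S = np.zeros(T)       # emitted spikes
    for n in range(T):
        S[n] = float(U[n] >= threshold)        # spike if the membrane crosses the threshold
        U[n + 1] = beta * U[n] + I[n] - S[n]   # leak + input - reset (unit threshold)
    return U[1:], S

# Example: a constant input current drives regular spiking.
U, S = simulate_lif(np.full(50, 0.3))
print(S)
```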

3. Training RNNs

To train a powerful RNN, the standard recipe is gradient descent on a task loss, with the gradients computed by an algorithm such as backpropagation through time (BPTT).

Gradient descent, in particular, is able to solve the credit assignment problem, which can be separated into:

  1. Spatial credit assignment: determining which neurons and synapses contributed to the error
  2. Temporal credit assignment: determining which past activity contributed to the current error

BPTT has a space complexity of $O(NT)$, where $N$ is the number of neurons per layer and $T$ is the number of time steps.
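As a rough illustrative calculation (my own numbers, not from the paper): a layer of $N = 1000$ neurons simulated for $T = 1000$ time steps requires storing on the order of $NT = 10^6$ state values for the backward pass, and this cost grows linearly with the length of the simulation.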

4. Credit Assignment with SNNs

There are two main challenges when applying RNN training methods to SNNs:

  1. The non-differentiability of the SNN, more precisely of the Heaviside step function that generates the spikes: its derivative is zero almost everywhere, so naive backpropagation provides no learning signal (see the sketch after this list)
  2. BPTT is very expensive in terms of computation and memory, making it poorly suited for neuromorphic processors
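To see the first challenge concretely, here is a tiny numerical sketch (my own illustration, not code from the paper): the derivative of the Heaviside spike function, estimated by finite differences, is zero everywhere except at the threshold itself, where it blows up, so gradient descent receives no useful learning signal.

```python
import numpy as np

def heaviside(u):
    """Hard spike function: 1 if the membrane potential exceeds the threshold (here 0)."""
    return (u > 0).astype(float)

u = np.linspace(-2.0, 2.0, 9)   # membrane potentials around the threshold
eps = 1e-6
# Central finite-difference estimate of dS/dU.
grad = (heaviside(u + eps) - heaviside(u - eps)) / (2 * eps)
print(grad)   # zero everywhere except right at the threshold, where it diverges
```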

To overcome these challenges, several approaches have been proposed:

  1. Using entirely biologically inspired local learning rules
  2. Translating conventional neural networks to SNNs
  3. Smoothing the network to make it differentiable
  4. Defining surrogate gradients as relaxations of real gradients

This paper focuses primarily on methods (3) and (4).

5. Smoothed SNNs

We can further categorize smoothing approaches into:

  1. Soft nonlinearity models, which replace the hard spiking nonlinearity with a smooth one (see the sketch after this list)
  2. Probabilistic models, for which gradients are well defined in expectation
  3. Models that rely entirely on rate codes
  4. Models that rely on single-spike timing codes
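As an illustration of the first category (a minimal sketch of my own, not code from the paper), the hard spike can be replaced by a steep sigmoid of the membrane potential; gradients then flow everywhere, at the price of graded rather than binary outputs. The slope value is an arbitrary illustrative choice.

```python
import torch

def soft_spike(u, slope=5.0):
    """Soft nonlinearity: a steep sigmoid in place of the Heaviside step."""
    return torch.sigmoid(slope * u)

u = torch.linspace(-2.0, 2.0, 5, requires_grad=True)
soft_spike(u).sum().backward()
print(u.grad)   # smooth, nonzero gradients everywhere
```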

6. Surrogate Gradients

Instead of changing the model itself, surrogate gradient methods keep the binary spiking nonlinearity in the forward pass and replace only its derivative with a smooth surrogate during the backward pass. This yields a biased but usable gradient that allows standard gradient-based training while preserving true spiking behavior.
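Here is a minimal PyTorch sketch of the idea (my own illustration, not code from the paper): the forward pass is the hard Heaviside spike, while the backward pass uses the derivative of a fast sigmoid, 1 / (scale * |u| + 1)^2, as a surrogate. The scale value is an arbitrary illustrative choice.

```python
import torch

class SurrGradSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, smooth surrogate in the backward pass."""

    scale = 10.0  # steepness of the surrogate (illustrative)

    @staticmethod
    def forward(ctx, u):
        ctx.save_for_backward(u)
        return (u > 0).float()                 # hard spike: 1 if membrane above threshold

    @staticmethod
    def backward(ctx, grad_output):
        (u,) = ctx.saved_tensors
        # Derivative of a fast sigmoid, used in place of the
        # (zero almost everywhere) Heaviside derivative.
        surrogate = 1.0 / (SurrGradSpike.scale * u.abs() + 1.0) ** 2
        return grad_output * surrogate

spike_fn = SurrGradSpike.apply

u = torch.linspace(-2.0, 2.0, 5, requires_grad=True)
spike_fn(u).sum().backward()
print(u.grad)   # nonzero gradients, largest near the threshold
```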

7. Applications

Some applications of smoothed or surrogate gradient methods for SNNs include:
