Zeroth-Order Spiking Neural Network Training

For this project, I designed an end-to-end training framework for spiking neural networks (SNNs). SNNs are inherently non-differentiable because the Heaviside step function used to generate spikes introduces a discontinuity, which makes first-order gradient methods unsuitable. To circumvent this problem, I used a zeroth-order estimate of the step function's gradient, a technique called LocalZO introduced in this paper.
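As a rough illustration of the idea, the sketch below implements a Heaviside spike function whose backward pass uses a LocalZO-style two-point zeroth-order estimate, averaged over a few Gaussian perturbations. It assumes a PyTorch implementation; the class name, the perturbation scale `delta`, and the sample count are illustrative choices, not the exact values used in the project.

```python
import torch

class LocalZOHeaviside(torch.autograd.Function):
    """Heaviside spike function with a zeroth-order surrogate gradient.

    Forward: standard step function (spike when the membrane potential exceeds 0).
    Backward: a LocalZO-style two-point estimate of the step function's derivative,
    averaged over a few Gaussian perturbations of the input.
    """

    @staticmethod
    def forward(ctx, u, delta=0.05, num_samples=4):
        ctx.save_for_backward(u)
        ctx.delta = delta
        ctx.num_samples = num_samples
        return (u > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (u,) = ctx.saved_tensors
        delta, n = ctx.delta, ctx.num_samples
        grad = torch.zeros_like(u)
        for _ in range(n):
            z = torch.randn_like(u)
            # Two-point estimate: (H(u + dz) - H(u - dz)) / (2d) * z = |z| * 1{|u| < d|z|} / (2d)
            grad += z.abs() * (u.abs() < delta * z.abs()).float() / (2 * delta)
        grad /= n
        # Gradients only flow to u; delta and num_samples are treated as constants.
        return grad_output * grad, None, None
```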

Because the zeroth-order gradient estimate has high variance, I introduced a smaller recurrent (LSTM) network that learns to reduce that variance, a technique known as meta-learning. The coordinate-wise LSTM architecture I use is described in this paper.
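The sketch below shows one plausible way to structure such a coordinate-wise LSTM optimizer in the spirit of the referenced architecture: every parameter coordinate is processed by the same LSTM weights but keeps its own hidden state, and the LSTM maps a noisy per-parameter gradient estimate to a parameter update. The class name, hidden size, and input features are assumptions for illustration, not the exact configuration I used.

```python
import torch
import torch.nn as nn

class CoordinatewiseLSTMOptimizer(nn.Module):
    """Coordinate-wise LSTM that maps a (noisy) gradient estimate to a parameter
    update. All coordinates share the same LSTM weights; each coordinate carries
    its own hidden state, so the module scales to large parameter vectors.
    """

    def __init__(self, hidden_size=20):
        super().__init__()
        self.lstm = nn.LSTMCell(input_size=1, hidden_size=hidden_size)
        self.out = nn.Linear(hidden_size, 1)

    def forward(self, grad, state=None):
        # grad: flat tensor of per-parameter gradient estimates, shape (P,)
        x = grad.view(-1, 1)  # treat each coordinate as a separate batch element
        if state is None:
            h = x.new_zeros(x.size(0), self.lstm.hidden_size)
            c = x.new_zeros(x.size(0), self.lstm.hidden_size)
        else:
            h, c = state
        h, c = self.lstm(x, (h, c))
        update = self.out(h).view_as(grad)  # proposed update, same shape as grad
        return update, (h, c)
```

In this style of meta-learning, the SNN's parameters are moved by the LSTM's output at each step, and the LSTM's own weights are trained by unrolling several steps of the inner training loop and backpropagating the accumulated loss into the LSTM.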

After implementing my design, I tested my approach on the MNIST digit-classification dataset. I trained linear and convolutional spiking models of various sizes, benchmarking my results against widely used first-order optimizers such as Adam and AdaGrad. My results demonstrate fast convergence and low loss on certain networks with around 300k parameters.
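For concreteness, the sketch below gives one plausible shape for a linear spiking classifier of roughly that scale on MNIST, built on the `LocalZOHeaviside` function sketched earlier. The layer widths, leak factor, number of time steps, and rate-coded readout are illustrative assumptions, not the exact configurations benchmarked.

```python
import torch
import torch.nn as nn

class SpikingMLP(nn.Module):
    """Fully connected spiking network for MNIST (784-384-10, ~305k parameters)
    with leaky integrate-and-fire dynamics and the zeroth-order spike surrogate
    sketched above. beta is the membrane leak, T the number of simulation steps.
    """

    def __init__(self, hidden=384, beta=0.9, T=25):
        super().__init__()
        self.fc1 = nn.Linear(784, hidden)
        self.fc2 = nn.Linear(hidden, 10)
        self.beta, self.T = beta, T

    def forward(self, x):
        batch = x.size(0)
        x = x.view(batch, -1)
        mem1 = torch.zeros(batch, self.fc1.out_features, device=x.device)
        mem2 = torch.zeros(batch, self.fc2.out_features, device=x.device)
        out = 0.0
        for _ in range(self.T):
            mem1 = self.beta * mem1 + self.fc1(x)
            spk1 = LocalZOHeaviside.apply(mem1 - 1.0, 0.05, 4)  # spike at threshold 1.0
            mem1 = mem1 - spk1                                  # soft reset after a spike
            mem2 = self.beta * mem2 + self.fc2(spk1)
            out = out + mem2                                    # accumulate output potential
        return out / self.T                                     # rate-coded logits
```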