It’s Time for AI to Perceive Time.

Nariman Mammadli
7 min read · Sep 14, 2020
The Mutilation of Uranus (god of the sky) by Kronos (god of time), by Giorgio Vasari and Cristofano Gherardi, 16th century, Palazzo Vecchio, Florence. (Public domain)

Spiking Neural Networks (SNN) have recently become a topic of interest in the field of Artificial Intelligence. The premise behind SNN is that neurons in the brain, unlike our current models of them, communicate with one another via spike trains that occur at different frequencies and timings. Another way of visualizing the workings of natural neural networks is to imagine a pond in which waves interact with one another, forming a variety of patterns. The crucial advantage of SNN is the ability to encode time more meaningfully by making use of the relative timings of the spikes.

New hardware and mathematical methods are being developed in the research community to make SNN practical. I argue that the existing deep learning framework is already capable of achieving what SNN promises, without the need for new hardware or new math. By using complex numbers instead of real numbers for neuronal activations, for weights, and for the input data representation, the current deep learning framework can encode time as meaningfully as SNN.

Phase delta and time

In the current deep learning framework, the encoding of time remains clumsy. We encode it either explicitly, as a numerical feature, or implicitly, via sequential data; the latter applies only to recurrent neural networks. However, even in recurrent networks, time does not fully enter the picture, because ordering alone does not encode the relative time distances between input points.

In physics, interference is a phenomenon in which two waves superimpose to form a resultant wave of greater, lower, or the same amplitude. The shape of the resulting wave, assuming equal frequencies, depends on the phase delta. Check out this simulation to get a visual understanding.

Figure 1. Two waves at different phases.
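
To see how the phase delta shapes the resultant wave, here is a minimal NumPy sketch of two equal-frequency cosine waves being superimposed (the frequency and phase values are arbitrary choices for illustration):

```python
import numpy as np

t = np.linspace(0, 1, 1000)      # one second of "time"
f = 5.0                          # both waves share the same frequency (Hz)

phase_delta = np.pi              # try 0 (constructive) vs. pi (destructive)
wave_a = np.cos(2 * np.pi * f * t)
wave_b = np.cos(2 * np.pi * f * t + phase_delta)

resultant = wave_a + wave_b
print(np.abs(resultant).max())   # ~2.0 when in phase, ~0.0 when fully out of phase
```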

The interference dynamics of waves can be used to encode time. They can also be useful for other problems in deep learning that are not directly related to time. For a Convolutional Neural Network (CNN), the mere presence of eyes, a nose, and a mouth can be enough to declare something a face; the relative spatial relations between these components and their orientations do not trigger the network to change its decision. G. Hinton's capsule networks idea aims at solving this problem in deep learning.

SNN, with its neuromorphic hardware, is well suited to exploit wave dynamics to solve these issues in AI. However, there is another option to consider before taking that expensive route. Complex numbers are used heavily in physics and engineering to formulate and model wave dynamics. Therefore, their use in deep learning can be a shortcut to teaching AI about time.

Complex numbers to encode waves

I start with a simple assumption: all the waves we are going to deal with are pure cosine or sine waves of the same frequency, and they differ only in amplitude and phase. I will revisit this assumption later in the discussion. Amplitude and phase can be represented together by a single complex number.

Figure 2. The complex number 1 + 3i in the complex plane.

1 + 3i has a magnitude of √10, which encodes the amplitude of the wave. Its angle θ (≈ 71.6°) encodes the phase of the wave, defined as the angular distance from some reference point. Now, let's see what follows if neuronal activations and weights are complex numbers instead of real ones. This switch is not just a math trick; it changes the way we think about neural networks.
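
As a small sketch of this encoding (the numbers match the 1 + 3i example above; the frequency is an arbitrary choice, since all waves are assumed to share it):

```python
import numpy as np

z = 1 + 3j                           # complex representation of the wave
amplitude = np.abs(z)                # sqrt(10) ≈ 3.162
phase = np.angle(z)                  # ≈ 1.249 rad ≈ 71.6 degrees

# The pure cosine wave this complex number stands for, at the shared frequency f:
f = 1.0
t = np.linspace(0, 2, 1000)
wave = amplitude * np.cos(2 * np.pi * f * t + phase)
```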

Implications of using complex numbers in neural networks

The switch from real to complex numbers changes the way neurons excite and inhibit one another. The relative phase differences between the signals arriving at a neuron affect its output, even when the input amplitudes remain the same. A neuron can inhibit its neighbor by sending a signal that is out of phase with the other inputs the neighbor receives (see Figure 3).

Figure 3. If neuron A sends a strong signal which is out of phase with a+bi and c+di, it will weaken the final activation of B.

One might argue that the same inhibitory effect is achieved in real-valued networks through edges that carry negative weights. The problem with a negative weight is that such an edge inhibits every neuron it connects to, unlike a complex-valued edge, which can appear inhibitory to one neuron and excitatory to another. A single edge producing different outcomes depending on which neuron it is connected to can result in richer data representations.
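
Here is a minimal sketch of that asymmetry, assuming a toy activation that is simply the magnitude of the complex-valued weighted sum (real complex-valued networks use more careful non-linearities, and the specific input values below are made up for illustration):

```python
import numpy as np

# The same signal sent by neuron A along a single edge (weight already folded in):
signal_a = 2.0 * np.exp(1j * np.pi)          # strong signal at phase pi

# Neuron B's other inputs sit near phase 0, so A's signal arrives out of phase.
inputs_b = np.array([1.0 + 0.2j, 1.2 - 0.1j, signal_a])
# Neuron C's other inputs sit near phase pi, so the very same signal is in phase.
inputs_c = np.array([-1.0 + 0.1j, -0.8 - 0.2j, signal_a])

act_b = np.abs(inputs_b.sum())   # small: destructive interference (inhibition)
act_c = np.abs(inputs_c.sum())   # large: constructive interference (excitation)
print(act_b, act_c)
```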

Feeding input data in waves

Input data must also be represented as waves to make full use of complex-valued neural networks. The Fourier transform interprets the given data as a wave in space and time and represents it as a sum of pure cosine and sine waves. These constituent pure waves have different phases, amplitudes, and frequencies. For example, the Fourier transform of a 2-D image (pixel values spread over 2-D space) returns phase and magnitude maps spread over frequency space, whose coordinates are frequency values. By superimposing the magnitude (or amplitude) and phase maps, we get a matrix of complex numbers (see Figure 4).

Figure 4. Fourier transform of a sample image. Taken from https://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm on Sep 12, 2020. The Fourier transform treats the input as a complex wave over the two-dimensional pixel space and disassembles it into pure waves spread in all possible directions over that same space. The magnitude and phase maps contain all the necessary information about these constituent pure waves.

Here, the coordinate of a point [f1, f2] denotes the frequency and the direction of a 2-D pure wave (see Figure 5). The further a point is from the center, the higher the frequency of the pure wave it represents. The original image can be reconstructed from the amplitude and phase information. There is an ongoing discussion on whether, and under what conditions, the phase map contains more information about the data than the magnitude map, or vice versa ([1], [2], [3]). This discussion is reminiscent of the problems that Hinton's capsule networks aim to solve.

Figure 5. The matrix of complex numbers from Figure 4 in more detail. Every point denotes the direction and frequency of a pure wave. The complex value at that coordinate represents the amplitude and the phase of this wave.
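
A short sketch of this decomposition using NumPy's FFT; the gradient image below is just a stand-in for any grayscale pixel array:

```python
import numpy as np

# Stand-in for a grayscale image (any 2-D array of pixel values works).
image = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))

spectrum = np.fft.fft2(image)      # matrix of complex numbers in frequency space
magnitude = np.abs(spectrum)       # amplitude of each constituent pure 2-D wave
phase = np.angle(spectrum)         # phase of each constituent pure 2-D wave

# Superimposing magnitude and phase recovers the complex matrix, and the
# inverse transform recovers the original image.
reconstructed = np.fft.ifft2(magnitude * np.exp(1j * phase)).real
print(np.allclose(reconstructed, image))   # True (up to floating-point error)
```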

The Fourier transform also solves our problems with temporal data. Instead of appealing to ad-hoc methods such as padding, or feeding sparse sequential data to encode the relative time distances between points, the Fourier transform gives us a compact way of encoding these relative timings in the form of phase maps. It also saves us the trouble of deciding on a sequence length for an RNN, because it maps data in the time domain to data in the frequency domain. Higher frequencies generally correspond to noise, and most of the information is concentrated in the low frequencies. So, instead of deciding on some arbitrary sequence length, we can decide on a frequency threshold or a frequency band, which has an intuitive meaning. One might argue that this way of encoding sequential data could diminish the need for RNNs altogether. Coincidentally, RNNs are already giving way to transformer networks.
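
As a hedged sketch of that idea on a 1-D time series (the signal, the sample count, and the threshold of 10 frequency bins are all arbitrary example values):

```python
import numpy as np

# An irregular-looking time series: a slow oscillation plus noise, 256 samples.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256)
signal = np.sin(2 * np.pi * 3 * t) + 0.1 * rng.standard_normal(256)

spectrum = np.fft.rfft(signal)     # complex coefficients, one per frequency bin
threshold = 10                     # keep only the 10 lowest-frequency bins
compact = spectrum[:threshold]     # compact complex-valued input for the network

# The discarded high frequencies mostly carry noise; the kept low frequencies
# still reconstruct the overall shape of the series.
padded = np.concatenate([compact, np.zeros(len(spectrum) - threshold)])
approx = np.fft.irfft(padded, n=len(signal))
```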

There is another implication of feeding a neural network a matrix of complex numbers defined as above. Since the coordinates of points in the matrix denote frequencies, neurons in the first layer will receive data from specific frequency ranges, thereby implicitly encoding frequency information. Frequency (the equivalent of firing rate in neuroscience terms), together with the relative timings of spikes, is believed to be the basis of information coding in biological brains. SNN, which mimics natural networks more closely, makes explicit use of frequency to code information. A complex-valued neural network, on the other hand, uses frequency only implicitly and makes up for it by using amplitudes explicitly. Here, an analogy to the wave-particle duality in physics surfaces: the duality in AI runs along similar lines, frequency-based versus amplitude-based computing.

Conclusion

Spiking Neural Networks are an attempt to address real gaps in the current deep learning paradigm. Proponents of SNN are correct in arguing that behind the rich representational capacity of natural networks lie their wave-based dynamics. However, the application of complex numbers in deep learning can close these gaps without new hardware. There are a few papers by well-known figures in AI exploring the use of complex numbers in deep learning, though they have not gained enough attention. I think this is because they mostly focus on the practical aspects rather than the theoretical justification. In this article, I have put forward a theoretical case for complex numbers as an alternative to SNN.

References

[1] A. V. Oppenheim and J. S. Lim, “The importance of phase in signals,” in Proceedings of the IEEE, vol. 69, no. 5, pp. 529–541, May 1981, DOI: 10.1109/PROC.1981.12022.
[2] Morgan, M.J., Ross, J. & Hayes, A. The relative importance of local phase and local amplitude in patchwise image reconstruction. Biol. Cybern. 65, 113–119 (1991). https://doi.org/10.1007/BF00202386
[3] Deepa Kundur (2013). Retrieved from https://www.comm.utoronto.ca/~dkundur/course_info/signals/notes/Kundur_FourierMagPhase.pdf

Nariman Mammadli

Exploring the boundaries of artificial intelligence, with a special interest in its applications in cybersecurity. linkedin.com/in/mammadlinariman