The data center of tomorrow is made up of heterogeneous accelerators [Part 2]: Biologically Inspired

Xavier Vasques
10 min read · Nov 13, 2021
Photo by Annamária Borsos

Biologically inspired systems

DNA is also a source of inspiration. A team recently published in the journal Science a demonstration of storing digital data on a DNA molecule. They were able to store an operating system, a French film from 1895 (L’Arrivée d’un train à La Ciotat by Louis Lumière), a scientific article, a photo, a virus and a $50 gift card in DNA strands, and to retrieve the data without errors.

Indeed, a DNA molecule is by nature intended to store information. Genetic information is written in the four nitrogenous bases that make up a DNA molecule (A, C, T and G). Digital data can be transcribed into this four-letter code, and DNA sequencing then makes it possible to read the stored information back; the encoding is automated through software. A human DNA molecule contains about 3 billion nucleotides (nitrogenous bases), and one gram of DNA can store 215 petabytes of data: it would be possible to store all the data created by humans in a single room. In addition, DNA can theoretically keep data in perfect condition for an extremely long time. Under ideal conditions, it is estimated that DNA could still be deciphered after several million years thanks to “longevity genes”, and DNA can withstand the most extreme weather conditions. The main weak points today are the high cost and the processing times, which can be extremely long.
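To see why DNA lends itself to this, here is a minimal sketch of the transcription idea, assuming the simplest possible mapping of two bits per base. This is only an illustration: the actual scheme used in the Science study layers redundancy and error correction on top of such a mapping.

```python
# Minimal illustration: transcribe binary data into the DNA alphabet.
# Real encoding schemes add redundancy and error correction; this sketch
# only shows the core idea of packing 2 bits into each nitrogenous base.

BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Map every pair of bits to one nucleotide (2 bits per base)."""
    bitstring = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bitstring[i:i + 2]]
                   for i in range(0, len(bitstring), 2))

def decode(strand: str) -> bytes:
    """Read the stored information back, 4 bases per byte."""
    bitstring = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bitstring[i:i + 8], 2)
                 for i in range(0, len(bitstring), 8))

message = b"train at La Ciotat"
strand = encode(message)
assert decode(strand) == message   # the round trip is error-free
print(strand[:24])                 # first 24 bases of the synthetic strand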

The term AI as such appeared in 1956, when several American researchers, including John McCarthy, Marvin Minsky, Claude Shannon and Nathan Rochester of IBM, all pioneers in using computers for more than scientific calculation, met at Dartmouth College in the United States. Three years after the Dartmouth seminar, McCarthy and Minsky, the two fathers of AI, founded the AI lab at MIT. There was a lot of investment and too much ambition to imitate the human brain, and much of the hope of that era went unrealized; the promises were not kept. A more pragmatic approach appeared in the 1970s and 1980s, which saw the emergence of machine learning and the reappearance of neural networks in the late 1980s. This more pragmatic approach, together with the increase in computing power and the explosion of data, has made AI present in all areas today; it is a transversal subject. The massive use of AI still poses challenges, such as the need to label the data at our disposal: automation paradoxically requires a great deal of manual work, because AI needs to be taught, and that teaching is done by tens of thousands of workers around the world, which does not really look like what you might call a futuristic vision. Another challenge is the need for computing power. AI must be trained, and training is more and more greedy in terms of computation: the compute used in the largest training runs has been doubling every 3.5 months (10).

Source: AI and Compute, OpenAI, https://openai.com/blog/ai-and-compute/#fn1
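To make that rate concrete, a doubling every 3.5 months compounds to roughly an order of magnitude per year. A quick back-of-the-envelope check, where the 3.5-month figure is the only input taken from the OpenAI analysis cited above:

```python
# Compound growth implied by training compute doubling every 3.5 months.
DOUBLING_PERIOD_MONTHS = 3.5

per_year = 2 ** (12 / DOUBLING_PERIOD_MONTHS)          # doublings/year -> factor
print(f"growth per year:     ~{per_year:.1f}x")        # ~10.8x
print(f"growth over 5 years: ~{per_year ** 5:,.0f}x")  # ~145,000x
```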

Several approaches are currently in use or under consideration. Today, for example, as on the Summit supercomputer, certain workloads are offloaded to accelerators such as GPUs. There are others, such as FPGAs (Field-Programmable Gate Arrays, or “programmable logic networks”), which can be configured to realize whatever digital functions are desired; the advantage is that the same chip can be used in many different electronic systems.
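As an illustration of this offloading model (my own sketch, not code from the article), here is a minimal example in Python using PyTorch, assuming a CUDA-capable GPU is available: the host prepares the data, ships it to the accelerator, computes there, and copies the result back.

```python
# Sketch: offloading a matrix multiplication to a GPU accelerator.
# Falls back to the CPU if no CUDA device is present.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

# Move the operands to the accelerator, compute there, then copy back.
c = (a.to(device) @ b.to(device)).cpu()
print(f"computed a 4096x4096 matmul on: {device}")
```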

Progress in the field of neuroscience will allow the design of processors directly inspired by the brain. The way our brain transmits information is not binary. It is thanks to Santiago Ramón y Cajal (1852–1934), the Spanish histologist and neuroscientist who shared the 1906 Nobel Prize in Physiology or Medicine with Camillo Golgi, that we know the architecture of the nervous system better: neurons are cellular entities separated by fine spaces, the synapses, not fibers of an unbroken network (11). The axon of a neuron transmits the nerve impulse, the action potential, to target cells. The next step in developing new AI- and brain-inspired processors is to think differently about how we compute today. One of the major performance problems today is the movement of data between the components of the von Neumann architecture: processor, memory and storage. It is therefore imperative to add analog accelerators.

What dominates numerical workloads today, and deep learning in particular, is floating-point multiplication. One method envisaged as an effective means of gaining computational power is to reduce numerical precision, an approach also called approximate computing. For example, 16-bit precision engines are more than 4x smaller than 32-bit precision engines (1), a gain that improves both performance and energy efficiency. In simple terms, approximate computing trades numerical precision for computational efficiency. Certain conditions are nevertheless necessary, such as developing algorithmic improvements in parallel to guarantee iso-accuracy (1). IBM recently demonstrated the success of this approach with 8-bit floating-point numbers, using new techniques to maintain the accuracy of gradient calculations and weight updates during backpropagation (12) (13). Likewise, for inference with a trained deep learning model, using only integer arithmetic at 4-bit or 2-bit precision achieves accuracy comparable to full precision across a range of popular models and datasets (14). This progression will lead to a dramatic increase in computing capacity for deep learning algorithms over the next decade.
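To make the precision-versus-efficiency trade-off concrete, here is a small illustrative sketch (mine, not the method of (12)–(14)) that uniformly quantizes a synthetic weight tensor to various bit widths and measures the numerical error introduced:

```python
# Sketch of approximate computing: trading numerical precision for
# efficiency by quantizing 32-bit floating-point weights to low-bit
# integer grids, then measuring how much precision is actually given up.
import numpy as np

def quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Uniform symmetric quantization to a `bits`-bit signed integer grid."""
    levels = 2 ** (bits - 1) - 1               # e.g. 127 for 8-bit, 7 for 4-bit
    scale = np.abs(weights).max() / levels
    return np.round(weights / scale) * scale   # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=100_000).astype(np.float32)  # stand-in for a weight tensor

for bits in (16, 8, 4, 2):
    err = np.abs(quantize(w, bits) - w).mean()
    print(f"{bits:>2}-bit: mean absolute error = {err:.5f}")
```

The error grows as the bit width shrinks, which is why the cited work pairs low precision with algorithmic techniques that preserve end-to-end model accuracy.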

Analog accelerators are another way of avoiding the bottleneck of the von Neumann architecture (15) (16). The analog approach uses non-volatile, programmable resistive processing units (RPUs) to encode the weights of a neural network. Operations such as matrix-vector multiplication or element-wise matrix operations can then be performed in parallel and in constant time, without moving the weights (1). However, unlike digital solutions, analog AI is more sensitive to the properties of materials and intrinsically sensitive to noise and variability. These factors must be addressed through architectural solutions, new circuits and new algorithms. For example, analog non-volatile memories (NVMs) (17) can effectively speed up backpropagation algorithms. By combining long-term storage in phase-change memory (PCM) devices, quasi-linear updating of conventional CMOS capacitors and new techniques to eliminate device-to-device variability, significant results have begun to emerge for deep neural network training (18) (19) (20).

Research has also embarked on a quest to build a chip directly inspired by the brain (21). In an article published in Science (22), IBM and its university partners presented a processor, developed under the SyNAPSE program, made up of a million neurons. The chip consumes only 70 milliwatts and is capable of 46 billion synaptic operations per second per watt: literally a synaptic supercomputer that fits in a hand. We have moved from neuroscience to supercomputing, through a new computing architecture, a new programming language, algorithms and applications, to a new chip called TrueNorth (23). TrueNorth is a neuromorphic CMOS integrated circuit produced by IBM in 2014. It is a many-core processor network with 4096 cores, each holding 256 programmable simulated neurons, for a total of just over one million neurons (4096 × 256 = 1,048,576). Each neuron in turn has 256 programmable synapses that carry signals, so the total number of programmable synapses is just over 268 million (1,048,576 × 256 = 268,435,456). The chip has 5.4 billion transistors. Since memory, computation and communication are managed locally in each of the 4096 neurosynaptic cores, TrueNorth bypasses the von Neumann bottleneck and is very energy efficient, with a power density of 1/10,000th that of conventional microprocessors.
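To illustrate the in-place computation principle, here is a toy simulation of an analog matrix-vector multiplication: the weights stay put in device conductances, and the output is perturbed by programming variability and read noise. The noise figures are arbitrary assumptions for illustration, not measured RPU characteristics.

```python
# Toy simulation of an analog crossbar of resistive processing units (RPUs):
# the weight matrix is "stored" in device conductances, so a matrix-vector
# product happens in place and the weights never move. Noise/variability
# magnitudes below are assumed values, chosen only for illustration.
import numpy as np

rng = np.random.default_rng(42)

W = rng.normal(size=(256, 256))   # ideal neural-network weights
x = rng.normal(size=256)          # input activation vector

# Programming the conductances is imperfect: device-to-device variability.
G = W * (1 + rng.normal(scale=0.02, size=W.shape))

# Each analog read also adds noise on the output currents.
y_analog = G @ x + rng.normal(scale=0.01, size=256)
y_exact = W @ x

rel_err = np.linalg.norm(y_analog - y_exact) / np.linalg.norm(y_exact)
print(f"relative error of the analog MVM: {rel_err:.3%}")
```

This is exactly the trade-off described above: the multiplication is massively parallel and data movement disappears, but architecture and algorithms must absorb the resulting noise and variability.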

References

1. The Future of Computing: Bits + Neurons + Qubits. Dario Gil and William M. J. Green. arXiv:1911.08446 [physics.pop-ph].

2. ECRAM as Scalable Synaptic Cell for High-Speed, Low-Power Neuromorphic Computing. Jianshi Tang, Douglas Bishop, Seyoung Kim, Matt Copel, Tayfun Gokmen, Teodor Todorov, SangHoon Shin, Ko-Tao Lee, Paul Solomon, Kevin Chan, Wilfried Haensch, John Rozen. IEEE IEDM, 2018.

3. Neuromorphic computing using non-volatile memory. G. W. Burr, R. M. Shelby, A. Sebastian, S. Kim, S. Kim et al. Advances in Physics: X, Vol. 2, pp. 89–124, 2016.

4. TrueNorth: Accelerating From Zero to 64 Million Neurons in 10 Years. M. V. DeBole et al. Computer, Vol. 52, No. 5, pp. 20–29, May 2019.

5. A Symbolic Analysis of Relay and Switching Circuits. Claude E. Shannon. Massachusetts Institute of Technology, Dept. of Electrical Engineering, 1940.

6. A Mathematical Theory of Communication. Claude E. Shannon. Bell System Technical Journal, Vol. 27, pp. 379–423 and 623–656, 1948.

7. The Mathematical Theory of Communication. Claude E. Shannon, Warren Weaver. Urbana, Illinois: The University of Illinois Press, 1949.

8. Molecular digital data storage using DNA. Luis Ceze, Jeff Nivala, Karin Strauss. Nature Reviews Genetics, Vol. 20, 2019.

9. IBM Z mainframe capabilities. [Online] https://www.ibm.com/it-infrastructure/z/capabilities?cm_mmc=OSocial_Twitter-_-Systems_Systems+-+Cross-_-WW_WW-_-Zstats-87percent&linkId=72022252&fbclid=IwAR3gti8qo5F5APjqjMoKFS3LmS0WwiKqZ6fejABlK3w6t7QJLW69CP0ZpM8.

10. AI Doubling Its Compute Every 3.5 Months. Tony Peng. [Online] https://syncedreview.com/2018/05/17/ai-doubling-its-compute-every-3-5-months/.

11. The discovery of dendritic spines by Cajal. Rafael Yuste. Frontiers in Neuroanatomy, 2015.

12. Gradient-based learning applied to document recognition. Y. LeCun, L. Bottou, Y. Bengio and P. Haffner. Proceedings of the IEEE, Vol. 86, pp. 2278–2324, 1998.

13. Deep learning with limited numerical precision. S. Gupta, A. Agrawal, K. Gopalakrishnan and P. Narayanan. International Conference on Machine Learning, 2015.

14. PACT: Parameterized Clipping Activation for Quantized Neural Networks. Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan. arXiv:1805.06085v2 [cs.CV], 17 Jul 2018.

15. The next generation of deep learning hardware: Analog computing. W. Haensch, T. Gokmen and R. Puri. Proceedings of the IEEE, Vol. 107, pp. 108–122, 2018.

16. Equivalent-accuracy accelerated neural-network training using analog memory. S. Ambrogio, P. Narayanan, H. Tsai, R. Shelby, I. Boybat et al. Nature, Vol. 558, pp. 60–67, 2018.

17. Weight programming in DNN analog hardware accelerators in the presence of NVM variability. C. Mackin, H. Tsai, S. Ambrogio, P. Narayanan, A. Chen and G. W. Burr. Advanced Electronic Materials, Vol. 5, 1900026, 2019.

18. Neuromorphic computing using non-volatile memory. G. W. Burr, R. M. Shelby, A. Sebastian, S. Kim, S. Kim et al. Advances in Physics: X, Vol. 2, pp. 89–124, 2016.

19. Multilevel-Cell Phase-Change Memory: A Viable Technology. IEEE J. Emerging and Selected Topics in Circuits and Systems, Vol. 6 (1), pp. 87–100, 2016.

20. Recent Progress in Phase-Change Memory Technology. G. W. Burr, M. J. Brightsky, A. Sebastian, H.-Y. Cheng, J.-W. Wu, S. Kim, N. E. Sosa, N. Papandreou, H.-L. Lung, H. Pozidis, E. Eleftheriou, C. H. Lam. IEEE J. Emerging and Selected Topics in Circuits and Systems, Vol. 6 (2), pp. 146–162, 2016.

21. Neuromorphic computing with multi-memristive synapses. Irem Boybat, Manuel Le Gallo, S. R. Nandakumar, Timoleon Moraitis, Thomas Parnell, Tomas Tuma, Bipin Rajendran, Yusuf Leblebici, Abu Sebastian and Evangelos Eleftheriou. Nature Communications, Vol. 9, 2018.

22. A million spiking-neuron integrated circuit with scalable communication network and interface. Paul A. Merolla et al. Science, Vol. 345, Issue 6197, pp. 668–673, 2014.

23. TrueNorth: Accelerating From Zero to 64 Million Neurons in 10 Years. Michael V. DeBole et al. IEEE Computer, Vol. 52, pp. 20–29, 2019.

24. Simulating Physics with Computers. Richard Feynman. International Journal of Theoretical Physics, Vol. 21, Nos. 6/7, 1982.

25. IBM Opens Quantum Computation Center in New York; Brings World’s Largest Fleet of Quantum Computing Systems Online, Unveils New 53-Qubit Quantum System for Broad Use. Chris Nay. [Online] https://newsroom.ibm.com/2019-09-18-IBM-Opens-Quantum-Computation-Center-in-New-York-Brings-Worlds-Largest-Fleet-of-Quantum-Computing-Systems-Online-Unveils-New-53-Qubit-Quantum-System-for-Broad-Use.

26. Validating quantum computers using randomized model circuits. Andrew W. Cross, Lev S. Bishop, Sarah Sheldon, Paul D. Nation and Jay M. Gambetta. arXiv:1811.12926v2 [quant-ph], 2019.

27. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Abhinav Kandala, Antonio Mezzacapo, Kristan Temme, Maika Takita, Markus Brink, Jerry M. Chow and Jay M. Gambetta. Nature, Vol. 549, pp. 242–246, 2017.

28. Computational Investigations of the Lithium Superoxide Dimer Rearrangement on Noisy Quantum Devices. Qi Gao, Hajime Nakamura, Tanvi P. Gujarati, Gavin O. Jones, Julia E. Rice, Stephen P. Wood, Marco Pistoia, Jeannette M. Garcia, Naoki Yamamoto. arXiv:1906.10675 [quant-ph], 2019.

29. Quantum risk analysis. Stefan Woerner, Daniel J. Egger. npj Quantum Information, Vol. 5, 2019.

30. Quantum Generative Adversarial Networks for Learning and Loading Random Distributions. Christa Zoufal, Aurélien Lucchi, Stefan Woerner. arXiv:1904.00043 [quant-ph].

31. Amplitude estimation without phase estimation. Yohichi Suzuki, Shumpei Uno, Rudy Raymond, Tomoki Tanaka, Tamiya Onodera, Naoki Yamamoto. Quantum Information Processing, 19, 75, 2020.

32. Credit Risk Analysis using Quantum Computers. Daniel J. Egger, Ricardo García Gutiérrez, Jordi Cahué Mestre, Stefan Woerner. arXiv:1907.03044 [quant-ph].

33. Option Pricing using Quantum Computers. Nikitas Stamatopoulos, Daniel J. Egger, Yue Sun, Christa Zoufal, Raban Iten, Ning Shen, Stefan Woerner. arXiv:1905.02666 [quant-ph].

34. Improving Variational Quantum Optimization using CVaR. Panagiotis Kl. Barkoutsos, Giacomo Nannicini, Anton Robert, Ivano Tavernelli, Stefan Woerner. arXiv:1907.04769 [quant-ph].

35. Supervised learning with quantum-enhanced feature spaces. Vojtěch Havlíček, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow and Jay M. Gambetta. Nature, Vol. 567, pp. 209–212, 2019.

36. Analysis and synthesis of feature map for kernel-based quantum classifier. Yudai Suzuki, Hiroshi Yano, Qi Gao, Shumpei Uno, Tomoki Tanaka, Manato Akiyama, Naoki Yamamoto. arXiv:1906.10467 [quant-ph].

37. Quantum Chemistry Simulations of Dominant Products in Lithium-Sulfur Batteries. Julia E. Rice, Tanvi P. Gujarati, Tyler Y. Takeshita, Joe Latone, Mario Motta, Andreas Hintennach, Jeannette M. Garcia. arXiv:2001.01120 [physics.chem-ph], 2020.

38. [Online] https://www.research.ibm.com/frontiers/ibm-q.html.

39. [Online] https://news.exxonmobil.com/press-release/exxonmobil-and-ibm-advance-energy-sector-application-quantum-computing.


Xavier Vasques

CTO and Distinguished Data Scientist, IBM Technology France; Head of Clinical Neurosciences Research Laboratory, France