High Performance Computing

Introduction

Moore’s Law is a well-known prediction which states that the number of transistors per square millimetre doubles roughly every 18 months. This drives the rapid growth of available computational power, which is the basis of the increasing performance of IT systems. But what will happen after the end of Moore’s Law?
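To get a feel for this exponential growth, here is a quick illustrative calculation (added here, not part of the original text): doubling every 18 months over a decade means

\[
2^{120/18} = 2^{6.67} \approx 100,
\]

i.e. roughly a hundredfold increase in transistor density every ten years.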

Figure: Graphical illustration of Moore’s Law.

The impact on technology and AI

Silicon transistors started to change the world by moving computing power from big machines to desks, to laps, and to our pockets. Thanks to this transition, cheap computing power became available to the man on the street, which made it possible for everyone to develop applications and put their ideas into practice. It is impossible not to recognize the societal impact of this worldwide phenomenon. Startups need little more than ideas and creativity as fuel, social websites change the way we communicate with each other, and cloud-based services provide a wide range of software, virtual machines, and so on.

The technological impact is significant as well. Data analysis, machine learning, self-driving cars, image processing, language processing, climate modeling, developing new materials for batteries and superconductors, and improving drug design all rely on supercomputers with high computing performance. For instance, state-of-the-art machine learning algorithms mostly use deep neural networks. Training large neural networks on huge amounts of data requires an enormous number of computations, which is only feasible by harnessing the power of the best computational resources.
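To illustrate the scale, here is a rough back-of-envelope sketch in Python; the layer widths, training-set size and the simple FLOP-counting rule are all illustrative assumptions, not figures from this text:

    # Rough estimate of the arithmetic needed to train a fully connected network.
    # The network shape, data-set size and epoch count below are made-up numbers.
    layers = [784, 4096, 4096, 10]        # hypothetical layer widths
    examples = 1_000_000                  # hypothetical number of training examples
    epochs = 10

    # One multiply and one add per weight for the forward pass; the backward
    # pass is roughly twice as expensive, hence the factor of 3 (a common
    # rule of thumb, not an exact count).
    weights = sum(a * b for a, b in zip(layers, layers[1:]))
    flops_per_example = 3 * 2 * weights
    total_flops = flops_per_example * examples * epochs

    print(f"parameters: {weights:,}")
    print(f"training cost: about {total_flops:.1e} floating-point operations")

Even this modest toy setup works out to around 10^15 floating-point operations, which hints at why training modern models needs accelerators or supercomputers.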

Moore’s Law also has a synchronizing effect between manufacturers and customers: “One of the biggest benefits of Moore’s Law is as a coordination device,” says Neil Thompson, an assistant professor at MIT Sloan School. “I know that in two years we can count on this amount of power and that I can develop this functionality and if you are Intel you know that people are developing for that and that there’s going to be a market for a new chip.” Life in IT will be much more difficult after Moore’s Law, because this guaranteed, reliable improvement will stop. Server chip makers will therefore have to get creative and find alternative ways to move forward. But when will this golden era end? There are several obstacles to keeping up the pace: the cost of making smaller transistors, the energy efficiency of chips, and physical limits.

Physical obstacles

The main challenges are posed by the laws of physics. Two problems, heat and leakage, are the most relevant. To understand them better, we should first look at how a silicon transistor works.

Figure: Schematic of a silicon transistor.

The above picture shows the most important parts of a transistor. The whole transistor behaves as a switch and provides the 0s and 1s, the elements of the computer’s language. When there is a voltage difference between the gate and the body, a conducting channel appears between the source and the drain and lets current flow. The switch is on or off depending on whether the current flows or not. Unfortunately, when the drain and the source are too close, a quantum effect (tunnelling) induces a current even when this voltage is absent. This is called leakage.
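To see why shrinking the device makes leakage worse, here is a standard textbook estimate (added for illustration, not taken from the text above): the probability that an electron tunnels through a potential barrier of width $d$ falls off exponentially with the width,

\[
T \approx e^{-2\kappa d}, \qquad \kappa = \frac{\sqrt{2m(V_0 - E)}}{\hbar},
\]

where $m$ is the electron mass, $V_0$ the barrier height and $E$ the electron energy. Because of this exponential dependence, halving the source–drain distance can increase the tunnelling probability, and with it the leakage current, by orders of magnitude.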

The other problem when the number of transistors per square millimetre grows is the generated heat. Due to the small distances it is difficult to get rid of the thermal energy. The generated heat depends on the operating speed (clock frequency) of the processor, which has been stuck on a plateau since the mid-2000s. So the only remaining way to enhance computing speed is to make the transistors smaller.
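The connection between speed and heat can be made concrete with the standard expression for the dynamic power dissipated by switching CMOS logic (a textbook formula added here for illustration):

\[
P_{\mathrm{dyn}} \approx \alpha \, C \, V^{2} \, f,
\]

where $\alpha$ is the activity factor, $C$ the switched capacitance, $V$ the supply voltage and $f$ the clock frequency. Power grows linearly with frequency and quadratically with voltage, so once cooling and supply voltage hit their limits, clock rates stop rising and smaller transistors become the only lever left.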

Today the size of a transistor (the distance between the source and the drain) is 14 nm. It is already clear that Moore’s Law has slowed down, and the big chip manufacturers have started to tweak their designs to enhance performance. Intel introduced the tick-tock strategy: a tick is an attempt to create smaller transistors, while a tock tweaks the chip’s current design. To emphasize the slowdown: at Intel “the tick to 10nm that was meant to follow the tock of the Skylakes has slipped too; Intel has said such products will not now arrive until 2017. Analysts reckon that because of technological problems the company is now on a “tick-tock-tock” cycle”.

From an economic standpoint it is becoming too expensive to create smaller transistors, because revenue is not growing as fast as the cost of shrinking them.

Moore’s Law will not end abruptly but gradually, and this is already happening. So what are the alternative ways to increase computing power?

Possible solutions

We are now entering the post-silicon era. During this transition the chip makers try to tweak their designs. One 3D solution, the FinFET, modifies the structure of the gate (see the previous picture) in a way that reduces leakage. “Gate-all-around” transistors go further and wrap the gate entirely around the channel. According to Samsung, it might take gate-all-around transistors to build chips with features 5 nm apart, a stage that Samsung and other makers expect to be reached by the early 2020s. The drawback of this technique is that it adds extra steps to the manufacturing process.

In the long run, specialized hardware for different purposes can be the solution. Big companies have already started to design customized hardware. For example, Google created the TPU (Tensor Processing Unit) for deep neural networks. NVidia’s Jen-Hsun Huang says that it is now time to make chips customized for DNNs, and NVidia invests heavily in improving its GPGPU technology. GPUs are not the only option, however. “It is very expensive and challenging to build, maintain, and scale out your own training platform,” says Eric Chung, a researcher at Microsoft. Intel and Microsoft have started to work with FPGAs to meet the combined challenge of cost, performance and energy efficiency. Microsoft’s goal is to use FPGAs to enhance its search engine Bing. Intel spent nearly $17 billion to acquire the leading FPGA manufacturer Altera last year and is adapting its technology to data centers. An even more specialized piece of hardware is the VPU (Visual Processing Unit) by Movidius: its Myriad 2 chip can crunch huge amounts of visual information while using less than a watt of power.
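The appeal of such accelerators is that the heavy lifting in deep learning boils down to large matrix operations that can be offloaded wholesale. Here is a minimal sketch, assuming the JAX library is installed, of how the same high-level code gets dispatched to whatever backend (CPU, GPU or TPU) happens to be available:

    import jax
    import jax.numpy as jnp

    # List the devices JAX found; on a plain laptop this is just the CPU,
    # on a suitable machine it reports GPUs or TPU cores instead.
    print(jax.devices())

    @jax.jit  # compile the function for the available accelerator
    def dense_layer(x, w):
        # One fully connected layer: a big matrix multiply plus a nonlinearity.
        return jnp.tanh(x @ w)

    key = jax.random.PRNGKey(0)
    x = jax.random.normal(key, (1024, 1024))
    w = jax.random.normal(key, (1024, 1024))
    print(dense_layer(x, w).shape)  # (1024, 1024), computed on the accelerator

The point of the sketch is the division of labour: the programmer writes device-independent array code, and the specialized hardware underneath supplies the raw throughput.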

Beyond specialized hardware, new computing principles can help too: molecular computers and quantum computers, to mention just two. Neither of them can replace silicon, but they offer alternatives for special classes of problems; for instance, a quantum computer can find the minimum of complicated functions. These technologies are not yet reliable, and it will take a long time to get them working properly.

Michio Kaku has a nice talk about Moore’s Law: kakus