This article is pretty interesting from the author's perspective. Like a storyteller, he recounts many stories about the evolution of computer architecture. The narrative is amazing and well organized thanks to the author's tremendous writing; it kept things fresh for me.
To give a full summary from my own perspective, I will start from the very beginning of computer architecture. It was like the ancient age of humankind. One of the very first well-known CPU architectures was MIPS. You could name others, but MIPS is among the most popular and widely studied. Researchers kept making progress on MIPS, and then, boom: they introduced pipelining. As time flew by, researchers kept making leaps in computer architecture. Still, the cost of producing an IC chip remained high, and some smart people realized that emulation was needed in order to know the performance results in advance.
Emulation is a very useful technique. Even now, people still use it to solve problems such as early-stage development, legacy machine-code reuse, virtual machines, and cloud services. And how did emulation grow? That is another story. As I said, researchers used cycle-accurate full-system emulation to develop new architectures. However, the speed of emulation decreased as architectural complexity increased, and the time needed to implement complex system components grew very fast. Besides, emulation infrastructures were written in sequential C or C++, which made the emulated behavior differ from researchers' expectations.
Therefore, researchers started to build functional simulators and map most of the components onto the real machine. In QEMU, for example, the target machine code is translated into host binary code, using the power of the host machine to simulate the behavior of the guest system. This technique boosted simulation speed and made simulating individual modules simpler. However, it came with a cost: functional simulation makes the performance results imprecise.
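To make the translation idea concrete, here is a minimal sketch of the technique in Python. This is my own toy example, not QEMU's actual mechanism (QEMU's TCG emits real host machine code, not closures): each guest instruction of an invented two-operation ISA is translated once into a host-level callable, and execution then runs the translated code directly instead of decoding every instruction repeatedly.

```python
# Toy "guest ISA" with two instructions (ADD, MUL) over registers r0..r3.
# Illustrative sketch only; a real binary translator emits host machine code.

def translate(program):
    """Translate guest instructions into host callables once, up front."""
    ops = {"ADD": lambda a, b: a + b, "MUL": lambda a, b: a * b}
    translated = []
    for op, dst, src1, src2 in program:
        fn = ops[op]
        # Bind operands now, so executing the instruction is a plain call.
        translated.append(lambda regs, fn=fn, d=dst, s1=src1, s2=src2:
                          regs.__setitem__(d, fn(regs[s1], regs[s2])))
    return translated

def run(translated, regs):
    """Execute the pre-translated code on the host."""
    for instr in translated:
        instr(regs)
    return regs

program = [("ADD", 2, 0, 1), ("MUL", 3, 2, 2)]  # r2 = r0+r1; r3 = r2*r2
regs = run(translate(program), [3, 4, 0, 0])
print(regs)  # [3, 4, 7, 49]
```

The point of the sketch is the one-time translation step: the per-instruction decode cost is paid once, which is where the speed of functional simulation comes from.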
The trade-off between accuracy and speed eventually reached a compromise: researchers adopted a modular methodology. They focused accuracy on only the few modules that influence performance the most, and made those modules precise enough to build an acceptable simulator.
The discussion then went further, into whether absolute or relative accuracy is required. Features like multi-core processing, shared memory, and message passing may change the simulation results. In my own research on virtual platforms, I also discuss the trade-off between accuracy and simulation time, and, further, how an analytic model of the pipeline and a precise model of the cache can work together. Researchers hit the same problem a few years ago. The conclusion is that analytic models are not mutually exclusive with simulation; an analytic model can help simulation get better results.
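A small sketch of how the two can work together, with parameters I made up for illustration (none of the numbers come from the article): a precisely simulated direct-mapped cache produces a miss rate, and an analytic pipeline model turns that miss rate into a CPI estimate instead of simulating every pipeline stage cycle by cycle.

```python
# Hedged sketch: precise cache simulation feeding an analytic CPI model.
# Cache geometry, base CPI, reference rate, and penalty are illustrative.

def simulate_cache(addresses, num_lines=64, line_size=64):
    """Direct-mapped cache; returns the miss rate for an address trace."""
    tags = [None] * num_lines
    misses = 0
    for addr in addresses:
        block = addr // line_size
        index = block % num_lines
        if tags[index] != block:  # tag mismatch -> miss, fill the line
            tags[index] = block
            misses += 1
    return misses / len(addresses)

def analytic_cpi(miss_rate, base_cpi=1.0, mem_refs_per_instr=0.3,
                 miss_penalty=100):
    """Analytic model: CPI = base + refs/instr * miss_rate * penalty."""
    return base_cpi + mem_refs_per_instr * miss_rate * miss_penalty

# A sequential word-by-word trace: mostly hits within each 64-byte line.
trace = [i * 4 for i in range(4096)]
mr = simulate_cache(trace)
print(f"miss rate = {mr:.4f}, estimated CPI = {analytic_cpi(mr):.3f}")
```

Only the cache, the module assumed to dominate performance here, is simulated precisely; the pipeline is reduced to one formula, which is exactly the accuracy-versus-speed compromise described above.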
The second important part of this article is benchmarking. Let's talk about it in more detail.
Benchmarking is important: it tells how good a system is. A benchmark is like an exam for machines. Benchmarks help developers find performance issues; on the other hand, a benchmark cannot pinpoint a specific performance problem, because it is a common-case program that combines many different behaviors. This variety of behaviors makes performance issues hard to track down. Moreover, benchmarks have no parameters, so a developer can hardly diagnose a performance issue from the result of a fixed test set. Therefore, researchers developed two new approaches to solve these problems: micro-benchmarks and synthetic benchmarks.
A micro-benchmark is a small subset of a benchmark's code. A benchmark comes from a real-world application and therefore has many characteristics mixed together, and this diversity makes performance results ambiguous. So a micro-benchmark extracts just one characteristic of the benchmark. A single characteristic implies more controlled variables and fewer independent variables, which in turn means that profiling a micro-benchmark can easily pinpoint critical design points.
A synthetic benchmark is a different approach to the problem of mixed characteristics. It may be a combination of benchmarks, or a benchmark revised to expose adjustable parameters. Controlling those parameters generates workloads of different sizes, whose performance can then be analyzed. For example, a parallel image-processing algorithm behaves differently as the total number of image pixels changes. Developers adjust and clone the original test images into different sizes; this is the synthetic part. The new synthetic workload can then exercise the number of CPU cores, the cache line size, and so on. In conclusion, adjusting the parameters of a real benchmark, or combining benchmarks, really helps a lot when analyzing performance and computer architecture.
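The image-scaling example above can be sketched as follows. This is my own minimal illustration, with a toy blur kernel standing in for the real image-processing workload: the input is synthesized at several sizes, so workload size becomes an explicit knob.

```python
# Hedged synthetic-benchmark sketch: one real-style kernel, one input
# parameter (image size) swept over several values. All values illustrative.

def make_image(width, height):
    """Synthesize a grayscale image of the requested size."""
    return [[(x + y) % 256 for x in range(width)] for y in range(height)]

def blur(img):
    """A simple 3x3 box-blur pass, standing in for the real workload."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1)
                            for dx in (-1, 0, 1)) // 9
    return out

# The parameter sweep: one knob (image size), several workloads.
for size in (32, 64, 128):
    result = blur(make_image(size, size))
    print(size, len(result), len(result[0]))
```

Timing each sweep point (and varying thread count the same way) would then expose how performance scales with pixel count, which is exactly what the fixed original test set could not show.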
To sum up, computer architecture evaluation is challenging. There are many aspects to think about. The complexity of computer architecture makes the design space grow exponentially, and it also makes benchmarking performance very hard. It is impossible to declare a single best winner; every computer has its trade-offs. In simulation, one needs to trade off accuracy against speed. As for benchmarking, there is no one-size-fits-all solution; but one can approach one's aims by using micro-benchmarks to characterize behavior, or synthetic benchmarks to adjust workload parameters. And don't forget: in the end, a real benchmark is necessary to make the conclusion convincing.
Written by Medicine Yeh