On December 7, at its Advancing AI event in San Jose, California, AMD officially launched its flagship AI GPU accelerator, the Instinct MI300X, the world's first data center APU, the Instinct MI300A, and the Ryzen 8040 series APUs with upgraded XDNA AI NPUs.
The launch of these blockbuster products electrified the semiconductor industry, pushing AMD's share price up about 10 percent after the event. In particular, the two heavy hitters, the Instinct MI300X and MI300A, strike directly at the strategic heartland of Nvidia, the giant of the AI computing market, and may pose the biggest challenge yet to Nvidia's dominance in AI chips.
Is AI computing only a job for GPUs? Look at AMD EPYC, and it is clear the CPU can hold its own. AI is the next great era of global technological development and a new driving force transforming countless industries. Since the start of this year in particular, the runaway popularity of ChatGPT has made generative AI the crest of a new global wave of artificial intelligence.
Behind AI's upheaval of human productivity, computing power is the fuel and driving force, a resource as precious as oil.
AI workloads involve massive numbers of repeated operations, a natural match for the GPU's strength in large-scale parallel computing. As a result, Nvidia, the GPU giant, has become the leader of the AI era. But in any field, a single dominant company is not a healthy industrial structure: Nvidia's popular accelerator cards are hard to get and expensive, and many technology companies have suffered for it. So many firms have begun developing their own AI accelerator chips, or have set their sights on alternatives.
And AMD is undoubtedly the most anticipated challenger; in fact, judging by the computing performance of its most important products, AMD does not disappoint.
For example, the MI300X AI accelerator released this time offers 2.4 times the memory capacity of Nvidia's star H100 accelerator, 1.6 times its memory bandwidth, and 1.3 times its FP8/FP16 compute throughput (TFLOPS). In 1v1 comparisons, FlashAttention-2 training is 10% faster than the H100 on medium kernels and 20% faster on large kernels, while Llama 2 70B training is 20% faster on medium kernels and 10% faster on large kernels. And in an 8v8 server comparison, the MI300X runs Llama 2 70B 40% faster than the H100, and Bloom 176B 60% faster.
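The headline multiples above can be sanity-checked against the publicly listed specs of the two parts. A minimal sketch, assuming the commonly cited figures (MI300X: 192 GB HBM3 at 5.3 TB/s; H100 SXM: 80 GB HBM3 at 3.35 TB/s):

```python
# Illustrative ratio check using publicly listed specs (not AMD's own test data).
mi300x = {"hbm_gb": 192, "bw_tbps": 5.3}
h100 = {"hbm_gb": 80, "bw_tbps": 3.35}

capacity_ratio = mi300x["hbm_gb"] / h100["hbm_gb"]
bandwidth_ratio = mi300x["bw_tbps"] / h100["bw_tbps"]

print(f"memory capacity: {capacity_ratio:.1f}x")    # 2.4x
print(f"memory bandwidth: {bandwidth_ratio:.1f}x")  # 1.6x
```

Both ratios line up with AMD's stated 2.4x and 1.6x claims.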
The Instinct GPU AI acceleration series can be so amazing today, which is also the result of many years of iterative development of AMD.
Beyond the Instinct GPUs, the AMD EPYC processor is the other trump card AMD has long aimed at the enterprise market.
Speaking of which, a common misunderstanding needs addressing. As mentioned above, GPUs are indeed well suited to AI-accelerated computing, but that does not mean AI workloads need only GPUs; the CPU is just as important.
The GPUs serving AI computation live in data centers, and the "heart" of a data center is actually the CPU. Compared with GPUs, CPUs offer general-purpose computing, can operate independently, and enjoy a much richer software ecosystem. Put simply, a data center can run without GPUs, but AI computing cannot run without CPUs.
Moreover, the CPU itself can deliver powerful AI capability, and AMD's EPYC is a good example. In the demo area of this conference, AMD used the EPYC 9654 processor released last November to run the Llama 2 language model; it not only completed all kinds of AI processing quickly and smoothly, but also ran 36% faster than an Intel Xeon Platinum 8480 processor.
This shows that in some scenarios, CPUs alone can run generative AI models well. Compared with the high cost of GPU deployment, delivering compute through CPUs can be a more economical and practical option for the many enterprises that lack GPU resources.
On this front, AMD is arguably the best. According to the latest (62nd) TOP500 supercomputer ranking released in November, AMD platforms power 140 of the listed systems, up 39% year on year. Among them, Oak Ridge National Laboratory's Frontier supercomputer, built on AMD's 64-core EPYC 7A53 processors and Instinct MI250X GPU accelerators, again topped the list at 1.194 exaflops.
Frontier is not only first in performance but also remarkably energy efficient, drawing only 22,703 kW at its peak 1.194 exaflops, about 2,000 kW less than the second-ranked Aurora system at Argonne National Laboratory.
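From the two figures above, a rough efficiency number falls out directly. This is an illustrative back-of-the-envelope calculation from the article's own numbers, not an official Green500 figure:

```python
# Rough energy-efficiency estimate for Frontier from the figures quoted above.
rmax_exaflops = 1.194        # sustained HPL performance (exaflops)
power_kw = 22_703            # reported power draw (kW)

gflops = rmax_exaflops * 1e9  # 1 exaflop = 1e9 gigaflops
watts = power_kw * 1e3
print(f"{gflops / watts:.1f} GFLOPS/W")  # ≈ 52.6 GFLOPS/W
```

Roughly 52-53 GFLOPS per watt, which is why Frontier also places near the top of the efficiency-focused Green500 list mentioned next.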
In addition, according to the latest Green500 list, AMD supports 8 of the 10 most energy-efficient supercomputers in the world.
Today, AMD EPYC processors have become the solution of choice for many of the world's most innovative, energy-efficient, and fastest supercomputers. Even in the face of explosive growth in AI-accelerated computing demand, they show excellent efficiency at scale. All of this recalls EPYC's thunderous debut in 2017.
Behind EPYC's rise: when the AMD EPYC processor launched in 2017, the data center market was dominated by Intel's x86 processors. Server makers had little choice but to follow in the giant's footsteps, with little room to shape server architecture design themselves, and no option but to pay whatever Intel asked.
While Intel lay back on its pile of banknotes, AMD abruptly returned to the server market in 2017 with the EPYC 7001 series, winning acclaim across the industry.
AMD EPYC impressed from its debut, topping out at a ferocious 32 cores and 64 threads. Although priced similarly to Xeon, it delivered more than 30% higher performance than the Xeon of the day, putting real pressure on its competitor. HPE's ProLiant DL385 server with dual EPYC 7601 processors promptly broke SPEC CPU 2017 and SPEC CPU 2006 world records, a sign of the fresh air EPYC brought to the industry.
Looking back at the remarkable development of AMD EPYC processors over the years, CTOnews.com sees three key points:
The first is that aggressive "spec stacking" brings enormous computing power: the best configurations and the newest technologies are deployed without hesitation, so each generation achieves the highest compute density and strongest performance while maintaining top energy efficiency.
The second is that AMD segments its product line finely enough to meet the needs of different markets and scenarios.
The third is excellent value for money.
These three points are the secret of AMD's comeback, as the following review should make clear.
For example, in 2019 AMD released the second-generation EPYC 7002 series, codenamed "Rome". It was not only the industry's first server chip built on a 7nm process, but also upgraded to the second-generation Zen architecture. Because the 7nm cores are smaller, AMD packed up to twice as many cores as the previous generation into the 7002 series while sustaining higher clocks: up to 64 cores, 128 PCIe 4.0 lanes, a boost clock of 3.4GHz, and a TDP of just 225W. The flagship EPYC 7742 outperformed Intel's then-flagship Xeon Platinum 8280L by up to 97%.
AMD's pursuit of leading-edge technology and innovation did not stop there. A key innovation in the EPYC Milan-X (7003X) series, unveiled at the end of 2021, was the debut of 3D V-Cache technology.
Simply put, 3D V-Cache stacks SRAM dies directly on top of the CPU and moves data through through-silicon vias (TSVs), putting cache and CPU "face to face". The transfer speed is easy to imagine, and bandwidth and cache capacity improve dramatically: that generation's flagship, the EPYC 7773X, reached a staggering 768MB of cache.
Then in November 2022, AMD's latest fourth-generation EPYC processor, the 9004 series codenamed "Genoa", was officially released.
An aside: between EPYC's 2017 launch and Genoa's release, AMD steadily ate into Intel's market share. According to research firm IDC at the time, AMD's share of x86 cloud-service chips grew from essentially zero in 2016 to about 29% in 2021.
Look at the EPYC 9004 series processors: a leading 5nm process and the Zen 4 architecture, up to 96 cores and 192 threads, a 4.4GHz boost clock, up to 6TB of DDR5 memory per socket, 128 PCIe 5.0 lanes, up to 384MB of L3 cache, CXL 1.1+ memory expansion, and an expanded AMD Infinity Guard security suite with triple the number of encryption keys.
With all these innovations packed into the EPYC 9004 series, Intel delayed the release of its fourth-generation Xeon Scalable processors until January this year. That part was Intel's first server processor built on a chiplet design, a promising technique AMD had already deployed back in the first-generation EPYC.
On other specs, the fourth-generation Xeon tops out at 60 cores on the Intel 7 process (formerly 10nm), with up to 4TB of DDR5 per socket, 80 PCIe 5.0 lanes, 112.5MB of L3 cache, and a peak clock of 4.2GHz, essentially outclassed across the board by the EPYC 9004 series.
Yet at the same time, its prices run much higher than AMD's. The 56-core Xeon Max 9480 (US$12,980) costs more than the 96-core EPYC 9654 (US$11,805), while the 48-core EPYC 9454 (US$5,225) is nearly half the price of the 48-core Xeon Max 9468 (US$9,900).
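The pricing gap is easiest to see as cost per core. A minimal sketch using only the list prices and core counts quoted above (an illustrative metric, not a full performance-per-dollar analysis):

```python
# Cost-per-core comparison from the list prices quoted in the text.
chips = {
    "EPYC 9654":     {"cores": 96, "usd": 11_805},
    "Xeon Max 9480": {"cores": 56, "usd": 12_980},
    "EPYC 9454":     {"cores": 48, "usd": 5_225},
    "Xeon Max 9468": {"cores": 48, "usd": 9_900},
}
for name, c in chips.items():
    print(f"{name}: ${c['usd'] / c['cores']:.0f} per core")
```

By this crude measure the EPYC parts come in at roughly half the per-core cost of their Intel counterparts, which sets up the performance-per-dollar comparison that follows.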
In a head-to-head, AMD's fourth-generation flagship EPYC 9654 leads Intel's flagship Xeon Platinum 8490H in the cloud-services benchmark (2P SPECrate 2017_int_base), while delivering 1.7 to 1.9 times the enterprise computing performance, 1.8 times the energy efficiency, and up to 2.58 times the performance per dollar.
In the PassMark rankings, the EPYC 9654 first took the top spot on January 20 this year. Checking the latest list at the time of writing, the EPYC 9654 is still the number one enterprise processor, and AMD dominates the chart.
The "Genoa" 9004 series has also drawn responses from major technology companies since release. Amazon's AWS, for example, launched its M7a general-purpose compute instances based on "Genoa", with performance up 50% over the previous generation, and companies such as ASUS, Tencent Cloud, and Lenovo have rolled out server solutions built on fourth-generation EPYC processors.
The fourth generation of EPYC also embodies AMD's strategy of carefully segmenting the product line for different business scenarios. In June this year, AMD launched both the Genoa-X series and the EPYC 97X4 series ("Bergamo") processors aimed at the cloud-native market.
EPYC Genoa-X succeeds the earlier Milan-X series. This time, with 3D V-Cache, AMD stacks 64MB of 3D cache on each CCD on top of the CCD's native 32MB. Since the 9004 series carries up to 12 CCDs, the L3 cache can reach a staggering 1152MB, the first time a single CPU's cache capacity has broken through 1GB!
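The arithmetic behind that ">1GB" claim is straightforward, multiplying the per-CCD totals given above across the full CCD count:

```python
# L3 cache arithmetic for Genoa-X, per the figures in the text:
# each CCD = 32 MB native L3 + 64 MB stacked 3D V-Cache, up to 12 CCDs.
native_mb, stacked_mb, ccds = 32, 64, 12
total_l3_mb = (native_mb + stacked_mb) * ccds
print(f"{total_l3_mb} MB total L3")  # 1152 MB
assert total_l3_mb > 1024            # i.e. more than 1 GB on a single CPU
```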
At the same time, Genoa-X's base clocks are higher than earlier 9004-series parts, which, combined with the larger cache, pushes maximum power consumption to 400W. The performance payoff, however, is equally obvious: domestic outlet MC Lab previously tested the Genoa-X flagship, the EPYC 9684X, and found it leading earlier parts such as the EPYC 9654 and EPYC 9554 across a range of benchmarks including SPECrate 2017, UnixBench Dhrystone 2, and Whetstone.
(Photo: MC Lab) The EPYC 97X4 series, codenamed Bergamo, targets cloud-native scenarios. Cloud providers care more about core counts and data bandwidth, and need an efficient, agile, scalable computing environment. The 97X4 series therefore adopts the streamlined Zen 4c core, which trims the L3 cache from Zen 4's 4MB per core to 2MB per core but pushes the core count to 128, the highest density in the industry. Beyond that, Zen 4c matches Zen 4 exactly in architecture design, process, instruction set, and IPC, retaining all the flagship features.
According to reports at the time from overseas outlet HardwareTimes, the series flagship EPYC 9754 scored 221,018 points in V-Ray 5 in a 2S configuration, 2.4 times the competing Xeon Platinum 8490H.
Meanwhile, in cloud-computing performance comparisons against Intel's strongest Platinum 8490H and 8480+, the EPYC 9754's lead ranges from 1.49 times up to 2.65 times.
MC Lab, mentioned earlier, also ran its own longitudinal tests: in a dual-socket system, the EPYC 9754 showed clear gains over AMD's own EPYC 9654, EPYC 9554, and other parts in SPECrate 2017, NAMD, OpenSSL, UnixBench Dhrystone 2 and Whetstone, Sysbench CPU, and other tests, with improvements of up to 23.5%.
(Photo: MC Lab) And that is not all. In September this year, AMD launched the EPYC 8004 series ("Siena") for intelligent-edge applications such as retail, manufacturing, and telecommunications, as well as cloud-services scenarios, further rounding out the fourth-generation EPYC family.
The 8004 series also uses Zen 4c cores and introduces the new SP6 socket with faster memory and I/O: up to 64 cores and 128 threads, six channels of DDR5 supporting up to 1.152TB, and 96 PCIe 5.0 lanes, all at a default TDP of just 200W. That combination of performance and energy efficiency suits all kinds of edge infrastructure where space and power are limited.
In video-encoding workloads, the EPYC 8534P delivers a leading total of frames per hour per system watt, and in IoT edge-gateway workloads, servers built on the 8-core EPYC 8024P lead in total throughput per 8kW rack.
After the release of the EPYC 8004 series, many OEMs announced systems and solutions that take full advantage of it, including Dell Technologies' PowerEdge C6615 server, Ericsson's Cloud RAN compute acceleration solution, and Microsoft Azure cloud services.
Having covered all this, you can see why EPYC has been able to break through in the enterprise market ever since its birth: AMD has held firmly to three keys. They are the performance of high core counts, high clocks, and big caches; the excellent value for money that enterprises and cloud providers care about; and a strategy of continually extending into market segments to provide optimal solutions for different workload scenarios.
Years of continuous iteration and innovation have made EPYC's market position ever more solid and gradually built a stronger hardware and software ecosystem. AMD has established broad partnerships across operating systems, security, infrastructure, AI, databases, high-performance computing, and other fields, continually delivering on its promises to the market and to customers.
In conclusion, AMD CEO Lisa Su said at the Advancing AI event that the total market for AI chips could climb to $400 billion over the next four years, more than two and a half times the $150 billion AMD estimated a year ago.
The wave of generative AI is surely the key reason AMD has grown more optimistic about AI's future, because it has let ordinary consumers feel, for the first time, AI's power to change the world.
We believe that in the coming era of exploding computing demand led by generative AI, the CPU's importance will not diminish but grow, proving its value in ever more scenarios that involve AI.
AMD is ready for this, with EPYC CPUs and Instinct accelerators as its two trump cards. Across the semiconductor market, few players are as well-rounded as AMD, flourishing in CPUs, GPUs, and even FPGAs and a range of adaptive SoCs. EPYC in particular, through four straight generations of evolution, has demonstrated the industry's highest compute density with excellent performance and efficiency, combining high core counts, huge caches, high clocks, and rich features with a very strong performance-to-price ratio, and has gradually become a first choice for data center customers. All of this will help AMD unleash more energy in the AI era.
Perhaps in the future, "AMD YES!" will no longer be just a meme among digital enthusiasts and consumers, but recognition from the whole industry of AMD's AI and computing power.