CPUs Can Run Large Models Too: Intel Launches the 5th Gen Xeon Scalable Processor

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)12/24 Report--

Large language models (LLMs) have shown excellent performance and great potential in many fields. To bring those capabilities fully to bear, however, we need strong computing infrastructure, and the chip is the key.

After a long wait, the 5th Gen Intel® Xeon® Scalable processor is finally here!

Summed up in one sentence: its focus on AI keeps getting stronger.

Take large-model training and inference as an example:

Compared with the fourth generation, training performance improves by up to 29%, and inference performance by up to 42%.

Compared with the third generation, AI training and inference performance is up to 14 times higher.

What does that mean in practice?

Feed a model with up to 20 billion parameters to the 5th Gen Xeon® Scalable processor, and latency can be as low as 100 ms!

In other words, running large models on a CPU is now a genuinely more attractive option.
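One rough way to sanity-check a latency figure like this is a memory-bandwidth bound: when generating text one token at a time, every weight is read from memory roughly once per token, so the time per token cannot be lower than model size divided by memory bandwidth. The sketch below is a back-of-envelope estimate, not Intel's methodology. The function name, the 300 GB/s sustained-bandwidth value, and the reading of the 100 ms figure as per-token latency are all assumptions made for illustration.

```python
def min_token_latency_ms(params_billion: float,
                         bytes_per_param: float,
                         bandwidth_gb_s: float) -> float:
    """Lower bound on per-token latency for bandwidth-bound decoding.

    Assumes every weight is read once per generated token and ignores
    compute time, caches, and the KV cache (simplifying assumptions).
    """
    model_gb = params_billion * bytes_per_param  # 1e9 params * bytes = GB
    return model_gb / bandwidth_gb_s * 1000.0


# Hypothetical inputs: a 20B-parameter model and 300 GB/s of sustained
# memory bandwidth (an assumed value, not a measured one).
for label, bytes_per_param in (("BF16", 2.0), ("INT8", 1.0)):
    bound = min_token_latency_ms(20, bytes_per_param, 300.0)
    print(f"{label}: at least {bound:.1f} ms per token")
```

Under these assumptions, the precision of the weights matters as much as the hardware: halving the bytes per parameter halves the bound.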

And this was only part of Intel's launch event, which also included Core™ Ultra, a chip that breaks with long-standing tradition and has been described as the most significant architectural shift in four decades.

It brings AI acceleration to consumer PCs to speed up local AI inference.

In addition, Intel has long been embedded in practical AI applications across industries, including databases, scientific computing, generative AI, machine learning, and cloud services. With the arrival of the 5th Gen Xeon® Scalable processor and its built-in features such as Intel® AMX and Intel® SGX/TDX, these applications become more cost-effective and efficient.

All in all, AI was the thread running through Intel's entire launch.

## Intel's latest processor: more power for AI

Let's take a closer look at the details disclosed about the 5th Gen Xeon® Scalable processor.

On the performance side, Intel has improved several specifications:

The number of CPU cores has increased to 64, with higher performance per core, and every core has AI acceleration built in.

New I/O technologies (CXL, PCIe 5.0) and faster UPI

Memory speed increased from 4800 MT/s to 5600 MT/s

Compared against the previous two generations of Intel products, the performance gains are as follows:

Compared with the previous generation, average performance at the same thermal design power (TDP) improves by 21%; compared with the third generation, average performance improves by 87%.

Compared with the previous generation, memory bandwidth increases by as much as 16%, and L3 cache capacity nearly triples.

Clearly, the 5th Gen Xeon® Scalable processor improves considerably on its predecessors in both specifications and performance.
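The bandwidth figure can be checked with simple arithmetic from the memory speeds quoted above. The sketch below assumes eight DDR5 channels per socket and 8 bytes per transfer per channel. Neither value is stated in the article, and the function name is illustrative.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s: int,
                        channels: int = 8,          # assumed channel count
                        bytes_per_transfer: int = 8  # assumed 64-bit data path
                        ) -> float:
    """Theoretical peak memory bandwidth per socket in GB/s."""
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


old = peak_bandwidth_gb_s(4800)
new = peak_bandwidth_gb_s(5600)
gain_percent = (new / old - 1) * 100

print(f"4800 MT/s: {old:.1f} GB/s peak")
print(f"5600 MT/s: {new:.1f} GB/s peak")
print(f"gain: {gain_percent:.1f}%")
```

The relative gain depends only on the ratio 5600/4800, about 16.7%, so it matches the "as much as 16%" claim whatever channel count is assumed.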

But Intel did more than disclose specifications: it also showed real-world results from deployments of the 5th Gen Xeon® Scalable processor.

For large-model inference, for example, JD Cloud demonstrated on site the capabilities of its new generation of self-developed servers equipped with the 5th Gen Xeon® Scalable processor.

Every result showed a performance gain of more than 20%!

Specifically, compared with its previous generation of self-developed servers, JD Cloud reports the following:

Overall system performance rose to 123% of the previous generation's.

AI computer vision inference performance rose to 138%.

Llama 2 inference performance rose to 151%.

This shows once again that running large models on the 5th Gen Xeon® is an increasingly attractive choice.

Beyond large models, there are similar measured results in other AI-related areas, such as overall compute performance, memory bandwidth, and video processing.

These results come from Volcano Engine, which uses the 5th Gen Intel® Xeon® Scalable processor.

Its newly upgraded third-generation elastic compute instances deliver 39% more overall compute performance, and application performance improves by up to 43%.

On top of the performance gains, Volcano Engine says its tidal resource pooling capability has let it build an elastic resource pool of one million cores, offering pay-as-you-go usage at a cost close to that of a monthly subscription, which lowers the cost of moving to the cloud even further.

This is thanks to an average 10x improvement in performance per watt when using the accelerators built into the 5th Gen Xeon® Scalable processor, along with workload-optimized, energy-efficient SKUs that draw as little as 105 W.

This is cost reduction and efficiency improvement in a very practical sense.

On cloud computing and security, another major Chinese vendor, Alibaba Cloud, shared its measured results.

Using the 5th Gen Intel® Xeon® Scalable processor with its built-in Intel® AMX and Intel® TDX, Alibaba Cloud developed what it calls an innovative practice of "generative AI model and data protection", significantly improving the security and AI performance of its 8th-generation ECS instances while keeping instance prices unchanged.

This includes a 25% improvement in inference performance, a 20% improvement in QAT encryption and decryption performance, a 25% improvement in database performance, and a 15% improvement in audio and video performance.

It is worth mentioning that the built-in Intel® SGX and TDX provide enterprises with stronger and easier-to-use application isolation and virtual machine (VM) isolation and confidentiality, respectively, giving existing applications an easier path to a trusted execution environment.

The 5th Gen Intel® Xeon® Scalable processor is also software- and pin-compatible with the previous generation, which greatly reduces testing and validation work.

Overall, the 5th Gen Xeon® Scalable processor is a substantial release with outstanding performance, and it reflects the importance Intel has long attached to putting AI into real-world deployment.

## Behind it: a history of putting AI into practice

In fact, as server and workstation chips, Intel® Xeon® Scalable processors have used the vector compute capability of Intel® AVX-512 to accelerate AI since the first generation in 2017. The introduction of Deep Learning Boost (DL Boost) in the second-generation Xeon® Scalable processors in 2018 made the line synonymous with "running AI on CPU". Across the third to fifth generations, from the addition of BF16 to the arrival of Intel® AMX, Intel has kept working to make full use of CPU resources, so that each processor generation can support real-world AI deployment across industries.
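On Linux, the instruction-set extensions named above appear as feature flags in `/proc/cpuinfo`, which gives a quick way to see which generation of AI acceleration a given machine offers. The following is a minimal sketch; the helper names are hypothetical, and the flag names are the ones the Linux kernel reports on x86.

```python
# Linux /proc/cpuinfo flag names mapped to the features discussed above.
AI_FLAGS = {
    "avx512f": "AVX-512 Foundation",
    "avx512_vnni": "DL Boost (VNNI, INT8)",
    "avx512_bf16": "AVX-512 BF16",
    "amx_tile": "AMX tile registers",
    "amx_int8": "AMX INT8",
    "amx_bf16": "AMX BF16",
}


def detect_ai_features(cpuinfo_text: str) -> dict[str, bool]:
    """Report which AI-related flags appear in the first 'flags' line."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
            break  # every logical CPU repeats the same flags line
    return {name: name in flags for name in AI_FLAGS}


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Return the file contents, or an empty string on non-Linux systems."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        return ""


for flag, present in detect_ai_features(read_cpuinfo()).items():
    print(f"{AI_FLAGS[flag]:<24} {'yes' if present else 'no'}")
```

On machines without these flags, or on non-x86 or non-Linux systems, every entry simply reports "no".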

It started in traditional industries.

For example, second-generation Xeon® processors were deployed in intelligent manufacturing, helping enterprises handle massive real-time data processing, improve production-line efficiency, and achieve visible capacity expansion.

Subsequently, Xeon® Scalable processors began to show what they could do in the world of large models.

In the wave of protein structure prediction triggered by AlphaFold2, the third- and fourth-generation Xeon® processors kept improving end-to-end throughput, delivering an acceleration solution more cost-effective than GPUs and directly lowering the barrier to entry for AI for Science.

This relied on Intel® AMX, an innovative AI acceleration engine for deep learning that has been built into the CPU since the fourth generation. As a matrix-oriented accelerator, it significantly speeds up deep learning inference and training on CPU platforms, improves overall AI performance, and supports low-precision data types such as INT8 and BF16.
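To make the BF16 data type concrete: a BF16 value keeps the sign bit, all 8 exponent bits, and the top 7 mantissa bits of a 32-bit float, so it covers the same range with less precision and half the storage. The sketch below emulates the conversion in plain Python with round-to-nearest-even. It illustrates the number format only, not how AMX hardware is programmed, and the function names are hypothetical.

```python
import struct


def float_to_bf16_bits(value: float) -> int:
    """Return the 16-bit BF16 pattern for a float (round-to-nearest-even).

    NaN and overflow handling are omitted to keep the sketch short.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Add half of the dropped range, plus the lowest kept bit so ties go to even.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF


def bf16_bits_to_float(pattern: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", pattern << 16))[0]


for value in (1.0, 3.14159, 0.001):
    pattern = float_to_bf16_bits(value)
    print(f"{value!r} -> 0x{pattern:04X} -> {bf16_bits_to_float(pattern)!r}")
```

Values such as 1.0 survive the round trip exactly, while values that need more than 8 significant bits are rounded; this precision loss is what low-precision inference trades for memory and throughput.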

At the same time, OCR in the large-model era has been given new life by the fourth-generation Xeon® Scalable processor, with much higher accuracy and lower response latency.

Similarly, not long ago, thanks to NLP optimizations on the fourth-generation Xeon® Scalable processor, a large language model specialized for the medical industry was deployed in medical institutions at lower cost.

As AI technology reaches deeper into every industry, the Xeon® Scalable processor shows that the CPU-based approach it represents can make a difference, enabling many AI applications to flourish on CPU platforms with broader deployment, easier access, and a lower barrier to adoption.

The release of the 5th Gen Xeon® Scalable processor takes this process a step further.

Of course, behind this achievement lies both a real demand for "running AI on CPU" and the CPU's own substantial value and advantages.

First, the demand: whether it is traditional enterprises pursuing intelligent transformation or fast-growing fields such as AI for Science and generative AI, all need strong compute to drive them.

But the reality is that dedicated accelerator chips are in short supply, hard to procure, and expensive, so they are far from widely available.

So some people naturally turn their attention to CPUs:

If the most readily available hardware can be used directly, wouldn't that achieve twice the result with half the effort?

This leads to the value and advantages of the CPU.

Take generative AI, the hot topic of the moment: to bring this capability into production at scale, costs must be controlled as much as possible.

Compared with training, AI inference needs fewer compute resources and can be handed over to the CPU, with not only lower latency but also higher energy efficiency.

For industries and businesses whose inference workloads are not especially heavy, choosing CPUs is clearly more cost-effective.

In addition, deploying directly on CPUs lets enterprises make full use of their existing IT infrastructure and avoid the deployment difficulties of heterogeneous platforms.

From all this we can see that bringing AI acceleration into a traditional architecture is the CPU's new role in this era.

What Intel is doing is working to help users unlock that value.

## Covering the entire AI pipeline, and not just with CPUs

Finally, let's return to today's protagonist: the 5th Gen Intel® Xeon® Scalable processor.

To be honest, compared with dedicated GPUs or AI accelerator chips, it may not be as flashy, but it is accessible and easy to use (it works out of the box, and its software and ecosystem keep maturing).

More noteworthy still, even when dedicated accelerators are present, the CPU can be part of the AI pipeline, from data preprocessing through model development and optimization to deployment and use.

In data preprocessing in particular, it plays the leading role.

Whether data sets are measured in gigabytes, terabytes, or more, servers based on Xeon® Scalable processors can process and analyze them efficiently by supporting more memory and reducing I/O operations, saving time on the most tedious and time-consuming task in AI development.

Given all this, it is fair to say that when Intel talks about AI, it has more ground to cover.

Add its portfolio of GPUs and dedicated AI accelerator chips, and it has more options in its arsenal and broader coverage.

All of this points to Intel's determination to accelerate AI across the board.

That is, using a cost-effective product portfolio to quickly meet the needs of AI deployment across industries.

The era of real-world AI deployment has begun. Has Intel's moment arrived?
