The rise of intelligent applications built on large models in 2023 has set off an explosion in large-model engineering practice. As large models move further toward deployment, AI Infra, the middle-layer infrastructure connecting hardware to upper-layer applications, is undoubtedly a key link.
Recently, at the AICC 2023 Artificial Intelligence Computing Conference, Fang Yuyang, editor-in-chief of QbitAI, moderated the roundtable forum "AI Infra: Digging Tools in the Age of Large Models," exchanging views with Zhu Hong, AI application architect at Inspur Information; Liang Shuang, Vice President of Luchen Technology; Liu Daoquan, founder and CEO of Shizhi AI (wisemodel); and Li Feng, Vice President of Business at Wuwen Xinqiong, on key issues of the large-model era: the concept of AI Infra, the state of the industry, development challenges, and diversified computing power.
The participants argued that AI Infra is the foundation supporting AI and large models. Large-model training and inference are complex systems engineering, which must be optimized across hardware, software, training, inference, and other dimensions to address the challenges of computing cost, training barriers, and diverse computing power, while carrying the open-source ethos forward to drive the rapid development of artificial intelligence.
The following is a transcript of the roundtable forum:
Moderator: At present, the concept of AI Infra is not unified. Some define it as all the hardware infrastructure AI needs; others emphasize that it is the software stack between the computing layer and the application layer. How do you define AI Infra? What role does it play in the current AI industry?
Zhu Hong: From an industry perspective, AI Infra is usually regarded as the software layer on top of the hardware. From Inspur Information's perspective, everything below the application layer, hardware and software alike, can be counted as AI Infra; it can also be called an AI middle platform or AI platform.
AI Infra plays a connecting role across the entire AI industry: AI is driven by computing power, and how well that computing power performs depends on the AI Infra layer.
Liang Shuang: I think AI Infra includes both hardware and software. Large models generally require distributed training across thousands of accelerator cards. If users adopt off-the-shelf software and hardware as-is, out-of-memory failures can occur at massive parameter counts, and it is hard to realize the hardware's full utilization. Through data parallelism, tensor (model) parallelism, pipeline parallelism, and similar techniques, AI Infra gives customers stronger effective computing power and lets distributed hardware be used efficiently when training large models (see the sketch below). At the same time, training a large model can cost tens of millions; our goal for AI Infra is to cut training cost in half and training time in half, which is exactly what users care about.
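To make those parallelism strategies concrete, here is a minimal NumPy sketch, not Luchen Technology's actual framework code, with all shapes and parameter counts assumed for illustration. It shows why a single card cannot hold a large model's weights, and how tensor parallelism shards a linear layer across devices while preserving the result:

```python
import numpy as np

# Why distributed training is unavoidable: a 175B-parameter model in fp16
# needs ~350 GB just for weights, far beyond a single 80 GB accelerator.
params = 175e9
bytes_per_param_fp16 = 2
print(f"weights alone: {params * bytes_per_param_fp16 / 1e9:.0f} GB")

# Tensor (model) parallelism in miniature: split a linear layer's weight
# matrix column-wise across "devices", compute partial outputs locally,
# then concatenate -- the result matches the unsharded computation.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))      # a batch of activations
w = rng.standard_normal((8, 6))      # full weight matrix
shards = np.split(w, 2, axis=1)      # each "device" holds half the columns
partial = [x @ s for s in shards]    # local matmul on each device
y = np.concatenate(partial, axis=1)  # "all-gather" of partial outputs
assert np.allclose(y, x @ w)         # identical to the single-device result
```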
Liu Daoquan: AI Infra can also be scoped more broadly: beyond the software and hardware systems just mentioned, it includes networking, storage, and other hardware and software. Large-model training and inference are systems engineering, which must be optimized at the compute, network, and storage levels to deliver full performance and efficiency.
From the perspective of the large-model community, we are now gathering more models and datasets, and will also gather open-source tools for application development, model training, deployment, and inference, making them easier for everyone to obtain and use and improving productivity.
Li Feng: In our view, AI Infra is the base supporting AI technology represented by large models, covering hardware, software, tool chains, and optimization methods; it is an end-to-end solution. Wuwen Xinqiong was founded only half a year ago and made few public appearances before that; many in the industry remember our team by "M×N." On AI Infra, we focus on integrated software-hardware solutions, building three stages of "M×N" middle-layer products: from algorithm to chip, from chip cluster to model, and from model to application. On one hand, we help AI developers overcome the still-immature software ecosystem around today's heterogeneous computing power and pool heterogeneous compute. On the other, we rely on our industry-leading AI compute optimization to raise the supply of computing power, keep reducing compute costs, and improve the efficiency of deploying large models.
Moderator: With the arrival of the large-model boom, everyone now has a clearer picture of large-model engineering practice. Large-model training and inference are very complex undertakings that require a great deal of infrastructure as support, which is exactly why AI Infra is attracting more and more attention. Please talk about the challenges facing large-model applications.
Zhu Hong: Efficiency is the core challenge for large-model applications, including the latency and speed mentioned earlier. Inspur Information believes efficiency has to be viewed from both a vertical and a horizontal perspective. First, vertical efficiency: how fully the AI computing platform's performance is brought to bear, which is what everyone watches most closely. Second, horizontal efficiency, that is, stability: whether training or inference can keep running over long periods, which is the guarantee.
Much of Inspur Information's work focuses on these two levels, solving the vertical and horizontal efficiency problems and thereby promoting the deployment of large models. These are the challenges, and the solutions, we have found while serving customers.
Liang Shuang: For customers, the challenges of applying large AI models include inference latency, inference speed, how to shrink the parameters involved in inference, and various quantization techniques. In edge scenarios such as the smart cockpit, customers are especially sensitive to hardware requirements. Most intelligent-driving systems today use Qualcomm chips, so whether large-model inference can run on them, and whether it can match the performance of a mainstream accelerator card, matters greatly for such applications. Compute in these edge scenarios is limited, which brings in model compression and inference optimization; we are doing R&D on inference as well, as the sketch below illustrates.
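As a rough illustration of the quantization techniques mentioned, here is a generic sketch, not the speaker's production pipeline: it quantizes a weight matrix to symmetric int8 (a 4x reduction from fp32) and dequantizes it back, with the matrix size chosen arbitrarily for demonstration.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: 4x smaller than fp32."""
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate fp32 weights from int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("bytes fp32:", w.nbytes, "-> int8:", q.nbytes)  # 4x reduction
print("max abs error:", np.abs(w - w_hat).max())      # bounded by ~scale/2
```

The trade-off shown here is the core of edge deployment: a fixed, small accuracy loss in exchange for a model that fits within limited memory and bandwidth.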
Liu Daoquan: The biggest problem in applying large models is the disconnect between applications and models. At the application level, things must ultimately be considered from a business perspective. The reality is that most application people do not understand models, and most model people find it hard to understand real application scenarios. Large-model vendors think about improving a model's general capabilities, but their understanding of applications may fall short. Whether ToB or ToC, every link and process embeds a great deal of business knowledge. Combining that business knowledge with model capability requires application and model developers to participate together; only then can we truly solve problems and build good applications.
At present, Shizhi AI (wisemodel) starts from the community and can better understand industry requirements, whether from the application side or the model layer. Ultimately we hope to connect the full pipeline of large-model application development, so that downstream application development no longer needs to worry about how to tune the model; model-related tasks can be completed largely automatically on the platform, realizing the separation of application from model. Many intermediate links are involved here, and we can cooperate with intermediate players such as Luchen Technology and Wuwen Xinqiong to string those links together, so that more people can use large models more conveniently.
In this process, the open-source community holds an important position and role. The community is a repository of industry information and bridges the stack from applications down to the underlying frameworks and chips. Large models and the intermediate tooling must ultimately be applied to create value, and throughout that lifecycle the community's role as the link between the layers is indispensable. We will not build applications ourselves in the future; rather, we hope to unite more partners in the middle to open up the large-model application development pipeline, finally making applications simple and AI deployment simpler.
Li Feng: Because the cost of deploying large models is very high and inference is very expensive, most people cannot accept the price. We use the advantages of software-hardware co-design to bring the cost down first; second, co-design can unlock the potential of heterogeneous computing power. We can lower the training barrier for model development and let more creators enter this field. That is our thinking.
In addition, if large models are to truly land in an industry, industry data is also needed. A complete plan for industry deployment must then combine software and hardware rather than rely on the model alone, because a model by itself is not enough to realize a scenario.
Moderator: It seems the key to the application and popularization of large models is "efficiency." All of you have rich front-line experience. Please share the key points for truly lowering the barrier to adopting large models; you may speak to technology or to the ecosystem.
Liang Shuang: Open-sourcing our large-model framework is our concrete action to promote the application and popularization of large models, and it also fills a gap in domestic technology. AI is in full swing today thanks to the open-source spirit and countless open-source contributors. By open-sourcing its large-model framework, Luchen Technology hopes to share its R&D results with everyone, so that AI can develop better, the barrier to AI can be lowered, and productivity can improve.
Liu Daoquan: First, to solve the disconnect between application and model just mentioned, the key is to promote interaction between the application layer and the model layer, which requires more people from the application scenarios to participate in large-model application development.
Second, scenarios with better data quality are the easier directions for large models to land, such as banking, finance, and e-commerce, as well as industrial scenarios where IoT data collection and automation are already in place. Generally speaking, high-quality data gives large-model applications a better foundation for landing.
Third, current large-model application work concentrates on the AI technology side, while the mining of application requirements in core scenarios is insufficient. In the future there will be more exploration of application scenarios and demand.
Moderator: Beyond the progress in open-source software, including at the framework level, we now face a big problem: the shortage of computing power. From the perspective of software-hardware integration, what can be improved?
Zhu Hong: Open source is indeed a key step in advancing the AI industry and its deployment, and a great boost to the whole industry. Inspur Information is also trying to release its own work in a similar way, accelerating the application and popularization of large models and lowering the barrier to industry adoption.
Li Feng: Facing the computing-power shortage, the first step is to "make better use of the compute that can be used." On the inference side, more quantization can be applied: reducing a model's storage and compute requirements improves inference efficiency, so the same computing power can run more models (a back-of-the-envelope calculation follows below). Second, "use the compute that previously could not be used": consider heterogeneity in model training and tap more computing power through heterogeneous computing platforms.
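To put rough numbers on "reducing the storage and compute requirements of the model," here is a back-of-the-envelope calculation; the 7B-parameter model is an assumed example, not a figure cited by the panel:

```python
# Weight memory for an assumed 7B-parameter model at common precisions.
# Halving precision roughly halves the memory an inference server needs,
# so the same accelerator can host more model replicas.
params = 7e9
for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    gb = params * bits / 8 / 1e9
    print(f"{name}: {gb:.1f} GB of weights")
# fp16: 14.0 GB, int8: 7.0 GB, int4: 3.5 GB
```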
Moderator: The underlying support for large-model training is now facing the challenge of computing-power diversification, and at the AI Infra level more diversified adaptation still has to be done. What is your current technical layout?
Zhu Hong: Support for diverse computing power is a very hot topic now and a direction Inspur Information keeps its eye on. After we released the "Yuan 1.0" large model two years ago, we began considering adapting more inference hardware for the inference stage, and we landed some practical work that could efficiently run the ten-billion-parameter-scale models of that time. Now Yuan 2.0 is likewise being adapted to various hardware.
Of course, model training is also our focus. We are doing more optimization work with more promising compute providers, and we have open-sourced some of this work as well. The goal is that whether users choose a commercial hardware-and-software solution or a hardware-plus-open-source-software solution, they can train and run inference quickly and well.
Liang Shuang: We have adapted a great deal of hardware. Compared with their foreign counterparts, domestic diverse-compute products do lag in operator coverage and ecosystem. At this stage, diverse-compute vendors and users need to do R&D adaptation together and strive to catch up as soon as possible.
Liu Daoquan: In fact, the adaptation work is not something we do alone. Our exploration with some diverse-compute vendors is more a matter of ecosystem cooperation. The community can serve as a good entry point, letting everyone experience what diverse computing power can do first. This matters especially for the many small and medium-sized, application-oriented enterprises, many of which may never have used much computing power. Only after hands-on experience can they better understand a given chip's capabilities in inference and training.
Li Feng: The core of our layout is the "M×N" middle layer, where both "M" and "N" refer to multi-adaptation: multiple chips supported on the hardware side and multiple large models on the model side. This middle layer provides large-model inference engines, training engines, heterogeneous-compute evaluation, and more, so that a large-model algorithm can run on multiple chips with optimal training and inference efficiency. It is, in effect, a bridge between large models and the different chips, as the sketch below suggests.
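To illustrate why such an "M×N" middle layer pays off, here is a hypothetical sketch; the class and backend names are invented for illustration and are not Wuwen Xinqiong's API. With a common runtime interface, M models and N chips need roughly M + N adapters instead of M × N bespoke integrations:

```python
from abc import ABC, abstractmethod

class ChipBackend(ABC):
    """Common runtime interface each chip vendor implements once (hypothetical)."""
    @abstractmethod
    def run(self, model_name: str, prompt: str) -> str: ...

class GPUBackend(ChipBackend):
    def run(self, model_name: str, prompt: str) -> str:
        return f"[gpu] {model_name}({prompt!r})"

class NPUBackend(ChipBackend):
    def run(self, model_name: str, prompt: str) -> str:
        return f"[npu] {model_name}({prompt!r})"

# Registry of N chip adapters; any of M models can dispatch onto any of them.
BACKENDS = {"gpu": GPUBackend(), "npu": NPUBackend()}

def infer(model_name: str, chip: str, prompt: str) -> str:
    # The middle layer routes the request to the chosen chip's adapter.
    return BACKENDS[chip].run(model_name, prompt)

print(infer("example-7b-model", "gpu", "hello"))
print(infer("example-7b-model", "npu", "hello"))
```

Under this scheme, adding a new model or a new chip means writing one adapter rather than revisiting every model-chip pairing.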