Inspur Information releases "Source 2.0" foundation model, open-sourcing models of up to 100-billion-parameter scale and their code

2024-02-27


On November 27, Inspur Information released the "Source 2.0" foundation model and announced that it is fully open source. The Source 2.0 family comes in three parameter scales: 102.6 billion, 51.8 billion and 2.1 billion.

According to reports, Source 2.0 lowers the share of Internet corpus content in its training data by drawing on high-quality Chinese and English material such as books, encyclopedias and papers. To obtain Chinese mathematical data, Inspur Information cleaned about 10PB of Internet data collected since 2018, yet this yielded only about 10GB of mathematical data, roughly one millionth of the raw volume.

To obtain the relatively scarce high-quality Chinese mathematics and code datasets more efficiently, Source 2.0 uses data generation and filtering methods based on large models, which preserve the diversity of the data while improving data quality within each category.

In terms of computing power, Source 2.0 adopts non-uniform pipeline parallelism and combines three strategies: pipeline parallelism, optimizer parameter parallelism and data parallelism. This makes GPU memory consumption more balanced across the pipeline stages and avoids the loss of training efficiency caused by a GPU memory bottleneck.
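To illustrate why a non-uniform split can balance GPU memory, here is a minimal Python sketch. It is not Inspur Information's implementation: the cost model (each stage keeps activations for a number of in-flight micro-batches that shrinks toward the end of the pipeline, as in a 1F1B-style schedule), the function names and the per-layer sizes are all assumptions chosen for illustration.

```python
def stage_memory(layers, stage, num_stages, weight_gb, act_gb):
    """Estimated memory of one stage under an assumed, simplified cost model.

    Earlier stages are assumed to keep more micro-batches in flight,
    so each of their layers costs more activation memory.
    """
    in_flight = num_stages - stage
    return layers * (weight_gb + in_flight * act_gb)


def uniform_split(num_layers, num_stages):
    """Conventional split: every stage gets (almost) the same layer count."""
    base, extra = divmod(num_layers, num_stages)
    return [base + (1 if s < extra else 0) for s in range(num_stages)]


def balanced_split(num_layers, num_stages, weight_gb, act_gb):
    """Greedy non-uniform split: hand out layers one at a time to the
    stage whose estimated memory would be lowest after receiving it."""
    if num_layers < num_stages:
        raise ValueError("need at least one layer per stage")
    counts = [1] * num_stages
    for _ in range(num_layers - num_stages):
        best = min(
            range(num_stages),
            key=lambda s: stage_memory(counts[s] + 1, s, num_stages,
                                       weight_gb, act_gb),
        )
        counts[best] += 1
    return counts


def peak_memory(counts, weight_gb, act_gb):
    """Memory of the most loaded stage, i.e. the bottleneck."""
    n = len(counts)
    return max(stage_memory(c, s, n, weight_gb, act_gb)
               for s, c in enumerate(counts))


# Hypothetical numbers: 24 layers, 4 stages, 1.0 GB of weights and
# 0.5 GB of activations per layer per in-flight micro-batch.
LAYERS, STAGES, WEIGHT_GB, ACT_GB = 24, 4, 1.0, 0.5
uniform = uniform_split(LAYERS, STAGES)
balanced = balanced_split(LAYERS, STAGES, WEIGHT_GB, ACT_GB)
print("uniform :", uniform, peak_memory(uniform, WEIGHT_GB, ACT_GB))
print("balanced:", balanced, peak_memory(balanced, WEIGHT_GB, ACT_GB))
```

Under these assumed costs the non-uniform split places fewer layers on the early stages and more on the late ones, which lowers the peak per-stage memory compared with the uniform split. A real system would measure per-layer memory rather than assume it.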

In evaluation, Source 2.0 was tested on code generation, mathematical problem solving and factual question answering, and the results show that its overall performance is at an upper-middle level.

Source 2.0 adopts a fully open source strategy: the full range of model parameters and code is available for free download and use. The GitHub page and the paper are linked below:

Open source code links:


Links to papers:


