For the first time, AI makes headway on Terence Tao's favorite math problem: DeepMind's milestone algorithm lands in Nature, with an LLM driving a self-evolving code search


Updated 2025-08-15. Source: SLTechnology News&Howtos

Shulou(Shulou.com)12/24 Report--

The cap set problem, which has puzzled mathematicians for years and which Terence Tao has called a favorite of his, has unexpectedly yielded new results to a new algorithm from DeepMind. It is the first discovery on an open problem made with an LLM, a milestone piece of research that Nature published as soon as it was ready.

The cap set problem is an open problem that has perplexed mathematicians for many years.

Terence Tao, the renowned mathematician, once described the cap set problem as his favorite open problem.

Terence Tao's blog. Now, a large language model has unexpectedly made a new discovery on this problem.

Today, researchers from Google DeepMind, the University of Wisconsin-Madison and the University of Lyon jointly presented a new method, FunSearch, which for the first time uses an LLM to make discoveries on open problems in the mathematical sciences.

The AI searches for "functions" written in computer code, hence the name FunSearch.

Paper: https://www.nature.com/articles/s41586-023-06924-6

Simply put, FunSearch pairs a pre-trained LLM with an automatic "evaluator". The former's job is to propose creative solutions in the form of computer code, while the latter guards against hallucinations and incorrect ideas.

By iterating back and forth between the two components, the initial solution "evolves" into new knowledge.

DeepMind released the unedited version of the paper so that everyone could witness this historic moment.

Nature's press release was even more blunt: DeepMind's AI outdoes human mathematicians on an unsolved problem!

This is the first time an LLM has been used to take on an open problem in science or mathematics and produce a new discovery.

In addition, to demonstrate the practical usefulness of FunSearch, DeepMind also applied it to the "bin packing" problem, which has wide applications and can, for example, make data centers more efficient.

For this problem, too, FunSearch found more effective algorithms.

DeepMind experts say that scientific progress depends heavily on the ability to analyze new knowledge, and FunSearch is a powerful scientific tool because the programs it outputs not only propose solutions, but also reveal how solutions are built.

In this way, scientists using FunSearch can be further inspired by new ideas and enter a virtuous circle of "improvement-discovery".

LLMs drive scientific discovery through "evolution"

Large models excel at solving problems, but can they discover entirely new knowledge?

Because LLMs cannot avoid "hallucinating" and outputting incorrect information, it is very hard to rely on them for new discoveries that are verifiably correct.

But what if we can identify and expand LLM's best ideas and maximize its creativity?

FunSearch uses the power of large models to develop and retain the best creative ideas in an "evolutionary" way.

These ideas are expressed in computer code and can be run and graded automatically.

First, the user describes the problem in the form of code. This description includes a process for evaluating programs and a seed program for initializing the program pool.

FunSearch is an iterative procedure. At each iteration, the system selects some programs from the current program pool and feeds them to the LLM.

Building on these, the LLM creatively generates new programs, which are then automatically evaluated.

The programs with the highest scores are added back to the existing program pool, forming a self-improving cycle.
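The loop described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not DeepMind's implementation: a "program" here is just a list of integers, `mock_llm` is a hypothetical stand-in that randomly edits a parent instead of calling a real LLM, and `evaluate`, `evolve` and `TARGET` are names invented for this example.

```python
import random

TARGET = [3, 1, 4, 1, 5]  # hidden optimum of the toy problem (assumption)

def evaluate(program):
    """Hypothetical evaluator: higher is better, 0 is a perfect score."""
    return -sum(abs(p - t) for p, t in zip(program, TARGET))

def mock_llm(parents, rng):
    """Stand-in for the LLM: copy one parent and nudge one entry by +/-1."""
    child = list(rng.choice(parents))
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

def evolve(seed_program, iterations=500, pool_size=8, rng_seed=0):
    rng = random.Random(rng_seed)
    # The pool is initialized with the user-supplied seed program.
    pool = [(evaluate(seed_program), list(seed_program))]
    for _ in range(iterations):
        # Select some programs from the pool and show them to the "LLM".
        parents = [prog for _, prog in rng.sample(pool, min(2, len(pool)))]
        child = mock_llm(parents, rng)
        # Score the new program automatically and add it to the pool.
        pool.append((evaluate(child), child))
        # Keep only the highest-scoring programs for later iterations.
        pool.sort(key=lambda entry: entry[0], reverse=True)
        del pool[pool_size:]
    return pool[0]

best_score, best_program = evolve([0, 0, 0, 0, 0])
print(best_score, best_program)
```

Because the pool always retains its top-scoring entry, the best score can never fall below that of the seed program, which is the "self-improving" property the article describes.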

It is worth mentioning that FunSearch uses Google's PaLM 2, but it is also compatible with other LLMs trained on code.

Schematic of the overall FunSearch flow: the LLM is shown the best programs generated so far (retrieved from the program database) and asked to produce a better one. The programs proposed by the LLM are automatically executed and evaluated. The best ones are added to the database for selection in subsequent iterations. Users can retrieve the highest-scoring programs found so far at any time.

Discovering new mathematical knowledge and algorithms in different fields is a notoriously arduous task, and largely beyond the capabilities of the most advanced AI systems.

To solve such challenges with FunSearch, DeepMind researchers have introduced a number of key components.

Rather than starting from scratch, the "evolution" process begins with common knowledge about the problem, allowing FunSearch to concentrate on finding the critical ideas needed for new discoveries.

In addition, the evolutionary process uses a strategy that increases the diversity of ideas to avoid stagnation. Finally, the DeepMind team runs the evolutionary process in parallel to improve efficiency.

An epoch-making mathematical discovery

The cap set problem is an open challenge that has perplexed mathematicians across many research fields for decades.

This time, DeepMind researchers worked with Jordan Ellenberg, a professor of mathematics at the University of Wisconsin-Madison, who had previously made an important breakthrough on the cap set problem.

Paper: https://arxiv.org/abs/1605.09223

At its core, the cap set problem asks for the largest set of points (called a cap set) in a high-dimensional grid such that no three points lie on a line.
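To make the "no three points on a line" condition concrete, here is a small checker. It assumes the standard formulation, which the article does not spell out: the grid is the set of vectors with entries 0, 1, 2 taken modulo 3, where three distinct points lie on a line exactly when they sum to zero in every coordinate. The function name `is_cap_set` is chosen for this example.

```python
from itertools import combinations

def is_cap_set(points):
    """Return True if no three distinct points of the mod-3 grid lie on a line.

    Assumes every point is a tuple of the same length with entries in {0, 1, 2}.
    """
    distinct = set(map(tuple, points))
    for a, b, c in combinations(distinct, 3):
        # Three distinct points are collinear iff they sum to 0 mod 3 coordinatewise.
        if all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True

print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)]))  # True
print(is_cap_set([(0, 0), (1, 1), (2, 2)]))          # False: a diagonal line
```

The check is cubic in the number of points, which is fine for verifying a candidate set but says nothing about how to find a large one.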

The cap set problem matters because it serves as a model for other problems in extremal combinatorics, which studies how large or small a collection of numbers, graphs, or other objects can be.

However, brute-force computation cannot solve the cap set problem: the number of possibilities to consider quickly exceeds the number of atoms in the universe.
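A rough count shows how quickly this happens. The sketch below assumes the naive approach of examining every subset of the mod-3 grid in dimension n, and uses the commonly quoted estimate of about 10^80 atoms in the observable universe; the dimensions tried and the function name are picked purely for illustration.

```python
def subset_count_digits(n):
    """Number of decimal digits in 2 ** (3 ** n), the count of subsets of the grid."""
    grid_points = 3 ** n
    return len(str(2 ** grid_points))

ATOM_ESTIMATE_DIGITS = 81  # 10**80 has 81 decimal digits (rough, commonly quoted figure)

for n in (2, 4, 8):
    digits = subset_count_digits(n)
    print(n, 3 ** n, digits, digits > ATOM_ESTIMATE_DIGITS)
```

Even at dimension 8 the grid has only 6561 points, yet the number of candidate subsets has well over a thousand digits, so exhaustive enumeration is out of reach.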

Terence Tao's explanation of why the cap set problem is important.

FunSearch generated solutions in the form of programs and, in some settings, found the largest cap sets ever discovered.

This represents the largest increase in the size of cap sets in the past 20 years.

Moreover, FunSearch outperformed state-of-the-art computational solvers, because this problem scales well beyond their current capabilities.

The interactive diagram below shows the evolution from the seed program at the top to the new function with a higher score at the bottom.

Each circle is a program whose size is proportional to the score assigned to it. On the right is the corresponding function generated by FunSearch for each node. (For the complete programs, please refer to the original paper.)

Interactive experience link: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/funsearch/index.html

These results show that FunSearch can go beyond established results on hard combinatorial problems, where intuition is often very difficult to build.

The researchers say they expect this approach to contribute to new discoveries on similar theoretical problems in combinatorics, and possibly to open up new possibilities in fields such as communication theory in the future.

FunSearch opens the "black box" and sets a model for working with mathematicians

FunSearch favors programs that are simple and interpretable by humans.

Although discovering new mathematical knowledge is important in itself, FunSearch has additional advantages over traditional computer search techniques.

This is because FunSearch is not just a "black box" that generates solutions to problems.

Instead, it generates programs that describe how these solutions are implemented.

This "show-your-working" approach mirrors how scientists generally operate, with new discoveries explained through the process used to produce them, which makes them easier to understand and reproduce.

FunSearch favors solutions represented by highly compact programs, that is, solutions with low Kolmogorov complexity.

Short programs can describe very large objects, which lets FunSearch scale to large needle-in-a-haystack problems.
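As a small illustration of a short program describing a large object, the sketch below builds all vectors with entries in {0, 1}. Viewed inside the mod-3 grid, this is a valid (though far from record-sized) cap set: three such vectors can only sum to zero modulo 3 in every coordinate if they are identical. This is a textbook construction used here as an example, not one of FunSearch's programs, and the function names are invented for the illustration.

```python
from itertools import combinations, product

def binary_points(n):
    """All 2**n vectors with entries in {0, 1}, viewed as points of the mod-3 grid."""
    return list(product((0, 1), repeat=n))

def has_line(points):
    """True if three distinct points sum to zero mod 3 in every coordinate."""
    return any(
        all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c))
        for a, b, c in combinations(points, 3)
    )

small = binary_points(4)   # 16 points, small enough to verify exhaustively
large = binary_points(12)  # 4096 points produced by the same one-line rule
print(len(small), has_line(small), len(large))
```

The rule stays one line long no matter how large n gets, while the object it describes doubles in size with every extra dimension. That gap between description length and object size is what low Kolmogorov complexity refers to.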

In addition, it makes it easier for researchers to understand FunSearch's program output.

Jordan S. Ellenberg, a professor of mathematics at UW-Madison and a co-author of the paper, said: "FunSearch provides a new mechanism for developing strategies of attack. The solutions generated by FunSearch are conceptually far richer than a mere list of numbers. When I study them, I learn something."

More importantly, this interpretability of FunSearch programs can provide actionable insights for researchers.

For example, while using FunSearch, the researchers noticed intriguing symmetries in the code of some of its high-scoring outputs.

This gave them a new understanding of the problem, and they used that insight to refine the problem posed to FunSearch, which led to even better solutions.

According to DeepMind, "this is a model of collaboration between humans and FunSearch on many mathematical problems."

Left: by examining the code generated by FunSearch, the researchers gained further actionable insights (highlighted). Right: the raw "admissible" set constructed using the shorter program on the left.

Tackling a major challenge in computing: the "bin packing" problem

Encouraged by the success on the theoretical side, DeepMind researchers set out to explore the flexibility of FunSearch by applying it to an important practical challenge in computer science.

They chose the challenging "bin packing" problem, in which items of different sizes must be packed into the smallest possible number of bins or containers.

This problem sits at the core of many practical tasks, from loading goods into containers to allocating compute jobs in data centers so as to minimize costs.

Online bin packing is usually addressed with algorithmic rules of thumb (heuristics) based on human experience.

However, finding a set of rules for each specific case of different size, time, or capacity can be challenging.

Although bin packing is very different from the cap set problem, setting up FunSearch for it was easy.

FunSearch delivered an automatically tailored program that adapts to the specifics of the data and outperforms established heuristics, packing the same items into fewer bins.

Illustration of bin packing with an existing heuristic, best fit (left), and with the heuristic discovered by FunSearch (right).
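For readers unfamiliar with the best fit heuristic named above, here is a minimal sketch of online bin packing in which a scoring function decides which open bin receives each item. The `priority(item, remaining)` interface and the helper names are assumptions made for this illustration. The heuristic FunSearch discovered is not reproduced here; only the classic first fit and best fit rules are shown.

```python
def first_fit_priority(item, remaining):
    """Score every feasible bin equally, so the earliest open bin wins ties."""
    return 0

def best_fit_priority(item, remaining):
    """Prefer the bin that would have the least free space after placing the item."""
    return -(remaining - item)

def pack_online(items, capacity, priority):
    """Place items one at a time; return the number of bins used."""
    bins = []  # remaining capacity of each open bin
    for item in items:
        feasible = [i for i, rem in enumerate(bins) if rem >= item]
        if feasible:
            # max() returns the first bin with the highest score.
            chosen = max(feasible, key=lambda i: priority(item, bins[i]))
            bins[chosen] -= item
        else:
            bins.append(capacity - item)  # open a new bin
    return len(bins)

items = [4, 7, 3, 6]
print(pack_online(items, 10, first_fit_priority))  # 3
print(pack_online(items, 10, best_fit_priority))   # 2
```

On this input, first fit puts the item of size 3 into the first bin and then has to open a third bin for the 6, while best fit fills the second bin exactly and fits everything into two. A different scoring function is all it takes to change the packing, which is why a program that outputs such a function is easy to inspect and deploy.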

Hard combinatorial problems such as online bin packing can also be tackled with other AI approaches, such as neural networks and reinforcement learning. These approaches have proved effective, but they may require significant resources to deploy.

The code output from FunSearch, on the other hand, can be easily checked and deployed, which means that this solution can be applied to real-world industrial systems with quick benefits.

LLM-driven discovery in science and beyond

FunSearch is designed to keep LLMs from "hallucinating".

The power of these models can then be harnessed not only to make new discoveries in mathematics, but also to find the best solutions to practical problems.

DeepMind believes that for many problems in science and industry, whether long-standing or new, using LLM-driven methods to generate effective and customized algorithms will be a common practice.

In fact, FunSearch's groundbreaking work is just the beginning.

As LLMs continue to advance, FunSearch will naturally improve as well.

At the same time, DeepMind will strive to expand its capacity to meet scientific and engineering challenges that societies urgently need to address.

Netizens argued that if every hallucination can be checked for accuracy, the resulting insights will accelerate discoveries in basic science.

Others said that if the threshold for AGI is making new discoveries, then we already have AGI.

One wrote: "In 2007, Terence Tao, the world's greatest mathematician, called the cap set problem his favorite open question. Now, Google DeepMind's FunSearch has successfully solved this problem."

Another mocked the claim that "LLMs can't find anything new, they're just stochastic parrots": FunSearch can actually find new and useful things in math and computer science.

That remark was clearly aimed at LeCun himself.

So, when will the proof of P=NP be realized?

Reference:

https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/
