A joint team from institutions including the University of Science and Technology of China has proposed a new method, SciGuard, which can protect AI for Science models in biology, chemistry, drug discovery, and other domains from misuse. The team also established SciMT-Safety, the first benchmark focused on safety in the chemical sciences.
"our experiment is out of control! This is the end of our own creation! -"the day after tomorrow" (The Day After Tomorrow)
In science fiction movies, mad scientists are usually the protagonists of doomsday disasters, and the rapid development of AI technology seems to bring this scene closer and closer to us.
Global attention to the potential threats of AI has focused on general artificial intelligence and multimedia generation models, but a more pressing question is how to regulate "AI scientists", that is, the fast-growing scientific models.
To meet this challenge, the joint team from the University of Science and Technology of China, Microsoft Research, and other institutions analyzed in depth the risks of AI models in scientific fields such as biology, chemistry, and drug discovery, and demonstrated through concrete cases the harm that AI misuse can cause in the chemical sciences.
Paper: https://arxiv.org/abs/2312.06632

The research team found that some existing open-source AI models can be used to create hazardous substances and circumvent laws and regulations.
In response, the researchers developed an agent called SciGuard to control the risk of AI misuse in science, and proposed the first red-team benchmark focused on scientific safety to evaluate the safety of different AI systems.
The experimental results show that SciGuard produced the fewest harmful outputs in the tests while maintaining good performance.
The potential risks of AI in science

Recent research from the University of Science and Technology of China and Microsoft Research uncovered a shocking result: open-source AI models can find new ways to bypass regulation, producing synthesis routes for hydrogen cyanide and the VX nerve agent, two infamous chemical weapons.
Hydrogen cyanide is a highly toxic substance, and conventional routes to it require tightly regulated raw materials and extremely stringent reaction conditions (such as temperatures above 1,000 degrees Celsius).
However, as figure 1 shows, using an open-source AI model called LocalRetro, the researchers found a synthesis route that is cheap, uses readily available materials, and is easy to carry out.
Similarly, the model found previously unreported synthetic routes to the VX nerve agent that may bypass existing regulations on raw materials.
Figure 1: the open-source AI model produces new, regulation-evading reaction routes for hydrogen cyanide and the VX nerve agent.

The research team also pointed out that large language models have become powerful scientific tools, greatly lowering the knowledge threshold.
Figure 2 shows an example of obtaining hazard information using a large language model.
As the technology develops, agents built on large language models, such as ChemCrow, can already automate scientific tasks. If such agents do not manage risk very carefully, they can easily cause greater harm.
To prevent harm, the team redacted the dangerous information in the public version of the paper.
Figure 2: GPT-4 shows how the explosive PETN is synthesized.

In figure 3, the researchers list nine potential risks that AI may pose in science, including the discovery of harmful substances, the discovery of harmful uses, evasion of regulation, side effects, misleading information, infringement of intellectual property, disclosure of privacy, and bias in scientific research.
As time passes and AI evolves, these risks evolve as well, and new risks must be continuously monitored and assessed.
Figure 3: the nine potential risks of AI in science listed by the researchers.

The SciGuard model

To address these challenges, the team proposed a large-language-model-driven agent called SciGuard to help AI for Science models manage risk.
SciGuard is aligned with human values and augmented with a variety of scientific databases and regulatory databases of hazardous compounds.
The agent can also use a variety of scientific tools and AI4Science models to gather additional information that helps SciGuard judge the user's intent.
Figure 4: the SciGuard framework.

At the core of SciGuard is a powerful large language model (LLM), which can not only understand and generate human language but also help decompose complex scientific problems. SciGuard has a set of safety principles and guidelines tailored to the scientific field.
These principles and guidelines take into account a variety of risk factors that may arise in scientific research, including but not limited to the safe handling of high-risk substances, the protection of data privacy, and compliance with laws and regulations.
To implement these safety principles and guidelines, SciGuard builds its long-term memory from recognized scientific databases such as PubChem. This memory contains a large amount of data about chemicals and their potential hazards.
With this data, SciGuard can conduct an in-depth risk assessment of a user's query. For example, when a user asks how to synthesize a compound, SciGuard can quickly retrieve information about it, assess its risk, and then provide safety advice or warnings, or refuse to respond.
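The retrieve-then-assess flow just described can be sketched as follows. This is a minimal illustration only: the hazard records and the response policy below are hypothetical stand-ins, not SciGuard's actual database or rules.

```python
# Toy long-term memory: compound name -> (hazard class, regulated?).
# These records are illustrative, not real regulatory data.
HAZARD_MEMORY = {
    "hydrogen cyanide": ("acute toxicity", True),
    "ethanol": ("flammable liquid", False),
}

def assess_query(compound: str) -> str:
    """Look up a compound and decide how to respond to a synthesis query."""
    record = HAZARD_MEMORY.get(compound.lower())
    if record is None:
        # Unknown compound: fall back to a generic safety reminder.
        return "no-record: answer with a general safety reminder"
    hazard, regulated = record
    if regulated:
        # Regulated substances trigger a refusal instead of an answer.
        return f"refuse: {compound} is a regulated substance ({hazard})"
    return f"answer-with-warning: note hazard class '{hazard}'"
```

In a real system the lookup would go to a curated database such as PubChem rather than an in-memory dict, but the decision structure (retrieve, assess, then answer, warn, or refuse) is the same.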
In addition to databases, SciGuard integrates a variety of scientific models, such as chemical synthesis route planning and compound property prediction models. These models let SciGuard help users accomplish specific scientific tasks.
They also provide additional contextual information: for example, SciGuard uses property prediction models to estimate properties such as solubility, toxicity, or flammability, which feed into the risk assessment.
Another key technique SciGuard uses for complex tasks is the Chain-of-Thought (CoT) method. CoT lets SciGuard plan each step of a task iteratively, breaking complex tasks down and ensuring that every action meets safety and ethical standards before it is carried out.
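The step-by-step vetting idea can be illustrated with a toy loop: each planned action is checked against safety rules before execution, and the whole task halts on a violation. The planner, rule list, and function names here are assumptions for illustration, not SciGuard's implementation.

```python
from typing import Callable

# Illustrative banned-action patterns; a real system would use far
# richer rules backed by databases and an LLM judgment step.
BANNED = ("synthesize toxin",)

def rule_check(step: str) -> bool:
    """Return True if a planned step passes the safety rules."""
    return not any(pattern in step for pattern in BANNED)

def run_task(steps: list[str], is_safe: Callable[[str], bool]) -> list[str]:
    """Execute planned steps one at a time, vetting each before it runs."""
    log = []
    for step in steps:
        if not is_safe(step):
            log.append(f"BLOCKED: {step}")
            break  # stop the entire task once a violation is found
        log.append(f"done: {step}")
    return log
```

The point of checking inside the loop, rather than only vetting the user's initial question, is that a multi-step plan can drift toward a harmful action even when the opening request looks benign.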
Through these technical features, SciGuard can not only effectively control the risks of scientific AI models but also improve the efficiency and safety of scientific research, preserving free exploration and innovation while offering a powerful example of how to keep artificial intelligence safe and responsibly used.
To measure the safety level of large language models and science agents, the research team proposed SciMT-Safety, the first safety question-and-answer benchmark focused on the chemical and biological sciences, covering combustibles, corrosives, explosives, microbes, high-risk pesticides, addictive substances, and biological toxins.
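One simple way to score such a red-team benchmark is to measure how often a model declines a malicious query. The keyword heuristic below is a deliberately simplistic stand-in for the judging actually used in SciMT-Safety; the marker list is an assumption for illustration.

```python
# Crude refusal detector: real benchmarks typically use an LLM judge
# or human annotation instead of keyword matching.
REFUSAL_MARKERS = ("i cannot", "i can't", "unable to assist", "refuse")

def is_refusal(answer: str) -> bool:
    """Heuristically decide whether a model answer declines the request."""
    text = answer.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(answers: list[str]) -> float:
    """Fraction of answers that decline malicious queries (higher = safer)."""
    if not answers:
        return 0.0
    return sum(is_refusal(a) for a in answers) / len(answers)
```

A harmfulness score like the one reported in the paper is essentially the complement of this idea, additionally weighting how dangerous the leaked content is.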
Figure 5: test results of the mainstream models.

The research team tested GPT-4, GPT-3.5, Claude-2, Llama2-7B-Chat, Llama2-13B-Chat, PaLM-2, Vicuna-7B, Vicuna-13B, Mistral-7B, and the ChemCrow agent. The figure above shows the final results: SciGuard achieved the best defense on the team's scientific safety test set.
The Llama models performed well on the benchmark, but surprisingly, PaLM-2 was prone to giving dangerous answers.
Figure 6: two examples from the benchmark.

Faced with malicious questions, all of the LLMs and agents "honestly" provided harmful information (partially masked in the figure), and only SciGuard held the line.
A call for attention

In this era of ever-deeper dependence on high technology, the progress of AI has brought unlimited possibilities, but also unprecedented challenges.
This study is not only a profound reflection on the development of science and technology, but also a call for responsibility across the whole of society.
At the end of the paper, the authors strongly appeal to the global science and technology community, policy makers, ethicists, and the public to work together to strengthen oversight of AI technology, continuously improve the relevant techniques, and form a broad consensus.
We need to actively promote the development of AI4S models while effectively controlling the potential risks the technology brings, ensuring that scientific and technological progress is not only a technological upgrade for humanity but also an advance in social responsibility and ethics. Only then can we truly move toward a future guided by wisdom and morality.