Unlocking chemistry with intelligence

GPT-4 shows promise as an aid to chemistry researchers, yet its limitations reveal the need for further improvements.

This story is featured in the Asia Research News 2024 magazine.

GPT-4, the latest version of the artificial intelligence system from OpenAI, the developers of ChatGPT, demonstrates considerable usefulness in tackling chemistry challenges, but still has significant weaknesses. “It has a notable understanding of chemistry, suggesting it can predict and propose experimental results in ways akin to human thought processes,” says chemist Kan Hatakeyama-Sato of the Tokyo Institute of Technology. Hatakeyama-Sato and his colleagues discuss their exploration of the potential of GPT-4 in chemical research in the journal Science and Technology of Advanced Materials: Methods.

GPT-4, which stands for Generative Pre-trained Transformer 4, belongs to a category of artificial intelligence (AI) systems known as large language models. These can gather and analyse vast quantities of information in search of solutions to challenges set by users. One advance in GPT-4 is that it can accept images, in addition to text, as input.

Although the specific datasets used for training GPT-4 have not been disclosed by its developers, it has clearly learned a significant amount of detailed chemistry knowledge. To analyse its capabilities, the researchers sent the system a series of chemical tasks focused on organic chemistry – the chemistry of carbon-based compounds. These tasks covered basic chemical theory, the handling of molecular data, the prediction of chemical properties and of the outcomes of chemical processes, and the proposal of new chemical procedures.

The results of the investigation were varied, revealing both strengths and significant limitations. GPT-4 displayed a good understanding of general textbook-level knowledge in organic chemistry. It was weak, however, when prompted with tasks dealing with specialised content or unique methods for making specific organic compounds. It was only partly successful at interpreting chemical structures and converting them into a standard notation. One interesting feat was its ability to make accurate predictions for the properties of compounds that it had not specifically been trained on. Overall, it was able to outperform some existing computational algorithms, but fell short against others.

“The results indicate that GPT-4 can tackle a wide range of tasks in chemical research, spanning from textbook-level knowledge to addressing untrained problems and optimising multiple variables,” says Hatakeyama-Sato. “Inevitably, its performance relies heavily on the quality and quantity of its training data, and there is much room for improvement in its inference capabilities.”

The researchers emphasise that their work was only a preliminary investigation, and that future research should broaden the scope of the trials and dig deeper into the performance of GPT-4 in more diverse research scenarios.

They also hope to develop their own large language models specialising in chemistry and explore their integration with existing techniques. 

“In the meantime, researchers should certainly consider applying GPT-4 to chemical challenges, possibly using hybrid methods that include existing specialised techniques,” Hatakeyama-Sato concludes.

Further information 

Prof Teruaki Hayakawa
Tokyo Institute of Technology

Asst Prof Kan Hatakeyama-Sato
Tokyo Institute of Technology

Dr Yasufumi Nakamichi
Science and Technology of Advanced Materials: Methods (STAM-M) 

We welcome you to reproduce articles in Asia Research News 2024 provided appropriate credit is given to Asia Research News and the research institutions featured. 

Published: 28 Feb 2024

Contact details:

STAM editorial office

National Institute for Materials Science (NIMS) 1-2-1 Sengen, Tsukuba-city Ibaraki 305-0047 JAPAN
