via Decrypt · By Decrypt Editorial

This AI Reads Your Chemistry Instructions and Finds the Best Way to Build You a Molecule

(09:31 PM UTC)
Updated by Emily Watson

In brief

  • Synthegy, developed at EPFL, uses LLMs to rank synthesis routes against chemist-defined goals, matching expert judgments 71.2% of the time.
  • The framework was validated against 36 independent chemists across 368 route-pair evaluations.
  • The experiments reached alignment rates comparable to inter-expert agreement.

Designing a molecule from scratch is one of chemistry's hardest problems. It's not just about knowing what atoms to connect—it's about knowing the right order of reactions, when to protect sensitive parts of the molecule, and how to avoid dead ends that could ruin months of lab work.

Traditionally, that knowledge lives in the heads of experienced chemists. Now, a team at EPFL wants to put it into a language model.

Researchers led by Philippe Schwaller published a paper this week in Matter describing Synthegy, a framework that uses large language models as reasoning engines for chemical synthesis planning. The key insight is subtle but important: rather than asking AI to generate molecules, the team uses AI to evaluate synthesis routes that traditional software already produces.

Here's how it works: A chemist types in a goal in plain English, something like "form the pyrimidine ring in the early stages." Existing retrosynthesis software—which works by breaking target molecules into simpler pieces—then generates dozens or hundreds of possible synthesis routes.

Synthegy converts each route into text and hands it to an LLM, which scores every route on how well it matches the chemist's instruction. The best ones float to the top, with written explanations of why.
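The ranking loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Synthegy's actual code: the `Route` class, `route_to_text`, `rank_routes`, and the keyword-based `toy_score` (a stand-in for a real LLM call) are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Route:
    """A candidate synthesis route (hypothetical format for this sketch)."""
    name: str
    steps: List[str]  # reactions listed from first to last


def route_to_text(route: Route) -> str:
    """Serialize a route into numbered plain text that could be sent to an LLM."""
    return "\n".join(f"Step {i}: {s}" for i, s in enumerate(route.steps, start=1))


def rank_routes(
    routes: List[Route],
    instruction: str,
    score_fn: Callable[[str, str], float],
) -> List[Tuple[float, Route]]:
    """Score every route against the instruction and sort best-first."""
    scored = [(score_fn(instruction, route_to_text(r)), r) for r in routes]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_score(instruction: str, route_text: str) -> float:
    """Stand-in for an LLM call (assumption): ignores the instruction text and
    simply rewards routes that mention the pyrimidine ring in an early step."""
    lines = route_text.splitlines()
    for position, line in enumerate(lines):
        if "pyrimidine" in line.lower():
            return 10.0 * (1 - position / len(lines))
    return 0.0


routes = [
    Route("late-ring", ["couple fragments", "deprotect amine", "form pyrimidine ring"]),
    Route("early-ring", ["form pyrimidine ring", "couple fragments", "deprotect amine"]),
]
ranked = rank_routes(routes, "form the pyrimidine ring in the early stages", toy_score)
for score, route in ranked:
    print(f"{score:.1f}  {route.name}")
```

In a real setup, `toy_score` would be replaced by a function that sends the instruction and the route text to a language model and parses a numeric score from the reply.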

"When making tools for chemists, the user interface matters a lot, and previous tools relied on cumbersome filters and rules," said Andres M. Bran, lead author of the study, in a statement from EPFL.

The system was validated in a double-blind study involving 36 independent chemists who reviewed 368 route pairs. Their selections matched Synthegy's 71.2% of the time, a number that's roughly in line with how often expert chemists agree with each other. Senior researchers (professors and research scientists) agreed with Synthegy more often than PhD students, suggesting the system captures the same strategic intuitions that come with experience.
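For readers who want to see how a figure like 71.2% is typically computed, here is a small sketch of a pairwise agreement rate. The function name and the toy choices are assumptions for illustration; the study's raw data is not reproduced here, and the implied match count at the end is back-of-the-envelope arithmetic from the two published numbers, not a figure from the paper.

```python
from typing import Sequence


def agreement_rate(expert_choices: Sequence[str], model_choices: Sequence[str]) -> float:
    """Fraction of route pairs where the expert and the model picked the same route."""
    if len(expert_choices) != len(model_choices):
        raise ValueError("both sequences must cover the same route pairs")
    if not expert_choices:
        raise ValueError("need at least one route pair")
    matches = sum(e == m for e, m in zip(expert_choices, model_choices))
    return matches / len(expert_choices)


# Toy data: for each route pair, which route ("A" or "B") was preferred.
expert = ["A", "B", "A", "A", "B"]
model = ["A", "B", "B", "A", "B"]
toy_rate = agreement_rate(expert, model)
print(f"toy agreement: {toy_rate:.1%}")

# Rough arithmetic only: 71.2% of 368 pairs corresponds to about this many matches.
implied_matches = round(0.712 * 368)
print(f"implied matches out of 368: {implied_matches}")
```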

The researchers tested several AI models, including GPT-4o, Claude, DeepSeek-r1, and Gemini-2.5-pro. Gemini-2.5-pro scored highest in the benchmark, while DeepSeek-r1 appears to be a strong open-source alternative that can run locally. AI has been making inroads in drug discovery for years, but most approaches focus on narrowly trained models for specific tasks. Synthegy is instead designed to be modular: it can plug into any retrosynthesis engine on the backend and any capable LLM on the reasoning side.

The framework also handles a second problem: reaction mechanism elucidation. This is the question of how a chemical reaction proceeds: which electron movements take place at each step. Synthegy breaks reactions into elementary moves and has the LLM assess each candidate step for chemical plausibility. On simple reactions like nucleophilic substitutions, the best models achieved near-perfect accuracy.
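The step-by-step assessment can be pictured with a toy sketch like the one below. It assumes a greedy choice (keep the highest-scoring candidate at each step), which is a simplification chosen for this example and not necessarily the search strategy used in the paper. The candidate move descriptions and the lookup-table scorer standing in for an LLM are hypothetical.

```python
from typing import Callable, Dict, List


def pick_plausible_moves(
    candidates_per_step: List[List[str]],
    plausibility_fn: Callable[[str], float],
) -> List[str]:
    """For each mechanistic step, keep the candidate move with the highest score."""
    return [max(candidates, key=plausibility_fn) for candidates in candidates_per_step]


# Hypothetical scores standing in for an LLM's plausibility judgment (0 to 1).
TOY_SCORES: Dict[str, float] = {
    "hydroxide lone pair attacks the carbon bearing bromine": 0.95,
    "bromide attacks the hydroxide oxygen": 0.05,
    "C-Br bond electrons leave with bromide": 0.90,
    "C-H bond electrons leave with hydride": 0.10,
}


def toy_plausibility(move: str) -> float:
    """Look up a canned score; unknown moves are treated as implausible."""
    return TOY_SCORES.get(move, 0.0)


# A nucleophilic substitution written as two sets of candidate electron movements.
candidates = [
    ["bromide attacks the hydroxide oxygen",
     "hydroxide lone pair attacks the carbon bearing bromine"],
    ["C-H bond electrons leave with hydride",
     "C-Br bond electrons leave with bromide"],
]
mechanism = pick_plausible_moves(candidates, toy_plausibility)
for number, move in enumerate(mechanism, start=1):
    print(f"Move {number}: {move}")
```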

The potential use cases are broad. Drug discovery is the obvious one, and AI has already shown promise predicting cancer treatment outcomes, but the same route-ranking approach applies anywhere chemists need to design new materials or optimize industrial reactions. One practical detail: evaluating 60 candidate routes with Synthegy takes roughly 12 minutes and costs about $2–3 in API fees.
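Those figures translate into per-route numbers with simple division. The sketch below does the arithmetic and then extrapolates to a larger batch; the assumption that time and cost scale linearly with the number of routes is ours for illustration and is not a claim from the paper.

```python
ROUTES_EVALUATED = 60
TOTAL_MINUTES = 12
COST_LOW_USD, COST_HIGH_USD = 2.0, 3.0

# Per-route figures derived from the reported totals.
seconds_per_route = TOTAL_MINUTES * 60 / ROUTES_EVALUATED
cost_per_route_low = COST_LOW_USD / ROUTES_EVALUATED
cost_per_route_high = COST_HIGH_USD / ROUTES_EVALUATED

print(f"{seconds_per_route:.0f} s per route")
print(f"${cost_per_route_low:.3f} to ${cost_per_route_high:.3f} per route")


def estimate_batch(n_routes: int) -> tuple:
    """Linear extrapolation (an assumption): minutes, low cost, high cost for n routes."""
    minutes = n_routes * seconds_per_route / 60
    return minutes, n_routes * cost_per_route_low, n_routes * cost_per_route_high


minutes, low, high = estimate_batch(300)
print(f"300 routes: about {minutes:.0f} min, ${low:.0f} to ${high:.0f}")
```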

The paper acknowledges current limits. LLMs sometimes misread the direction of a reaction in its text representation, leading to wrong feasibility calls. Smaller models perform no better than random guessing. Routes longer than 20 steps are harder to track coherently.

The code and benchmarks are publicly available at github.com/schwallergroup/steer.
