AMORE: framework for chemical LLMs evaluation

Lost in Translation: Chemical LMs and the Misunderstanding of Molecule Structures

AIRI, Sber AI, Skoltech, HSE University, Sber AI Lab, ISP RAS Research Center for Trusted AI
EMNLP 2024 Findings ^*Indicates Equal Contribution

Abstract

The recent integration of chemistry with natural language processing (NLP) has advanced drug discovery. Molecule representation in language models (LMs) is crucial in enhancing chemical understanding. We propose \textbf{A}ugmented \textbf{Mo}lecular \textbf{Re}trieval (\heart AMORE), a flexible zero-shot framework that assesses trustworthiness of Chemical LMs of different natures: trained solely on molecules for chemical tasks and on a combined corpus of natural language texts and string-based structures. The framework relies on molecule augmentations that preserve an underlying chemical, such as kekulization and cycle replacements. We evaluate encoder-only and generative LMs by calculating a metric based on the similarity score between distributed representations of molecules and their augmentations. Our experiments on ChEBI-20 and QM9 benchmarks show that these models exhibit significantly lower scores than graph-based molecular models trained without language modeling objectives. Augmentation of SMILES representations leads to decreased performance on chemical property prediction tasks in the MoleculeNet benchmark. Additionally, our results on the molecule captioning task for cross-domain models, MolT5 and Text+Chem T5, demonstrate that our representation-based evaluation metrics significantly correlate with the classical text generation metrics like ROUGE and METEOR.

BibTeX

@inproceedings{ganeeva-etal-2024-lost, title = "Lost in Translation: Chemical Language Models and the Misunderstanding of Molecule Structures", author = "Ganeeva, Veronika and Sakhovskiy, Andrey and Khrabrov, Kuzma and Savchenko, Andrey and Kadurin, Artur and Tutubalina, Elena", booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024", month = nov, year = "2024", address = "Miami, Florida, USA", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2024.findings-emnlp.760", pages = "12994--13013", }

Lost in Translation: Chemical LMs and the Misunderstanding of Molecule Structures

Abstract

AMORE is consistent with other textual metrics like ROUGE and METEOR

AMORE provides oppotunity to evaluate different architectures due to embeddings-based approach

Isomeric subset of AMORE is challenging

AMORE allows to visualize how robust to augmentations the model is.

AMORE shows that metrics on zero-shot tasks decrease due to augmentated molecules

Poster

BibTeX