DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images into SMILES strings, enabling the digitization of chemical data from scanned documents, literature, and patents.

Kohulan/DECIMER-Image_Transformer

🧪 DECIMER Image Transformer 🖼️

Deep Learning for Chemical Image Recognition using EfficientNet-V2 + Transformer


🔬 Abstract

The DECIMER 2.2 project addresses the OCSR (Optical Chemical Structure Recognition) challenge using deep learning. Our goal is to provide an automated, open-source software solution for chemical image recognition.

DECIMER is trained on Google's TPUs (Tensor Processing Units), which lets it handle datasets of over 1 million images quickly.


🧠 Method and Model Changes

🖼️ Image Feature Extraction

Image features are now extracted with EfficientNet-V2.

🔮 SMILES Prediction

A transformer model predicts the SMILES string from the extracted image features.
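
To make the image-to-SMILES step more concrete, the sketch below shows one common way a transformer decoder can emit a string token by token. The README states only that a transformer predicts the SMILES; greedy decoding, the `<start>`/`<end>` markers, and the `step` callback (a stand-in for the real model) are assumptions made for illustration, not DECIMER's actual implementation.

```python
from typing import Callable, Dict, Sequence

START, END = "<start>", "<end>"  # hypothetical special tokens


def greedy_decode(
    step: Callable[[Sequence[str]], Dict[str, float]],
    max_len: int = 100,
) -> str:
    """Build a string by repeatedly taking the highest-scoring next token.

    `step` receives the tokens generated so far and returns a score for
    each candidate next token. In a real model it would run the decoder
    on the image features; here it is an injected stand-in.
    """
    tokens = [START]
    for _ in range(max_len):
        scores = step(tokens)
        best = max(scores, key=scores.get)
        if best == END:
            break
        tokens.append(best)
    # Drop the start marker and join the remaining tokens.
    return "".join(tokens[1:])


def toy_step(tokens: Sequence[str]) -> Dict[str, float]:
    """Toy scorer that always spells out ethanol (CCO), then stops."""
    target = ["C", "C", "O", END]
    wanted = target[len(tokens) - 1]
    return {tok: (1.0 if tok == wanted else 0.0) for tok in ("C", "O", END)}


print(greedy_decode(toy_step))  # CCO
```

Greedy decoding is the simplest choice; a real system might instead use beam search or a different tokenization of SMILES.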

🚀 Training Enhancements

  1. TFRecord Files: Fast reading of serialized training data
  2. Google Cloud Buckets: Cloud storage for the training data
  3. TensorFlow Data Pipeline: Optimized data loading
  4. TPU Strategy: Distributed training on Google's TPUs
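
The four items above can be sketched together as follows. This is an illustration under stated assumptions, not the project's training code: the bucket prefix, shard naming scheme, feature names (`image_raw`, `caption`), PNG encoding, image size, and caption length are all hypothetical. TensorFlow is imported inside the functions so that the path helper can be used without it.

```python
from typing import List


def shard_paths(prefix: str, split: str, num_shards: int) -> List[str]:
    """List TFRecord shard paths under a (hypothetical) bucket prefix."""
    base = prefix.rstrip("/")
    return [
        f"{base}/{split}-{i:05d}-of-{num_shards:05d}.tfrecord"
        for i in range(num_shards)
    ]


def make_dataset(paths: List[str], batch_size: int, max_len: int = 150):
    """Build a tf.data pipeline that reads and parses TFRecord shards."""
    import tensorflow as tf  # deferred import

    # Assumed record layout: one encoded PNG plus a fixed-length token sequence.
    spec = {
        "image_raw": tf.io.FixedLenFeature([], tf.string),
        "caption": tf.io.FixedLenFeature([max_len], tf.int64),
    }

    def parse(record):
        parsed = tf.io.parse_single_example(record, spec)
        image = tf.io.decode_png(parsed["image_raw"], channels=3)
        image = tf.image.resize(image, (512, 512)) / 255.0  # assumed size
        return image, parsed["caption"]

    ds = tf.data.TFRecordDataset(paths, num_parallel_reads=tf.data.AUTOTUNE)
    ds = ds.map(parse, num_parallel_calls=tf.data.AUTOTUNE)
    # TPUs need static shapes, hence drop_remainder=True.
    return ds.shuffle(1024).batch(batch_size, drop_remainder=True).prefetch(
        tf.data.AUTOTUNE
    )


def get_strategy():
    """Return a TPUStrategy if a TPU is reachable, else the default strategy."""
    import tensorflow as tf  # deferred import

    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return tf.distribute.TPUStrategy(resolver)
    except ValueError:
        # Assumption: a missing TPU surfaces as ValueError.
        return tf.distribute.get_strategy()
```

A model would then be built inside `get_strategy().scope()` and fed with `make_dataset(shard_paths("gs://your-bucket/data", "train", 4), batch_size=64)`, where the bucket name is a placeholder.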

💻 Installation

# Create a conda wonderland
conda create --name DECIMER python=3.10.0 -y
conda activate DECIMER

# Equip yourself with DECIMER
pip install decimer
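
After installing, it can be useful to confirm which version the environment picked up. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `decimer` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "decimer") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("decimer is not installed in this environment")
else:
    print(f"decimer {found} is installed")
```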

🎮 Usage

from DECIMER import predict_SMILES

# Unleash the power of DECIMER
image_path = "path/to/your/chemical/masterpiece.jpg"
SMILES = predict_SMILES(image_path)
print(f"🎉 Decoded SMILES: {SMILES}")
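
For more than one image, the call above can be wrapped in a small loop. The helper below is a sketch: `predict_folder`, its parameters, and the list of image suffixes are hypothetical, and the predictor is passed in as an argument so the helper itself does not depend on DECIMER.

```python
from pathlib import Path
from typing import Callable, Dict, Sequence, Tuple

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg")  # assumed set of image extensions


def predict_folder(
    folder: str,
    predict: Callable[[str], str],
    suffixes: Sequence[str] = IMAGE_SUFFIXES,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run `predict` on every image file in `folder`.

    Returns two dicts keyed by file name: predictions and error messages.
    """
    results: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            results[path.name] = predict(str(path))
        except Exception as exc:  # keep going if one image fails
            failures[path.name] = str(exc)
    return results, failures
```

With DECIMER installed, `predict_folder("path/to/images", predict_SMILES)` would run the model on each image in the folder; files that raise an error are collected in the second dict instead of aborting the run.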

✍️ DECIMER - Hand-drawn Model

🌟 New Feature Alert! 🌟

Our latest model brings the magic of AI to hand-drawn chemical structures!


📜 Citation

If DECIMER helps your research, please cite:

  1. Rajan, K., et al. "DECIMER.ai - An open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications." Nat. Commun. 14, 5045 (2023).
  2. Rajan, K., et al. "DECIMER 1.0: deep learning for chemical image recognition using transformers." J Cheminform 13, 61 (2021).
  3. Rajan, K., et al. "Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture." J Cheminform 16, 78 (2024).

🙏 Acknowledgements

  • A big thank you to Charles Tapley Hoyt for his invaluable contributions!
  • Powered by Google's TPU Research Cloud (TRC)


👨‍🔬 Author: Kohulan


🌐 Project Website

Experience DECIMER in action at decimer.ai, brilliantly implemented by Otto Brinkhaus!

