---
license: mit
title: Beta-lactam Generator
sdk: streamlit
emoji: π
colorFrom: blue
colorTo: gray
short_description: App to generate and view beta-lactam structures
---
# Beta-Lactam Molecule Generator and Viewer
## Overview
This application demonstrates a drug discovery pipeline that allows users to:
* Generate novel beta-lactam molecules using a generative AI model that was fine-tuned with beta-lactam structures.
* View the generated molecules with SMILES and SAFE strings.
* Predict select ADMET properties for the generated molecules using ADMET-AI.
## Features
* **Molecule Generation**:
  * Generates up to 3 beta-lactam molecules at a time.
  * Users can adjust the creativity (temperature) of the generation process. Higher values lead to more diverse output.
  * Generated molecules are named 'Mol01' to 'Mol03'.
* **Molecule Viewing**:
  * Displays molecule structures using Streamlit.
  * Shows each molecule as SMILES and SAFE encodings.
* **ADMET Property Prediction**:
  * Integrates ADMET-AI to predict select properties.
  * Displays the predicted properties of each molecule.
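
To illustrate why a higher temperature gives more diverse output, here is a minimal sketch of temperature-scaled sampling in plain Python. It is not the app's code: the function names and the example logits are hypothetical, and it assumes the creativity slider maps to the usual sampling temperature, where the model's logits are divided by the temperature before the softmax.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
cool = softmax_with_temperature(logits, temperature=0.5)
warm = softmax_with_temperature(logits, temperature=1.5)

# The low temperature concentrates probability on the top-scoring token,
# while the high temperature spreads it toward the less likely tokens.
print("T=0.5:", [round(p, 3) for p in cool])
print("T=1.5:", [round(p, 3) for p in warm])
```

At a low temperature the top-scoring token dominates, so repeated generations look alike. At a higher temperature the less likely tokens are drawn more often, which is the source of the added diversity.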
## How to Use the App
1. Set Generation Parameters:
   * Use the sidebar to adjust the creativity (temperature) slider.
   * Select the number of molecules to generate (maximum of 3).
2. Generate Molecules:
   * Click the 'Generate Molecules' button.
   * Generated molecules will appear with their structures, strings, and predicted ADMET properties.
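
The steps above can be sketched as a small control flow in Python. This is an illustration under stated assumptions, not the app's implementation: `run_generation`, `label_molecules`, and `fake_generator` are hypothetical names, and the stand-in generator simply returns the SMILES of the unsubstituted beta-lactam ring (azetidin-2-one) in place of real model output.

```python
MAX_MOLECULES = 3  # limit stated in this README


def label_molecules(smiles_list, max_count=MAX_MOLECULES):
    """Name at most max_count molecules 'Mol01', 'Mol02', ... in order."""
    kept = smiles_list[:max_count]
    return {f"Mol{i:02d}": smiles for i, smiles in enumerate(kept, start=1)}


def run_generation(generate_fn, n_molecules, temperature):
    """Clamp the requested count to 1..MAX_MOLECULES, generate, then label."""
    n = max(1, min(n_molecules, MAX_MOLECULES))
    smiles = generate_fn(n, temperature)
    return label_molecules(smiles)


def fake_generator(n, temperature):
    # Hypothetical stand-in for the fine-tuned model: ignores the temperature
    # and returns azetidin-2-one, the bare beta-lactam ring, n times.
    return ["O=C1CCN1"] * n


# Requesting 5 molecules is clamped to the maximum of 3.
result = run_generation(fake_generator, n_molecules=5, temperature=1.0)
for name, smiles in result.items():
    print(name, smiles)
```

Passing the generator in as a function keeps the counting and naming logic separate from the model call, so the same flow can be exercised without loading a model.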
## Technical Details
* **Generative Model**: Uses the 'seyonec/PubChem10M_SMILES_BPE_450k' model, fine-tuned on beta-lactam structures.
* **ADMET Predictions**: Uses the ADMET-AI library to predict molecular properties.
* **Visualization**: Employs RDKit and SAFE encoding for molecule rendering.
* **Frameworks and Libraries**:
  * **Streamlit** for the web interface.
  * **Transformers** library for model loading and generation.
  * **RDKit** for cheminformatics.
### The application is intended for demonstration purposes only.
## License
This project is licensed under the terms of the MIT license.
## Attributions and Acknowledgments
### ChEMBL Database:
This project utilizes data from the ChEMBL Database, licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Zdrazil B, Felix E, Hunter F, et al. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Research. 2024;52(D1). doi:10.1093/nar/gkad1004
https://www.ebi.ac.uk/chembl/
### SAFE Encoding:
This project uses the SAFE Encoding framework, licensed under the Apache License 2.0.
Noutahi E, Gabellini C, Craig M, Lim JS, Tossou P. Gotta be SAFE: A New Framework for Molecular Design. arXiv preprint arXiv:2310.10773, 2023.
https://github.com/datamol-io/safe
### ADMET-AI:
This project utilizes the ADMET-AI platform for predicting ADMET properties.
Swanson K, Walther P, Leitz J, et al. ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries. bioRxiv. 2023. doi:10.1101/2023.12.28.573531
https://admet.ai.greenstonebio.com/
### RDKit:
This project uses RDKit.
RDKit: Open-source cheminformatics. https://www.rdkit.org