---
license: mit
base_model:
- Ultralytics/YOLO11
tags:
- chemistry
---

# Molecule Detection YOLO in MolParser

From the paper "*MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild*" (accepted at ICCV 2025).

[Arxiv Paper](https://arxiv.org/abs/2411.11098) | [Huggingface Dataset](https://huggingface.co/datasets/AI4Industry/MolParser-7M) | [OCSR Demo](https://ocsr.dp.tech/)

We provide several [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics) weights for molecule detection, in different model sizes and input resolutions.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f7f16fb6941db5c2e7c4bf/7oWPoPxuEXSangDWnJ7mv.png)


## 1⃣️ [MolDet-General] General molecule structure detection models

`moldet_yolo11[size]_640_general.pt`

YOLO11 weights trained on 35k human-annotated image crops and 100k generated images.

* 640x640 input resolution
* supports handwritten molecules
* **multiscale input** (inputs can be single/multiple molecular cutouts, reaction or table cutouts, or single-page PDF images)

<span style='color:gray'>Warning: For single-molecule input (where the detector is effectively used as a classification model), adding padding around the molecule can improve performance.</span>
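
The padding mentioned in the warning can be sketched as follows. This is a minimal illustration, not the authors' preprocessing: the helper names `square_padding` and `pad_crop`, the white fill, the square canvas, and the 25% margin are all assumptions chosen for the example, and the best margin for a given dataset may differ.

```python
def square_padding(width, height, margin_ratio=0.25):
    """Return (left, top, right, bottom) border widths in pixels.

    The borders centre a width x height crop on a square canvas whose side is
    the longer edge plus margin_ratio of that edge on each side.
    margin_ratio=0.25 is an assumed value, not one taken from the model card.
    """
    side = round(max(width, height) * (1 + 2 * margin_ratio))
    pad_x, pad_y = side - width, side - height
    left, top = pad_x // 2, pad_y // 2
    return left, top, pad_x - left, pad_y - top


def pad_crop(image, margin_ratio=0.25):
    """Surround a PIL image with a white border before detection."""
    from PIL import ImageOps  # imported lazily; requires Pillow

    border = square_padding(image.width, image.height, margin_ratio)
    return ImageOps.expand(image.convert("RGB"), border=border, fill="white")
```

The padded image can be passed straight to `model.predict`, which accepts PIL images. Note that the returned box coordinates then refer to the padded image, so subtract the left and top border widths to map them back to the original crop.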

Results from private testing:

| Model Size | mAP50 | mAP50-95 | Speed (T4 TensorRT10) |
| ---- | ----- | -------- | ----- |
| n | 0.9581 | 0.8524 | 1.5 ± 0.0 ms |
| s | 0.9652 | 0.8704 | 2.5 ± 0.1 ms |
| m | 0.9686 | 0.8736 | 4.7 ± 0.1 ms |
| l | **0.9891** | **0.9028** | 6.2 ± 0.1 ms |

Usage:
```python
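from ultralytics import YOLO

# Load the large general-purpose detector (swap "l" for n/s/m as needed)
model = YOLO("moldet_yolo11l_640_general.pt")
# conf=0.5 drops detections below 50% confidence; save=True writes annotated images to disk
model.predict("path/to/image.png", save=True, imgsz=640, conf=0.5)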
```
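
To work with the detections programmatically rather than only saving annotated images, the boxes can be read from the returned results. The sketch below assumes the standard Ultralytics results object (`result.boxes.xyxy` and `result.boxes.conf`); the helper names `expand_and_clip` and `crop_detections` and the 4-pixel margin are hypothetical choices for illustration.

```python
def expand_and_clip(box, image_width, image_height, margin=4):
    """Grow an (x1, y1, x2, y2) box by margin pixels and clip it to the image."""
    x1, y1, x2, y2 = box
    return (
        max(0, int(x1) - margin),
        max(0, int(y1) - margin),
        min(image_width, int(x2) + margin),
        min(image_height, int(y2) + margin),
    )


def crop_detections(model, image_path, imgsz=640, conf=0.5):
    """Run detection on one image and return a list of (crop, confidence)."""
    from PIL import Image  # imported lazily; requires Pillow

    image = Image.open(image_path).convert("RGB")
    result = model.predict(image, imgsz=imgsz, conf=conf)[0]
    crops = []
    # xyxy holds pixel coordinates in the original image; conf holds scores
    for box, score in zip(result.boxes.xyxy.tolist(), result.boxes.conf.tolist()):
        region = expand_and_clip(box, image.width, image.height)
        crops.append((image.crop(region), score))
    return crops
```

Each returned crop is a PIL image of one detected molecule, which could then be handed to a downstream recognition step.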

## 2⃣️ [MolDet-Doc] PDF molecule structure detection models

`moldet_yolo11[size]_960_doc.pt`

YOLO11 weights trained on 26k human-annotated PDF pages (patents, papers, and books).

* 960x960 input resolution
* works best with **single-page PDF image** input
* better at detecting small molecules

<span style='color:gray'>Warning: We recommend rendering PDF pages with MuPDF at more than 144 dpi.</span>
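
One way to follow this recommendation is through PyMuPDF, the Python binding for MuPDF. The sketch below is an assumption-laden example rather than the authors' pipeline: the helper names `zoom_for_dpi` and `render_page`, the 200 dpi default, and the output file name are illustrative only.

```python
def zoom_for_dpi(dpi):
    """Convert a target dpi to a zoom factor (PDF pages use 72 units per inch)."""
    return dpi / 72.0


def render_page(pdf_path, page_index=0, dpi=200, out_path="page.png"):
    """Render one PDF page to a PNG at the requested resolution."""
    import fitz  # PyMuPDF, imported lazily; install with `pip install pymupdf`

    zoom = zoom_for_dpi(dpi)  # dpi=200 is an assumed default above 144 dpi
    with fitz.open(pdf_path) as doc:
        page = doc[page_index]
        # Scale both axes equally and drop the alpha channel
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom), alpha=False)
        pix.save(out_path)
    return out_path
```

The resulting PNG can be used as the `path/to/pdf_page_image.png` input for the MolDet-Doc weights.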


Results from private testing:

| Model Size | mAP50 | mAP50-95 | Speed (T4 TensorRT10) |
| ---- | ----- | -------- | ----- |
| n | 0.9871 | 0.8732 | 3.1 ± 0.0 ms |
| s | 0.9851 | 0.8824 | 5.5 ± 0.1 ms |
| m | 0.9867 | 0.8917 | 9.9 ± 0.2 ms |
| l | **0.9913** | **0.9011** | 13.1 ± 0.3 ms |

Usage:
```python
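from ultralytics import YOLO

# Load the large document detector (swap "l" for n/s/m as needed)
model = YOLO("moldet_yolo11l_960_doc.pt")
# The input is a rendered PDF page image; imgsz=960 matches the training resolution
model.predict("path/to/pdf_page_image.png", save=True, imgsz=960, conf=0.5)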
```

## 📖 Citation

If you use this model in your work, please cite:

```
@article{fang2024molparser,
  title={Molparser: End-to-end visual recognition of molecule structures in the wild},
  author={Fang, Xi and Wang, Jiankun and Cai, Xiaochen and Chen, Shangqian and Yang, Shuwen and Tao, Haoyi and Wang, Nan and Yao, Lin and Zhang, Linfeng and Ke, Guolin},
  journal={arXiv preprint arXiv:2411.11098},
  year={2024}
}
```