Metadata-Version: 2.4
Name: ultralytics
Version: 8.3.63
Summary: Ultralytics YOLO 🚀 for SOTA object detection, multi-object tracking, instance segmentation, pose estimation and image classification.
Author-email: Glenn Jocher <glenn.jocher@ultralytics.com>, Jing Qiu <jing.qiu@ultralytics.com>
Maintainer-email: Ultralytics <hello@ultralytics.com>
License: AGPL-3.0
Project-URL: Homepage, https://ultralytics.com
Project-URL: Source, https://github.com/ultralytics/ultralytics
Project-URL: Documentation, https://docs.ultralytics.com
Project-URL: Bug Reports, https://github.com/ultralytics/ultralytics/issues
Project-URL: Changelog, https://github.com/ultralytics/ultralytics/releases
Keywords: machine-learning,deep-learning,computer-vision,ML,DL,AI,YOLO,YOLOv3,YOLOv5,YOLOv8,YOLOv9,YOLOv10,YOLO11,HUB,Ultralytics
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.23.0
Requires-Dist: numpy<2.0.0; sys_platform == "darwin"
Requires-Dist: matplotlib>=3.3.0
Requires-Dist: opencv-python>=4.6.0
Requires-Dist: pillow>=7.1.2
Requires-Dist: pyyaml>=5.3.1
Requires-Dist: requests>=2.23.0
Requires-Dist: scipy>=1.4.1
Requires-Dist: torch>=1.8.0
Requires-Dist: torch!=2.4.0,>=1.8.0; sys_platform == "win32"
Requires-Dist: torchvision>=0.9.0
Requires-Dist: tqdm>=4.64.0
Requires-Dist: psutil
Requires-Dist: py-cpuinfo
Requires-Dist: pandas>=1.1.4
Requires-Dist: seaborn>=0.11.0
Requires-Dist: ultralytics-thop>=2.0.0
Provides-Extra: dev
Requires-Dist: ipython; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: coverage[toml]; extra == "dev"
Requires-Dist: mkdocs>=1.6.0; extra == "dev"
Requires-Dist: mkdocs-material>=9.5.9; extra == "dev"
Requires-Dist: mkdocstrings[python]; extra == "dev"
Requires-Dist: mkdocs-redirects; extra == "dev"
Requires-Dist: mkdocs-ultralytics-plugin>=0.1.8; extra == "dev"
Requires-Dist: mkdocs-macros-plugin>=1.0.5; extra == "dev"
Provides-Extra: export
Requires-Dist: onnx>=1.12.0; extra == "export"
Requires-Dist: coremltools>=7.0; (platform_system != "Windows" and python_version <= "3.11") and extra == "export"
Requires-Dist: scikit-learn>=1.3.2; (platform_system != "Windows" and python_version <= "3.11") and extra == "export"
Requires-Dist: openvino>=2024.0.0; extra == "export"
Requires-Dist: tensorflow>=2.0.0; extra == "export"
Requires-Dist: tensorflowjs>=3.9.0; extra == "export"
Requires-Dist: tensorstore>=0.1.63; (platform_machine == "aarch64" and python_version >= "3.9") and extra == "export"
Requires-Dist: keras; extra == "export"
Requires-Dist: flatbuffers<100,>=23.5.26; platform_machine == "aarch64" and extra == "export"
Requires-Dist: numpy==1.23.5; platform_machine == "aarch64" and extra == "export"
Requires-Dist: h5py!=3.11.0; platform_machine == "aarch64" and extra == "export"
Provides-Extra: solutions
Requires-Dist: shapely>=2.0.0; extra == "solutions"
Requires-Dist: streamlit; extra == "solutions"
Provides-Extra: logging
Requires-Dist: comet; extra == "logging"
Requires-Dist: tensorboard>=2.13.0; extra == "logging"
Requires-Dist: dvclive>=2.12.0; extra == "logging"
Provides-Extra: extra
Requires-Dist: hub-sdk>=0.0.12; extra == "extra"
Requires-Dist: ipython; extra == "extra"
Requires-Dist: albumentations>=1.4.6; extra == "extra"
Requires-Dist: pycocotools>=2.0.7; extra == "extra"
Dynamic: license-file
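
The `Provides-Extra` groups above correspond to pip extras (e.g. `pip install ultralytics[export]`); a minimal standard-library sketch of inspecting them on an installed copy:

```python
from importlib.metadata import metadata, requires

# Read the installed package's metadata, mirroring the fields listed above.
meta = metadata("ultralytics")
print(meta["Name"], meta["Version"])

# Requires-Dist entries carry their extra markers, e.g. 'extra == "export"'.
for req in requires("ultralytics") or []:
    if 'extra == "export"' in req:
        print(req)
```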
<div align="center">
<h1>YOLOv12</h1>
<h3>YOLOv12: Attention-Centric Real-Time Object Detectors</h3>
[Yunjie Tian](https://sunsmarterjie.github.io/)<sup>1</sup>, [Qixiang Ye](https://people.ucas.ac.cn/~qxye?language=en)<sup>2</sup>, [David Doermann](https://cse.buffalo.edu/~doermann/)<sup>1</sup>
<sup>1</sup> University at Buffalo, SUNY, <sup>2</sup> University of Chinese Academy of Sciences.
<p align="center">
<img src="assets/tradeoff_turbo.svg" width=90%> <br>
Comparison with popular methods in terms of latency-accuracy (left) and FLOPs-accuracy (right) trade-offs
</p>
</div>
[arXiv](https://arxiv.org/abs/2502.12524) [Hugging Face Demo](https://huggingface.co/spaces/sunsmarterjieleaf/yolov12) <a href="https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-yolov12-object-detection-model.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [Kaggle](https://www.kaggle.com/code/jxxn03x/yolov12-on-custom-data) [Roboflow Blog](https://blog.roboflow.com/use-yolov12-with-roboflow/#deploy-yolov12-models-with-roboflow) [OpenBayes](https://openbayes.com/console/public/tutorials/A4ac4xNrUCQ)
## Updates
- 2025/03/18: Several users have asked about heatmap visualization. See this [issue](https://github.com/sunsmarterjie/yolov12/issues/74).
- 2025/03/09: **YOLOv12-turbo** is released: a faster YOLOv12 version.
- 2025/02/24: Blogs: [ultralytics](https://docs.ultralytics.com/models/yolo12/), [LearnOpenCV](https://learnopencv.com/yolov12/). Thanks to them!
- 2025/02/22: [YOLOv12 TensorRT CPP Inference Repo + Google Colab Notebook](https://github.com/mohamedsamirx/YOLOv12-TensorRT-CPP).
- 2025/02/22: [Android deployment](https://github.com/mpj1234/ncnn-yolov12-android/tree/main) / [TensorRT-YOLO](https://github.com/laugh12321/TensorRT-YOLO) now accelerate YOLOv12. Thanks to them!
- 2025/02/21: Try YOLOv12 for classification, oriented bounding boxes, pose estimation, and instance segmentation at [ultralytics](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models/12). Please note this [issue](https://github.com/sunsmarterjie/yolov12/issues/29). Thanks to them!
- 2025/02/20: [Any computer or edge device?](https://github.com/roboflow/inference) / [ONNX CPP Version](https://github.com/mohamedsamirx/YOLOv12-ONNX-CPP). Thanks to them!
- 2025/02/20: Train a YOLOv12 model on a custom dataset: [blog](https://blog.roboflow.com/train-yolov12-model/) and [YouTube](https://www.youtube.com/watch?v=fksJmIMIfXo) / [step-by-step instructions](https://youtu.be/dO8k5rgXG0M). Thanks to them!
- 2025/02/19: [arXiv version](https://arxiv.org/abs/2502.12524) is public. [Demo](https://huggingface.co/spaces/sunsmarterjieleaf/yolov12) is available (try [Demo2](https://huggingface.co/spaces/sunsmarterjieleaf/yolov12_demo2) [Demo3](https://huggingface.co/spaces/sunsmarterjieleaf/yolov12_demo3) if busy).
<details>
<summary>
<font size="+1">Abstract</font>
</summary>
Enhancing the network architecture of the YOLO framework has been crucial for a long time but has focused on CNN-based improvements despite the proven superiority of attention mechanisms in modeling capabilities. This is because attention-based models cannot match the speed of CNN-based models. This paper proposes an attention-centric YOLO framework, namely YOLOv12, that matches the speed of previous CNN-based ones while harnessing the performance benefits of attention mechanisms.
YOLOv12 surpasses all popular real-time object detectors in accuracy with competitive speed. For example, YOLOv12-N achieves 40.6% mAP with an inference latency of 1.64 ms on a T4 GPU, outperforming advanced YOLOv10-N / YOLOv11-N by 2.1%/1.2% mAP with a comparable speed. This advantage extends to other model scales. YOLOv12 also surpasses end-to-end real-time detectors that improve DETR, such as RT-DETR / RT-DETRv2: YOLOv12-S beats RT-DETR-R18 / RT-DETRv2-R18 while running 42% faster, using only 36% of the computation and 45% of the parameters.
</details>
## Main Results
**Turbo (default version)**:
| Model | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>T4 TensorRT10<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(G) |
| :----------------------------------------------------------------------------------- | :-------------------: | :-------------------:| :------------------------------:| :-----------------:| :---------------:|
| [YOLO12n](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12n.pt) | 640 | 40.4 | 1.60 | 2.5 | 6.0 |
| [YOLO12s](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12s.pt) | 640 | 47.6 | 2.42 | 9.1 | 19.4 |
| [YOLO12m](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12m.pt) | 640 | 52.5 | 4.27 | 19.6 | 59.8 |
| [YOLO12l](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12l.pt) | 640 | 53.8 | 5.83 | 26.5 | 82.4 |
| [YOLO12x](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12x.pt) | 640 | 55.4 | 10.38 | 59.3 | 184.6 |
[**v1.0**](https://github.com/sunsmarterjie/yolov12/tree/V1.0):
| Model | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>T4 TensorRT10<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(G) |
| :----------------------------------------------------------------------------------- | :-------------------: | :-------------------:| :------------------------------:| :-----------------:| :---------------:|
| [YOLO12n](https://github.com/sunsmarterjie/yolov12/releases/download/v1.0/yolov12n.pt) | 640 | 40.6 | 1.64 | 2.6 | 6.5 |
| [YOLO12s](https://github.com/sunsmarterjie/yolov12/releases/download/v1.0/yolov12s.pt) | 640 | 48.0 | 2.61 | 9.3 | 21.4 |
| [YOLO12m](https://github.com/sunsmarterjie/yolov12/releases/download/v1.0/yolov12m.pt) | 640 | 52.5 | 4.86 | 20.2 | 67.5 |
| [YOLO12l](https://github.com/sunsmarterjie/yolov12/releases/download/v1.0/yolov12l.pt) | 640 | 53.7 | 6.77 | 26.4 | 88.9 |
| [YOLO12x](https://github.com/sunsmarterjie/yolov12/releases/download/v1.0/yolov12x.pt) | 640 | 55.2 | 11.79 | 59.1 | 199.0 |
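The checkpoints in both tables are plain GitHub release assets; one hedged way to fetch and load a specific one (the URL is copied from the turbo table above):

```python
import urllib.request
from ultralytics import YOLO

# Download the turbo YOLOv12-N checkpoint linked in the table above.
url = "https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12n.pt"
urllib.request.urlretrieve(url, "yolov12n.pt")

model = YOLO("yolov12n.pt")
model.info()  # prints layers, parameters, GFLOPs
```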
## Installation
```bash
# Prebuilt FlashAttention wheel (Linux x86_64, CUDA 11, PyTorch 2.2, Python 3.11);
# the downloaded wheel is referenced by requirements.txt.
wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.3/flash_attn-2.7.3+cu11torch2.2cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
conda create -n yolov12 python=3.11
conda activate yolov12
pip install -r requirements.txt
pip install -e .
```
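
A quick post-install sanity check (a minimal sketch, not part of the original instructions; it only verifies that the key pieces import):

```python
# Verify that PyTorch, FlashAttention, and the package itself import cleanly.
import torch

print(torch.__version__, "CUDA:", torch.cuda.is_available())

try:
    import flash_attn  # optional accelerated attention backend
    print("flash-attn", flash_attn.__version__)
except ImportError:
    print("flash-attn not installed")

from ultralytics import YOLO  # resolves to this repo's editable install
```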
## Validation
[`yolov12n`](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12n.pt)
[`yolov12s`](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12s.pt)
[`yolov12m`](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12m.pt)
[`yolov12l`](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12l.pt)
[`yolov12x`](https://github.com/sunsmarterjie/yolov12/releases/download/turbo/yolov12x.pt)
```python
from ultralytics import YOLO

# Replace {n/s/m/l/x} with one scale, e.g. 'yolov12n.pt'
model = YOLO('yolov12{n/s/m/l/x}.pt')
model.val(data='coco.yaml', save_json=True)
```
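
`model.val()` also returns a metrics object, so the headline numbers can be read programmatically; a hedged sketch using the current ultralytics API (attribute names assumed from `DetMetrics`):

```python
from ultralytics import YOLO

model = YOLO('yolov12n.pt')  # assumes a checkpoint downloaded from the links above
metrics = model.val(data='coco.yaml', save_json=True)

# Aggregate COCO-style scores.
print(metrics.box.map)    # mAP 50-95
print(metrics.box.map50)  # mAP 50
print(metrics.box.map75)  # mAP 75
```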
## Training
```python
from ultralytics import YOLO

model = YOLO('yolov12n.yaml')

# Train the model
results = model.train(
    data='coco.yaml',
    epochs=600,
    batch=256,
    imgsz=640,
    scale=0.5,       # S:0.9; M:0.9; L:0.9; X:0.9
    mosaic=1.0,
    mixup=0.0,       # S:0.05; M:0.15; L:0.15; X:0.2
    copy_paste=0.1,  # S:0.15; M:0.4; L:0.5; X:0.6
    device="0,1,2,3",
)

# Evaluate model performance on the validation set
metrics = model.val()

# Perform object detection on an image
results = model("path/to/image.jpg")
results[0].show()
```
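
If a long run is interrupted, training can be resumed from the automatically saved checkpoint; a minimal sketch (the path assumes the default ultralytics output layout):

```python
from ultralytics import YOLO

# Resume an interrupted run from the last saved checkpoint.
model = YOLO('runs/detect/train/weights/last.pt')
model.train(resume=True)
```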
## Prediction
```python
from ultralytics import YOLO

# Replace {n/s/m/l/x} with one scale, e.g. 'yolov12n.pt'
model = YOLO('yolov12{n/s/m/l/x}.pt')
model.predict()  # without a source argument, ultralytics runs on its bundled sample images
```
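
`predict` returns a list of `Results`; a short hedged sketch of reading boxes, scores, and class names from one image (the image path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO('yolov12n.pt')
results = model.predict('path/to/image.jpg', conf=0.25)

for r in results:
    for box in r.boxes:
        cls_id = int(box.cls)                  # class index
        score = float(box.conf)                # confidence
        x1, y1, x2, y2 = box.xyxy[0].tolist()  # pixel coordinates
        print(r.names[cls_id], score, (x1, y1, x2, y2))
```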
## Export
```python
from ultralytics import YOLO

# Replace {n/s/m/l/x} with one scale, e.g. 'yolov12n.pt'
model = YOLO('yolov12{n/s/m/l/x}.pt')
model.export(format="engine", half=True)  # TensorRT engine; or format="onnx"
```
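
Exported models load back through the same `YOLO` API, so the exported file can be checked immediately; a minimal sketch (ONNX shown, image path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO('yolov12n.pt')
onnx_path = model.export(format="onnx")  # export() returns the path to the exported file

onnx_model = YOLO(onnx_path)             # run inference with the exported model
results = onnx_model('path/to/image.jpg')
results[0].show()
```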
## Demo
```bash
python app.py
# Then visit http://127.0.0.1:7860
```
## Acknowledgement
The code is based on [ultralytics](https://github.com/ultralytics/ultralytics). Thanks for their excellent work!
## Citation
```BibTeX
@article{tian2025yolov12,
title={YOLOv12: Attention-Centric Real-Time Object Detectors},
author={Tian, Yunjie and Ye, Qixiang and Doermann, David},
journal={arXiv preprint arXiv:2502.12524},
year={2025}
}
```